Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HOLD for payment 2025-02-07] [Performance] Create PING for Pusher Connection #53826

Closed
6 tasks
muttmuure opened this issue Dec 10, 2024 · 34 comments
Closed
6 tasks
Assignees
Labels
AutoAssignerNewDotQuality Used to assign quality issues to engineers Awaiting Payment Auto-added when associated PR is deployed to production Bug Something is broken. Auto assigns a BugZero manager. Engineering Internal Requires API changes or must be handled by Expensify staff Reviewing Has a PR in review Weekly KSv2

Comments

@muttmuure
Copy link
Contributor

muttmuure commented Dec 10, 2024

If you haven’t already, check out our contributing guidelines for onboarding and email [email protected] to request to join our Slack channel!


What performance issue do we need to solve?

Intermittently, Pusher becomes unreachable for reasons out of our control and the client needs to reconnect to it, which causes delays or failure to get realtime updates to chat data.

We can't count on Pusher's internal PING to detect liveness -- we need our own application layer Pusher-ping.

What is the impact of this on end-users?

A more reliable Pusher-PING means users get realtime updates to their New Expensify data more reliably.

List any benchmarks that show the severity of the issue

This is a recurring issue: https://expensify.slack.com/archives/C05LX9D6E07/p1731299637345689

Proposed solution (if any)

Develop our own application layer Pusher-PING that works in the same way as our existing PING code, but instead tests the Pusher connection.

List any benchmarks after implementing the changes to show impacts of the proposed solution (if any)

Note: These should be the same as the benchmarks collected before any changes.

Platforms:

Which of our officially supported platforms is this issue occurring on?

  • Android: Native
  • Android: mWeb Chrome
  • iOS: Native
  • iOS: mWeb Safari
  • MacOS: Chrome / Safari
  • MacOS: Desktop

Version Number:
Reproducible in staging?: Y
Reproducible in production?: Y
Email or phone of affected tester (no customers):
Logs: https://stackoverflow.com/c/expensify/questions/4856
Notes/Photos/Videos: Any additional supporting documentation
Expensify/Expensify Issue URL:
Issue reported by: @quinthar
Slack conversation: https://expensify.slack.com/archives/C05LX9D6E07/p1731299637345689

View all open jobs on Upwork

Issue OwnerCurrent Issue Owner: @mallenexpensify
@muttmuure muttmuure added Engineering Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels Dec 10, 2024
Copy link

melvin-bot bot commented Dec 10, 2024

Auto-assigning issues to engineers is no longer supported. If you think this issue should receive engineering attention, please raise it in #whatsnext.

Copy link

melvin-bot bot commented Dec 10, 2024

Triggered auto assignment to @mallenexpensify (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details. Please add this bug to a GH project, as outlined in the SO.

@muttmuure muttmuure changed the title [Performance] Add PING for Pusher Connection [Performance] Create PING for Pusher Connection Dec 10, 2024
@muttmuure
Copy link
Contributor Author

#52437

@mallenexpensify
Copy link
Contributor

Checking to see if Deeter wants assignment here. If not I'll add AutoAssignerNewDotQuality to assign an engineer since this is Quality, High status and the other issue is held on this.

Guessing it needs to be Internal

@melvin-bot melvin-bot bot added the Overdue label Dec 13, 2024
@mallenexpensify mallenexpensify added the AutoAssignerNewDotQuality Used to assign quality issues to engineers label Dec 13, 2024
Copy link

melvin-bot bot commented Dec 13, 2024

Triggered auto assignment to @cristipaval (AutoAssignerNewDotQuality)

@melvin-bot melvin-bot bot added Weekly KSv2 and removed Overdue labels Dec 13, 2024
@mallenexpensify mallenexpensify added the Internal Requires API changes or must be handled by Expensify staff label Dec 13, 2024
@mallenexpensify
Copy link
Contributor

mallenexpensify commented Dec 13, 2024

@cristipaval , what's your level of knowledge with Pusher?!??! :)
Added AutoAssignerNewDotQuality cuz it's Quality and status is HIGH

@melvin-bot melvin-bot bot added the Overdue label Dec 16, 2024
@cristipaval
Copy link
Contributor

I don't know much about how Pusher works under the hood. Feel free to recruit someone else if you don't want to wait for me to gather knowledge before I start implementing.

@melvin-bot melvin-bot bot removed the Overdue label Dec 16, 2024
@mallenexpensify
Copy link
Contributor

Thanks @cristipaval , checking in #quality and #engineering to try to recruit someone.

@cristipaval cristipaval removed the Weekly KSv2 label Dec 18, 2024
@tgolen
Copy link
Contributor

tgolen commented Dec 18, 2024

I'd like to take this, so I will assign it to myself.

@tgolen tgolen assigned tgolen and unassigned cristipaval Dec 18, 2024
@tgolen tgolen added Weekly KSv2 and removed Daily KSv2 labels Dec 18, 2024
@tgolen
Copy link
Contributor

tgolen commented Jan 23, 2025

Daily Update

  • The Web-E PR has been merged and deployed to staging
  • I have bumped the App PR to get it reviewed as it has not been reviewed yet

Next Steps

  • Get the App PR merged

ETA

  • Still Friday, Jan 24 (tomorrow)

@tgolen
Copy link
Contributor

tgolen commented Jan 27, 2025

Daily Update

  • The App PR is in final stages of review and should be merged momentarily

Next Steps

  • Wait for the PR to be deployed
  • Check logs to ensure the ping is operated as I expect it to
  • Create a grafana chart to see how long the PINGs take

ETA

  • Not sure when the deploy would happen, but maybe by EOW

@tgolen
Copy link
Contributor

tgolen commented Jan 28, 2025

Daily Update

  • The App PR is still in review and just needs a final review by an internal engineer

Next Steps

ETA

  • Merged tomorrow, Jan. 29

@tgolen
Copy link
Contributor

tgolen commented Jan 29, 2025

Daily Update

Next Steps

  • Wait for it to deploy to production
  • @tgolen look at some logs to ensure it's working like I expect it to
  • @tgolen create a grafana chart to expose the ping latency
  • @tgolen create a grafana chart to expose how often the pusher ping causes clients to go offline

ETA

  • Depending on the timeline of deploys, I imagine this can be closed out early next week.

@melvin-bot melvin-bot bot added Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Daily KSv2 labels Jan 31, 2025
@melvin-bot melvin-bot bot changed the title [Performance] Create PING for Pusher Connection [HOLD for payment 2025-02-07] [Performance] Create PING for Pusher Connection Jan 31, 2025
Copy link

melvin-bot bot commented Jan 31, 2025

Reviewing label has been removed, please complete the "BugZero Checklist".

@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Jan 31, 2025
Copy link

melvin-bot bot commented Jan 31, 2025

The solution for this issue has been 🚀 deployed to production 🚀 in version 9.0.92-6 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2025-02-07. 🎊

For reference, here are some details about the assignees on this issue:

  • @ishpaul777 requires payment (Needs manual offer from BZ)

Copy link

melvin-bot bot commented Jan 31, 2025

@ishpaul777 @mallenexpensify @ishpaul777 The PR fixing this issue has been merged! The following checklist (instructions) will need to be completed before the issue can be closed. Please copy/paste the BugZero Checklist from here into a new comment on this GH and complete it. If you have the K2 extension, you can simply click: [this button]

@tgolen
Copy link
Contributor

tgolen commented Jan 31, 2025

Daily Update

  • This PR caused a deploy blocker because it was marking clients as being offline more than normal
  • I submitted a PR that only turned off the part of the ping that forces the client into offline mode

Next Steps

  • I need to dig in and analyze the logs to understand what was going on, then formulate a plan for how to prevent false alarms from happening

ETA

  • Friday, Feb. 7

@melvin-bot melvin-bot bot added Daily KSv2 Reviewing Has a PR in review Weekly KSv2 and removed Weekly KSv2 Daily KSv2 labels Feb 6, 2025
@tgolen
Copy link
Contributor

tgolen commented Feb 6, 2025

Update

  • Submitted a new E/App PR to relax some of the PONG requirements to try and dial in the PING PONG mechanism further
  • It still doesn't change the online/offline state of the client yet (that's currently disabled)
  • My goal is to keep an eye on logs and graphs to see if the behavior is less extreme

Next Steps

  • Have that PR deployed
  • @tgolen monitor logs and graphs

ETA

  • Next Friday, Feb. 14

Copy link

melvin-bot bot commented Feb 7, 2025

Payment Summary

Upwork Job

BugZero Checklist (@mallenexpensify)

  • I have verified the correct assignees and roles are listed above and updated the neccesary manual offers
  • I have verified that there are no duplicate or incorrect contracts on Upwork for this job (https://www.upwork.com/ab/applicants//hired)
  • I have paid out the Upwork contracts or cancelled the ones that are incorrect
  • I have verified the payment summary above is correct

@mallenexpensify
Copy link
Contributor

I see payment was due a few days ago but I feel like there's more work to do. @ishpaul777 ,
Is there more than just this PR?

@ishpaul777
Copy link
Contributor

ishpaul777 commented Feb 11, 2025

This is #55326 the only one i reviewed,. @tgolen can you confirm if there will be more E/App PR planned where I can help? if no I think we can release the payment

@tgolen
Copy link
Contributor

tgolen commented Feb 11, 2025 via email

@mallenexpensify
Copy link
Contributor

@ishpaul777 can you please accept the job and reply here once you have?
https://www.upwork.com/jobs/~021889470488141539583

@ishpaul777
Copy link
Contributor

There might be a couple more frontend PRs, but I think they should all be separate from this.

happy to help! whenever required 😄


@ishpaul777 can you please accept the job and reply here once you have?
https://www.upwork.com/jobs/~021889470488141539583

Accepted, Thanks!

@mallenexpensify
Copy link
Contributor

Upwork is buggy now so I can't pay, will get to next week, sorry @ishpaul777

@mallenexpensify
Copy link
Contributor

Contributor+: @ishpaul777 paid $250 via Upwork

I don't think we need a regression test here, comment if you disagree and I can get one created.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AutoAssignerNewDotQuality Used to assign quality issues to engineers Awaiting Payment Auto-added when associated PR is deployed to production Bug Something is broken. Auto assigns a BugZero manager. Engineering Internal Requires API changes or must be handled by Expensify staff Reviewing Has a PR in review Weekly KSv2
Projects
Status: Done
Development

No branches or pull requests

5 participants