-
Notifications
You must be signed in to change notification settings - Fork 4.7k
Conversation
Codecov Report
@@ Coverage Diff @@
## master #18552 +/- ##
=======================================
Coverage 82.7% 82.7%
=======================================
Files 436 437 +1
Lines 123512 123549 +37
=======================================
+ Hits 102163 102194 +31
- Misses 21349 21355 +6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 2 paths that are writing votes to crds table:
ClusterInfo::push_vote
:
https://github.com/solana-labs/solana/blob/761de8b1a/gossip/src/cluster_info.rs#L1047-L1092ClusterInfo::refresh_vote
:
https://github.com/solana-labs/solana/blob/761de8b1a/gossip/src/cluster_info.rs#L1094-L1126
Both are computing the vote_index
under a gossip read lock, releasing the lock and then writing the votes to crds table under a 2nd write lock. So technically they have a race condition with each other where both may compute the same vote-index and overwrite each other.
Last time I checked both paths are called from the same thread, and so to my knowledge we are not hitting that race condition. But this commit is adding a new thread, so making the race condition a runtime possibility.
One possibility is just to compute the vote-index under an exclusive lock and hold on to that lock until the vote is written to the crds table. There are only 32 lookups to compute vote-index, so maybe it would not be that bad.
But, can we please hold on this approach until I can finish some tests with gossip locking and see where those go? Adding a new thread adds its own subtle complexities like the race condition above. I would rather use this separate thread approach as a last resort.
Yea, let’s see the results of the locking optimization. couldn’t we just use the same thread to do both the writes? They would be ordered by the channel, right? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM aside from the fact that it is adding a new thread, and I would prefer exhaust all optimaztions before adding new threads. Your call to merge this now or wait until gossip lock reworks are complete, and see if this is still needed.
Pull request has been modified.
(cherry picked from commit 0f8bcf6) # Conflicts: # core/src/replay_stage.rs # core/src/tvu.rs
Co-authored-by: sakridge <[email protected]>
This reverts commit 0f8bcf6.
This reverts commit 0f8bcf6.
This reverts commit 0f8bcf6.
This reverts commit 0f8bcf6.
This reverts commit 0f8bcf6.
This reverts commit 0f8bcf6.
This reverts commit 0f8bcf6.
This reverts commit 0f8bcf6.
This reverts commit 0f8bcf6.
Problem
Voting is slow and blocks replay of forks.
Summary of Changes
Push voting to another thread
Fixes #