-
Notifications
You must be signed in to change notification settings - Fork 20.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closes #20582. Non-executable transactions should not be evicted. #21119
Closes #20582. Non-executable transactions should not be evicted. #21119
Conversation
@karalabe - I'm thinking it may be a good idea to spin this up on one of the bootnodes. Want me to do that? |
@holiman Pls do. The PR seems tiny enough, will try to review tomorrow. |
I'll rebase it first |
953b9fe
to
72ece8e
Compare
This PR lgtm, has been running stable for a while now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
the Now we use And for the former, it's even worse. We only record the beat when we successfully promote a transaction. But if a user caches lots of queue transaction but never successfully promote any one. Then the queue transactions belong to this user will never be dropped. |
@rjl493456442
Yes, this would be ideal.
You are right, if an account continues to have enqueued transactions yet it never makes it to pending, then these queued transactions will remain while they continue to actively include new queued transactions. However, in the current scenario, these just get dropped on every eviction tick. A more optimal solution would be to:
If you would like, I'm happy to put up a PR on this optimal solution (I've been in the txpool code for a while at this point and wouldn't mind). |
I'll update the PR soon with the optimal solution and an updated explanation - it doesn't appear to add much complexity. Sorry for the late reply on this thread - I missed the notifications. |
4174a3a
to
5cee963
Compare
@holiman & @rjl493456442 I've made the updates described above and this is ready for review. The pool is stable locally with the updates. I've updated to:
I've added thorough testing to check the edge cases. Also, I updated |
core/tx_pool.go
Outdated
@@ -363,7 +366,10 @@ func (pool *TxPool) loop() { | |||
// Any non-locals old enough should be removed | |||
if time.Since(pool.beats[addr]) > pool.config.Lifetime { | |||
for _, tx := range pool.queue[addr].Flatten() { | |||
pool.removeTx(tx.Hash(), true) | |||
if time.Since(pool.queued_ts[tx.Hash()]) > pool.config.Lifetime { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we get a mismatch between what's in pool.queued_tx
and other pools, the pool.queued_ts[tx.Hash()]
will return 0
, and flush out the tx. I suppose that is indeed the correct way to handle that error.
For testing though, it might be interesting to make it panic
, and run it for a few days with a panic
in there, just to see that we don't have any inconsistencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just updated this to panic, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I was a bit unclear. We can't merge it to master
with a panic
in there, it was more of a note-to-self (or whoever puts it on a prod server) to put a panic check there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll drop the last commit and rebase it on master, ok?
core/tx_pool.go
Outdated
@@ -363,7 +367,13 @@ func (pool *TxPool) loop() { | |||
// Any non-locals old enough should be removed | |||
if time.Since(pool.beats[addr]) > pool.config.Lifetime { | |||
for _, tx := range pool.queue[addr].Flatten() { | |||
pool.removeTx(tx.Hash(), true) | |||
if pool.queued_ts[tx.Hash()].IsZero() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@holiman here we include the check and panic.
… on each tick if there are no promote transactions or if a pending/reset empties the pending list. Tests and logging expanded to handle these cases in the future.
…heartbeat on promotion or pending replaced
2423562
to
45334d9
Compare
@holiman updates applied 💯 |
Great! Did you by any chance run it with panics enabled yourself?
|
Yes I did for a few days and it didn't panic. |
Closed in favour of #21300 |
Closes #20582.
Description
Queued transactions should not be evicted until after
evictionInterval
has passed. Successful promotion of transactions viapromoteExecutables
, pending replacement viaadd(...)
, or insertion into the queue viaenqueueTx
should update the last heartbeat. The heartbeat is no longer removed when thepending
list is emptied or after a minute has passed in the future queue.Tests are expanded to ensure queued transactions are not removed before the
evictionInterval
duration has passed. Additionally, tests are expanded to ensure a reset or emptied pending list does not clear out the heart beat.Logging is expanded to detect evictions as a method to prevent/monitor issues in the future.
Monitoring Before & After
We see a 5x reduction in queue evictions and the queue is able to reach capacity. Previously, I was seeing 5 tx/s evicted from the queue, that has been reduced to 1 tx/s evicted. Additionally, the queue is able to grow to its limit, from ~500 to 1024 in the case of my config.
Before:


After:


Questions
There could be a case for removing heartbeat updates on pending replaced and in promote executables, however, it seems optimal to keep the queue fresh when new activity is seen and to manage ordering appropriately on truncation.