
Ensure txPricedList reheaps regularly (avoid tx memory leak) #1816

Merged

hbandura merged 3 commits into master from hbandura/always_reheap_master on Jan 25, 2022

Conversation

@hbandura (Contributor) commented Jan 25, 2022

This PR fixes a memory leak found while running v1.5.0 nodes before the Espresso fork.

txPricedList keeps both a list of all txs in the pool and a heap for fast access to the cheapest ones. The heaps are not always synchronized with the tx list: they grow together with the list, but they do not shrink together with it. Instead, the txpool counts how many heap entries are 'stale', and once a critical threshold is crossed, Reheap() is called, releasing the heaps' memory and rebuilding them from scratch.
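For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the stale-counter idea (toy types, not the actual celo-blockchain txPricedList; the real code keeps two heaps, urgent and floating, and far more state): removals never touch the heap directly, they only bump a counter, and the heap memory is reclaimed only when that counter crosses a threshold.

```go
package main

import (
	"container/heap"
	"fmt"
)

// priceHeap is a toy min-heap of gas prices standing in for the real
// urgent/floating heaps inside txPricedList.
type priceHeap []int64

func (h priceHeap) Len() int            { return len(h) }
func (h priceHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h priceHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *priceHeap) Push(x interface{}) { *h = append(*h, x.(int64)) }
func (h *priceHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// pricedList mirrors the shape of the real structure: an authoritative set of
// live entries plus a heap that is allowed to drift out of sync between reheaps.
type pricedList struct {
	all       map[int64]bool // authoritative lookup of live entries
	heap      priceHeap      // may still contain entries removed from all
	stales    int64          // heap entries known to be stale
	maxStales int64          // ceiling before a forced Reheap
}

// Removed mimics txPricedList.Removed: it never shrinks the heap itself, it
// only counts stales and triggers a Reheap once the count crosses the ceiling.
func (l *pricedList) Removed(count int) {
	l.stales += int64(count)
	if l.stales > l.maxStales {
		l.Reheap()
	}
}

// Reheap rebuilds the heap from the authoritative set, releasing the memory
// held by stale entries.
func (l *pricedList) Reheap() {
	l.heap = make(priceHeap, 0, len(l.all))
	for price := range l.all {
		l.heap = append(l.heap, price)
	}
	heap.Init(&l.heap)
	l.stales = 0
}

func main() {
	l := &pricedList{all: map[int64]bool{}, maxStales: 2}
	for _, p := range []int64{5, 3, 8, 1} {
		l.all[p] = true
		heap.Push(&l.heap, p)
	}
	// Two entries leave the pool, but only the stale counter records it:
	// the heap keeps holding their memory.
	delete(l.all, 8)
	delete(l.all, 1)
	l.Removed(2)
	fmt.Println("heap size:", l.heap.Len(), "stales:", l.stales) // 4, 2
	// One more removal crosses maxStales and forces a Reheap.
	delete(l.all, 5)
	l.Removed(1)
	fmt.Println("heap size:", l.heap.Len(), "stales:", l.stales) // 1, 0
}
```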

The issue in the code was that the critical point was never reached, so the heaps grew indefinitely. The behavior happens to be different after Espresso HF activation, since from then on Reheap() is called periodically whenever the base fee changes; before the HF nothing forces a reheap, so the heaps grow forever. One of the reasons the critical point was never reached is that the tx pool does not respect the contract of calling Removed every time it should.

The issue seems to be similar to ethereum/go-ethereum#23690, which was worked around by reheaping after every pool reorg. Here we also add a couple of safeguards based on the size of the tx pool to ensure that, even if the node is stuck (due to a bad block or any other reason), proper reheaps are triggered when necessary.
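Concretely, the safeguards amount to two independent triggers: a relative one (the 25% rule quoted later in this thread) and an absolute ceiling (wired to GlobalSlots in the diff below). A simplified reconstruction of the combined check, for illustration rather than copied verbatim from the celo code:

```go
package txpool

// shouldReheap reproduces, in isolation, the two safeguards described above.
// urgentSize and floatingSize are the current sizes of the two price heaps and
// maxStales is the absolute ceiling (set to pool.config.GlobalSlots in the
// constructor change shown further down). Simplified reconstruction, not the
// verbatim celo-blockchain code.
func shouldReheap(stales, maxStales int64, urgentSize, floatingSize int) bool {
	// Reheap if the ratio of stales is more than 25% of the heaps sizes.
	overStalesRatio := int(stales) > (urgentSize+floatingSize)/4
	// Reheap if the absolute number of stales crosses the ceiling, no matter
	// how large the heaps have grown.
	overMaxStales := stales > maxStales
	return overStalesRatio || overMaxStales
}
```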

A follow-up investigation should produce an exhaustive list of the reasons why the desynchronization occurs. One example is the method demoteUnexecutables, where many txs are removed from the pool but not from the priced list.

@hbandura hbandura requested a review from a team as a code owner January 25, 2022 20:51
@hbandura hbandura requested review from gastonponti, piersy, mcortesi and nategraf and removed request for a team January 25, 2022 20:51
@piersy (Contributor) commented Jan 25, 2022

Coverage from tests in ./e2e_test/... for ./consensus/istanbul/... at commit e243a36

coverage: 49.4% of statements across all listed packages
coverage:  65.9% of statements in consensus/istanbul
coverage:  39.8% of statements in consensus/istanbul/announce
coverage:  54.5% of statements in consensus/istanbul/backend
coverage:   0.0% of statements in consensus/istanbul/backend/backendtest
coverage:  24.3% of statements in consensus/istanbul/backend/internal/replica
coverage:  65.3% of statements in consensus/istanbul/core
coverage:  45.0% of statements in consensus/istanbul/db
coverage:   0.0% of statements in consensus/istanbul/proxy
coverage:  75.3% of statements in consensus/istanbul/uptime
coverage: 100.0% of statements in consensus/istanbul/uptime/store
coverage:  51.8% of statements in consensus/istanbul/validator
coverage:  79.2% of statements in consensus/istanbul/validator/random

@diwu1989 (Contributor) commented Jan 25, 2022

the ratio:

// Reheap if the ratio of stales is more than 25% of the heaps sizes
overStalesRatio := int(stales) > (urgentSize+floatingSize)/4

is it possible for the 4 to be taken in as a geth argument/flag, with the default being 4? in case the 25% threshold is not ideal, we can easily tune it without recompiling

or even make maxStales a flag of its own that can be passed in, whichever is the cleanest way to have some manual control over the frequency of these intentional reheaps.
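(For illustration only: if these knobs were ever exposed, the natural home would be the pool configuration rather than a hard-coded constant. The sketch below is hypothetical; StalesDenominator and MaxStales are not existing geth/celo flags or TxPoolConfig fields, they just show the shape of the suggestion.)

```go
package txpool

// reheapConfig is a purely hypothetical set of knobs sketching the suggestion
// above; neither field exists in the real TxPoolConfig or geth flag set today.
type reheapConfig struct {
	StalesDenominator int   // 4 means "reheap once stales exceed 25% of the heaps"
	MaxStales         int64 // absolute ceiling, e.g. wired to GlobalSlots
}

// shouldReheap is the same check as above, with the constants pulled from
// configuration so they could be tuned without recompiling.
func (c reheapConfig) shouldReheap(stales int64, urgentSize, floatingSize int) bool {
	return int(stales) > (urgentSize+floatingSize)/c.StalesDenominator ||
		stales > c.MaxStales
}
```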

@mcortesi (Contributor) left a comment

LGTM

@@ -336,7 +336,8 @@ func NewTxPool(config TxPoolConfig, chainconfig *params.ChainConfig, chain block
 	pool.reset(nil, chain.CurrentBlock().Header())
 	// TODO: Does this reordering cause a problem?
 	// priced list depends on the current ctx which is set in reset
-	pool.priced = newTxPricedList(pool.all, &pool.currentCtx)
+	// Use the global slots as the max amount of stale transactions in the priced heap before a re-heap.
+	pool.priced = newTxPricedList(pool.all, &pool.currentCtx, int64(pool.config.GlobalSlots))
Contributor

I guess the logic here is that we can have as many stales as there are txs in the pool, if we consider that a tx usually takes 1 slot, which I'm not sure is true on average.

Still, it works as a default number, just a bit awkward.

Contributor Author

If stales > GlobalSlots and a reheap hadn't happened before that point, it means the heap sizes are at a minimum 4x that number (due to the 25% rule), so something is off. A lower number could be set, but I feel like as a ceiling it should be OK.
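To put numbers on that (assuming the upstream default of GlobalSlots = 4096): the 25% rule only stays quiet while stales <= (urgentSize+floatingSize)/4, so reaching stales > 4096 without a reheap would imply the two heaps together hold more than 16,384 entries for a pool whose pending section is capped at 4,096 slots, which is exactly the kind of desynchronization this ceiling is meant to catch.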

@hbandura (Contributor, Author) commented

> the ratio:
>
> // Reheap if the ratio of stales is more than 25% of the heaps sizes
> overStalesRatio := int(stales) > (urgentSize+floatingSize)/4
>
> is it possible for the 4 to be taken in as a geth argument/flag, with the default being 4? in case the 25% threshold is not ideal, we can easily tune it without recompiling
>
> or even make maxStales a flag of its own that can be passed in, whichever is the cleanest way to have some manual control over the frequency of these intentional reheaps.

While adding them as arguments is possible, it probably adds more noise than anything, since Reheap will now run on every pool reorg with a reset, which happens on every ChainHeadEvent, i.e. every new block. We could argue that it's not even needed, except for extreme cases where the blockchain is not advancing.
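As a rough sketch of that per-block path (the surrounding types are stubs and the body is heavily simplified; this shows the mechanism, not the actual celo diff):

```go
package txpool

// Minimal stubs so the sketch stands alone; the real types live in
// core/tx_pool.go and carry far more state. Everything here is assumed.
type pricedHeaps interface{ Reheap() }

type pool struct{ priced pricedHeaps }

type resetRequest struct{} // carries the old and new heads in the real code

// runReorg sketches the point above: every ChainHeadEvent funnels into a pool
// reorg with a non-nil reset, and ending that path with a Reheap means the
// priced heaps are rebuilt at least once per block, even before the Espresso
// base-fee logic forces periodic reheaps on its own.
func (p *pool) runReorg(reset *resetRequest) {
	if reset != nil {
		// ...re-derive pending/queued transactions against the new head...
		p.priced.Reheap() // drop every stale entry accumulated since the last block
	}
}
```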

@hbandura hbandura merged commit 3e312ce into master Jan 25, 2022
@hbandura hbandura deleted the hbandura/always_reheap_master branch January 25, 2022 22:54
gastonponti pushed a commit that referenced this pull request Jan 26, 2022
gastonponti pushed a commit that referenced this pull request Jan 26, 2022