ChainDB: batch garbage collections #1932

Merged
merged 3 commits into from
Apr 10, 2020

Conversation

mrBliss
Contributor

@mrBliss mrBliss commented Apr 10, 2020

Previously, we scheduled a separate garbage collection for each block, 10
seconds in the future. This meant that the queue of scheduled GCs was
blocks/s * 10 entries long. When tracing the queue length on my machine, it
hovered between 5000 and 6000 entries. Moreover, VolatileDB garbage
collections were triggered at a rate of blocks/s, which likely caused a lot of
contention for the VolatileDB state.

Even worse is that a 10 second delay is too short to reliably ensure the block
will have been flushed to disk (in the ImmutableDB) before it is garbage
collected. However, increasing this delay would make the queue significantly
longer.
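
As a rough illustration of why a longer delay hurts under per-block
scheduling, the arithmetic can be sketched in Haskell. The function name
`perBlockQueueLength` is hypothetical and not part of ouroboros-consensus; it
only restates the "blocks/s * delay" estimate from the description.

```haskell
-- Hypothetical helper for illustration only, not part of ouroboros-consensus.
--
-- Approximate steady-state length of the old GC queue, which held one entry
-- per block added during the last 'delaySeconds' seconds.
perBlockQueueLength
  :: Int  -- ^ Blocks added per second
  -> Int  -- ^ GC delay in seconds
  -> Int
perBlockQueueLength blocksPerSecond delaySeconds =
  blocksPerSecond * delaySeconds
```

The traced 5000 to 6000 entries at a 10 second delay correspond to roughly 500
to 600 blocks/s. At that rate, a 60 second delay (an example value) would give
`perBlockQueueLength 500 60` to `perBlockQueueLength 600 60`, i.e., 30000 to
36000 entries.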

To fix these issues, we introduce a GC interval (in seconds). All GCs whose
scheduled time falls in the same interval are batched together into a single
GC. This means that the queue length is now at most ⌈delay / interval⌉ + 1,
e.g., ⌈60s / 10s⌉ + 1 = 7, which is much shorter than 5000-6000. Moreover,
there will be at most one GC every interval seconds, e.g., every 10s.
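
The batching idea can be sketched as follows. This is a minimal model under
stated assumptions, not the actual ChainDB code: time is modelled as whole
seconds, the queue is a `Data.Map` keyed by the rounded GC time, and the names
`roundUpTo`, `GcQueue`, `scheduleGC` and `dueGCs` are hypothetical.

```haskell
import qualified Data.Map.Strict as Map

-- Assumption: times and durations are whole seconds.
type Seconds = Int

-- | Round a time up to the next multiple of the interval
-- (a time that is already a multiple stays unchanged).
roundUpTo :: Seconds -> Seconds -> Seconds
roundUpTo interval t = ((t + interval - 1) `div` interval) * interval

-- | Scheduled GCs, at most one entry per rounded GC time. The value is the
-- highest slot number to garbage collect up to (a hypothetical payload).
type GcQueue = Map.Map Seconds Int

-- | Schedule a GC for a block added at time 'now'. All blocks whose rounded
-- GC time coincides share one queue entry; the highest slot wins.
scheduleGC :: Seconds -> Seconds -> Seconds -> Int -> GcQueue -> GcQueue
scheduleGC delay interval now slot =
  Map.insertWith max (roundUpTo interval (now + delay)) slot

-- | Split off the entries that are due at or before 'now'.
dueGCs :: Seconds -> GcQueue -> ([(Seconds, Int)], GcQueue)
dueGCs now queue = (Map.toAscList due, rest)
  where
    (due, rest) = Map.spanAntitone (<= now) queue
```

Because every key is a multiple of the interval and lies between the current
time and the current time plus the delay (rounded up), the map never holds
more than ⌈delay / interval⌉ + 1 entries, regardless of how many blocks are
added per second.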

The cost of switching to a longer GC delay is that the in-memory index of the
VolatileDB will be larger, making operations on it, such as lookups, more
expensive (most operations are O(log(n))). See the docstring of
'defaultSpecificArgs' for what the new default values of gcDelay and
gcInterval mean in practice.

mrBliss added 3 commits April 9, 2020 10:03
@mrBliss mrBliss added the consensus issues related to ouroboros-consensus label Apr 10, 2020
@mrBliss mrBliss requested a review from edsko April 10, 2020 06:57
@mrBliss
Contributor Author

mrBliss commented Apr 10, 2020

bors merge

@iohk-bors
Contributor

iohk-bors bot commented Apr 10, 2020

@iohk-bors iohk-bors bot merged commit 5fc0470 into master Apr 10, 2020
@iohk-bors iohk-bors bot deleted the mrBliss/batch-gc branch April 10, 2020 08:03
coot pushed a commit that referenced this pull request May 16, 2022
1932: ChainDB: batch garbage collections r=mrBliss a=mrBliss

Co-authored-by: Thomas Winant <[email protected]>