Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: minimize clones when saving blocks #12870

Merged
merged 1 commit into from
Nov 27, 2024

Conversation

hai-rise
Copy link
Contributor

@hai-rise hai-rise commented Nov 26, 2024

Read another page of samply while waiting for a meeting today 😅.

image

There seem to be a lot of these (redundant allocations) in mempool, persistence, and root calculation hot paths. One PR at a time 🙏.

@mattsse
Copy link
Collaborator

mattsse commented Nov 26, 2024

very nice, appreciate this so much, ty

Copy link
Collaborator

@shekhirin shekhirin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that Arc::unwrap_or_clone will unwrap in these cases? From what I can see, persistence codepath gets the block from the in-memory state, which in turn isn't pruned at the same time as the persistence is advanced, and is pruned later.

Copy link
Collaborator

@shekhirin shekhirin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me if Arc::unwrap_or_clone actually unwraps, deferring to engine people here.

@shekhirin shekhirin added C-perf A change motivated by improving speed, memory usage or disk footprint A-engine Related to the engine implementation A-db Related to the database labels Nov 26, 2024
Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

smol nit, pending sanity review @klkvr

@@ -1,3 +1,5 @@
use std::sync::Arc;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please no more newline separations, this is driving me bananas 🙃

Suggested change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 🥲.

@@ -155,15 +157,15 @@ where
+ StaticFileProviderFactory,
{
/// Writes executed blocks and receipts to storage.
pub fn save_blocks(&self, blocks: &[ExecutedBlock]) -> ProviderResult<()> {
pub fn save_blocks(&self, blocks: Vec<ExecutedBlock>) -> ProviderResult<()> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo this makes sense

Comment on lines 207 to 208
// TODO: These writes should take ownership instead of cloning through
// the reference just to (likely) drop the original afterwards.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unclear what this todo means, what is todo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant write_hashed_state and write_trie_updates should take the full data instead of a reference just to clone it within. That only makes sense if we want to persist the data afterwards, then we only need to clone the values instead of the whole structure. However, in this case, we don't need to persist the input data so passing the whole thing in would avoid any cloning. Should I elaborate or just remove the comment for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the comment. Will just do it in upcoming PRs that give better context anyway.

@hai-rise
Copy link
Contributor Author

hai-rise commented Nov 26, 2024

Are we sure that Arc::unwrap_or_clone will unwrap in these cases? From what I can see, persistence codepath gets the block from the in-memory state, which in turn isn't pruned at the same time as the persistence is advanced, and is pruned later.

Yeah, that's why I chose "minimize" over "remove" like in previous PRs. I think the rule of thumb is to always use Arc::unwrap_or_clone to not need to worry about a more global scope or how each function caller manages input references. The extra reference counting check is fairly cheap.

That said, looking at persist_block_after_db_tx_creation -> remove_persisted_blocks was very helpful! If we're sure successful persistence is the hot path then we can remove (with ownership instead of cloning a get) blocks from memory to persist to disk, and re-insert into memory on persistence failures. Unless we still need the blocks for something before they're persisted.

Edited: We don't need to re-insert as persistence failures are critical and shut down the node?

@mattsse mattsse merged commit acfcfbd into paradigmxyz:main Nov 27, 2024
34 of 40 checks passed
@hai-rise hai-rise deleted the save-blocks branch November 27, 2024 08:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-db Related to the database A-engine Related to the engine implementation C-perf A change motivated by improving speed, memory usage or disk footprint
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants