Startup optimization in shrink - don't shrink non-shrinkable slots #17405
Conversation
force-pushed from edaa72d to 409af0d
Codecov Report

```diff
@@           Coverage Diff            @@
##           master   #17405   +/-   ##
========================================
  Coverage    82.6%    82.7%
========================================
  Files         425      425
  Lines      118840   118543   -297
========================================
- Hits        98269    98057   -212
+ Misses      20571    20486    -85
```
force-pushed from d47e292 to 395ada1
runtime/src/accounts_db.rs (outdated)
```rust
drop(accounts_index_map_lock);
index_read_elapsed.stop();
let aligned_total: u64 = self.page_align(alive_total);

// If not saving memory maps or more than 1 page of data, then skip the shrink
if num_stores == 1 && aligned_total + PAGE_SIZE >= original_bytes {
```
I think it's also a good idea to extract this into a reusable `is_bytes_saved_sufficient_for_shrink()` that can be added to the check where we add to the `shrink_candidates`: https://github.com/solana-labs/solana/blob/master/runtime/src/accounts_db.rs#L4592-L4607. Otherwise, we'll continually add to the shrink candidates and incur the overhead of running/scanning at the beginning of this `do_shrink_slot_stores()` function as we delete accounts from the same slot, even if those slots continually fail this new `aligned_total + PAGE_SIZE >= original_bytes` check.
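A minimal sketch of the suggested helper, assuming it only needs the three values already used in the inline check (the name comes from the comment above; the signature and placement are assumptions, not the actual accounts_db code):

```rust
const PAGE_SIZE: u64 = 4096;

/// Hedged sketch: returns true when shrinking would actually save at
/// least one page of data. Name taken from the review comment; this is
/// not the real accounts_db API.
fn is_bytes_saved_sufficient_for_shrink(
    num_stores: usize,
    aligned_alive_bytes: u64,
    original_bytes: u64,
) -> bool {
    // Mirrors the inline check: a single store whose page-aligned live
    // bytes are within one page of the original size saves nothing.
    !(num_stores == 1 && aligned_alive_bytes + PAGE_SIZE >= original_bytes)
}

fn main() {
    // 4096 live bytes out of 8192 total: 4096 + PAGE_SIZE >= 8192, skip.
    assert!(!is_bytes_saved_sufficient_for_shrink(1, 4096, 8192));
    // 4096 live bytes out of 16384 total: shrinking saves two pages.
    assert!(is_bytes_saved_sufficient_for_shrink(1, 4096, 16384));
    println!("ok");
}
```

Such a helper could then gate both the candidate-selection site and the early check in `do_shrink_slot_stores()` with the same condition.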
Also, just out of curiosity, do we know the snapshot sizes before and after? Theoretically, if every slot had an extra `PAGE_SIZE = 4096` bytes, we might see an extra 432000 slots * 4096 bytes ~= 1.8GB of raw size before compression, but I wonder if the results reflect this.
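The back-of-the-envelope arithmetic above can be checked directly (illustrative only; 432000 is the mainnet-beta slots-per-epoch figure used in the comment):

```rust
const PAGE_SIZE: u64 = 4096; // bytes per page, as used in accounts_db.rs
const SLOTS_PER_EPOCH: u64 = 432_000; // mainnet-beta epoch length, from the comment

// Worst-case extra raw snapshot size if every slot carried one extra page.
fn extra_snapshot_bytes() -> u64 {
    SLOTS_PER_EPOCH * PAGE_SIZE
}

fn main() {
    let bytes = extra_snapshot_bytes();
    // 432000 * 4096 = 1,769,472,000 bytes, i.e. roughly 1.8 GB raw.
    println!("{} bytes ~= {:.1} GB", bytes, bytes as f64 / 1e9);
}
```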
I ran this for a while on mainnet; snapshot sizes didn't seem to be affected:

```
Every 2.0s: ls -alh mainnet-beta/validator-ledger/snapshot-*
-rw-rw-r-- 1 stephen_solana_com stephen_solana_com 6.3G May  4 15:00 mainnet-beta/validator-ledger/snapshot-76776676-EcV4U9gqqdXuhqgP69WaLhWmoyDQcJYxM1ZE5YuSspdF.tar.zst
-rw-rw-r-- 1 stephen_solana_com stephen_solana_com 6.6G May 27 15:13 mainnet-beta/validator-ledger/snapshot-80249526-9NwTfazvvgmdb4Wr8VnAhC1kEWqnU9APZjfpmrktjpRS.tar.zst
-rw-rw-r-- 1 stephen_solana_com stephen_solana_com 6.6G May 27 15:18 mainnet-beta/validator-ledger/snapshot-80249917-2WYkAHaKCYFEyCtA8gBF4Qxh6iNbrcxRbXUrqUQSvPai.tar.zst
```
Looks great 😃
```rust
if num_stores == 1 && aligned_total + PAGE_SIZE >= original_bytes {
    for pubkey in unrefed_pubkeys {
        if let Some(locked_entry) = self.accounts_index.get_account_read_entry(pubkey) {
            locked_entry.addref();
```
I wonder if we could avoid this `addref()` by either:

1. Removing the earlier `unref()`, and instead just iterating through the `unrefed_pubkeys` and calling `unref()` after we've passed the `aligned_total + PAGE_SIZE >= original_bytes` check above. The downside is an extra pass through the `unrefed_pubkeys` keys.

2. We have the utility methods `store.alive_bytes()` and `store.total_bytes()` for tracking the alive/dead sizes. These should be accurate as long as a full `clean()` runs before `shrink()`, which I think is true in `verify_snapshot_bank()`. Moreover, `clean()` needs to run before `shrink()` for the current `is_alive` check in shrink to be accurate anyway:

```rust
let is_alive = locked_entry.slot_list().iter().any(|(_slot, i)| {
    i.store_id == stored_account.store_id
        && i.offset == stored_account.account.offset
});
```

Maybe using these we could move the `aligned_total + PAGE_SIZE >= original_bytes` check earlier in the function, before the scan, and exit there if the check fails (which would also avoid the overhead of the additional scan through the storages). Something like:

```rust
if num_stores == 1
    && self.page_align(store.alive_bytes() as u64) + PAGE_SIZE >= store.total_bytes()
{
    return 0;
}
```
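Option 1 (deferring the unref until after the size check) might look roughly like this self-contained sketch; the types below are stand-ins for illustration, not the real `accounts_db` structures:

```rust
use std::collections::HashMap;

const PAGE_SIZE: u64 = 4096;

/// Stand-in for the accounts index refcounts (not the real AccountsIndex).
struct ToyIndex {
    refcounts: HashMap<String, u64>,
}

impl ToyIndex {
    fn unref(&mut self, pubkey: &str) {
        if let Some(c) = self.refcounts.get_mut(pubkey) {
            *c = c.saturating_sub(1);
        }
    }
}

/// Sketch of option 1: collect candidates during the scan, then only
/// unref once the shrink is known to be worthwhile, so no compensating
/// addref() is ever needed on the skip path.
fn shrink_if_productive(
    index: &mut ToyIndex,
    unrefed_pubkeys: &[String],
    num_stores: usize,
    aligned_total: u64,
    original_bytes: u64,
) -> bool {
    if num_stores == 1 && aligned_total + PAGE_SIZE >= original_bytes {
        return false; // skip: refcounts were never touched
    }
    // The extra pass over unrefed_pubkeys, the downside noted above.
    for pubkey in unrefed_pubkeys {
        index.unref(pubkey);
    }
    true
}

fn main() {
    let mut index = ToyIndex {
        refcounts: HashMap::from([("a".to_string(), 2)]),
    };
    let keys = vec!["a".to_string()];

    // Not worth shrinking (8000 + 4096 >= 8192): refcounts stay intact.
    assert!(!shrink_if_productive(&mut index, &keys, 1, 8000, 8192));
    assert_eq!(index.refcounts["a"], 2);

    // Worth shrinking: the deferred unref runs exactly once.
    assert!(shrink_if_productive(&mut index, &keys, 1, 4096, 16384));
    assert_eq!(index.refcounts["a"], 1);
    println!("ok");
}
```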
Right, for 1: I am thinking this is the not-common case, since we queued it up to shrink and we have other signals (like alive_bytes/written_bytes) that tell us shrinking is likely productive. So we should optimize for the case where we are actually doing the shrink.

Clean does run before this shrink. I was thinking those values may not be accurate, hence the location of this check, but I think you might be right; they might actually be somewhat accurate.
lgtm
Pull request has been modified.
force-pushed from 4951a62 to afa4ac4
force-pushed from df846bb to 7a43a8d
@carllin any other concerns on this?
Did one final pass, looks good!
(cherry picked from commit 14c52ab)

# Conflicts:
#	runtime/src/accounts_db.rs
…ackport #17405) (#17792)

* Skip shrink when it doesn't save anything (#17405)

(cherry picked from commit 14c52ab)

# Conflicts:
#	runtime/src/accounts_db.rs

* fix merge error

Co-authored-by: sakridge <[email protected]>
Co-authored-by: Jeff Washington (jwash) <[email protected]>
Problem

Startup shrink and clean operations are less efficient than they could be.

Summary of Changes

- Skip the shrink during `generate_index` when it doesn't save anything

Fixes #