-
Notifications
You must be signed in to change notification settings - Fork 2.7k
try-runtime-cli: parallel key scraping #14090
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for having a go at this, added some comments
…ndex out of bounds issues (#14085)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding these tests, they'll help a lot getting this over the line.
I see three areas for improvement:
- A
StorageKey
is a base-16, meaning we need to cover0
through to15
(0
->f
in hex) to make sure we cover the entire prefix. The distance between each tuple of keys should be as equal as possible, while covering0
through15
. - The prefix needs to remain the same on your output keys, notice I added the extra zero in my example below. Currently you're actually iterating over the entire
0,0,0
prefix just in(StorageKey(vec![0, 0, 0]), StorageKey(vec![0, 0, 1])),
. - It probably needs a check to make sure the function won't spit out invalid storage keys if a prefix is passed that cannot be subdivided further.
Example of 1, 2 changes applied to your num_parts = 5 test: the test should output something more like this:
// prefix is 0000. so we want to cover 0000a,0000b,...,0000e,0000f.
assert_eq!(
result,
vec![
// covers 00000 - 00002
(StorageKey(vec![0, 0, 0, 0, 0]), StorageKey(vec![0, 0, 0, 0, 2])),
// covers 00003 - 00005
(StorageKey(vec![0, 0, 0, 0, 3]), StorageKey(vec![0, 0, 0, 0, 5])),
// covers 00006 - 00008
(StorageKey(vec![0, 0, 0, 0, 6]), StorageKey(vec![0, 0, 0, 0, 8])),
// covers 00009 - 0000b
(StorageKey(vec![0, 0, 0, 0, 9]), StorageKey(vec![0, 0, 0, 0, 11])),
// covers 0000c - 0000f
(StorageKey(vec![0, 0, 0, 0, 12]), StorageKey(vec![0, 0, 0, 0, 15])),
]
);
Please let me know if anything in unclear.
Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions. |
Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions. |
Thanks for the comments @liamaharon, I'm working on the PR with the improvements. |
This PR implements try-runtime-cli: parallel key scraping process. Fixes #14085
The implementation divides the prefix into
NUM_PARTS
parts and creates a separaterpc_get_keys_paged_part
task for each part, which is executed concurrently usingjoin_all
. Each task fetches keys from its own range of the prefix and returns them to the main task, which aggregates the keys from all the parts and filters out default child storage.We should select the value of
N
(the number of parts) carefully to optimize the balance between parallelism and efficiency. If we choose a value ofN
that is too large, we may end up creating too many parallel tasks, which could result in additional overhead and slow down the overall process. On the other hand, if we choose a value ofN
that is too small, we may not be able to fully utilize the available resources, resulting in slower overall performance.