Continuous backlog population #3999

pwojcikdev · 2022-11-15T21:51:39Z

Backlog population is the process in which a node scans all accounts in the ledger, with or without confirmed blocks, and forwards (activates) those accounts that do not have all of their blocks confirmed to the election scheduler for prioritization and eventual queuing in the proper bucket. This has to be done periodically because the space in each bucket is limited (currently ~2000 entries) and the number of accounts needing confirmation can be much higher than that, especially during bootstrap or a network spam attack.
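As a rough illustration of the activation step described above, here is a minimal C++ sketch. The types `account_state` and `ledger_view` and the functions `needs_activation` and `populate_backlog` are hypothetical stand-ins invented for this example, not the nano-node API; the `activate` callback merely plays the role of the election scheduler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for per-account ledger data (not a nano-node type).
struct account_state
{
	std::uint64_t block_count; // total blocks in the account chain
	std::uint64_t confirmed_height; // how many of those blocks are confirmed
};

// Hypothetical ledger view: account identifier -> state.
using ledger_view = std::map<std::string, account_state>;

// An account needs activation when at least one of its blocks is unconfirmed.
inline bool needs_activation (account_state const & state)
{
	assert (state.confirmed_height <= state.block_count);
	return state.confirmed_height < state.block_count;
}

// Scans every account and forwards the ones with unconfirmed blocks to
// `activate`, which stands in for the election scheduler.
// Returns the number of accounts forwarded.
inline std::size_t populate_backlog (ledger_view const & ledger, std::function<void (std::string const &)> const & activate)
{
	std::size_t activated = 0;
	for (auto const & entry : ledger)
	{
		if (needs_activation (entry.second))
		{
			activate (entry.first);
			++activated;
		}
	}
	return activated;
}
```

Calling `populate_backlog` with a callback that records account names is enough to see which accounts would be handed to the scheduler; fully confirmed and empty accounts are skipped.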

The problem with the current implementation is that this process runs every 5 minutes and scans the whole ledger at once, leading to situations where we run out of accounts to prioritize before the next run has started. This is especially visible during bootstrapping; a graph showing such a situation is included below. We can clearly see the bumps in AEC occupancy where the prioritization queue is filled, followed by periods of idleness when the priority queue is emptied:

This PR fixes that by changing how the ledger scan is done. Instead of running at a 5 minute interval, the scan runs all the time (unless disabled by setting frontiers_confirmation = disabled in the node config), but the scan rate is throttled to limit consumption of node resources. The rate and frequency are controlled by two new node-config.toml settings: backlog_scan_batch_size and backlog_scan_frequency. By default it scans 10000 accounts per second divided into 10 batches, so 1000 accounts per batch. This is rather conservative and should be adjusted later with feedback from beta node operators (before this PR we did it in batches of 64k).
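The default numbers above can be worked through in a short C++ sketch. Everything here is an assumption made for illustration: the struct `backlog_throttle`, its field names, and the helper functions are hypothetical, and the sketch does not claim which of the two settings holds which value. The wrap-around in `advance` is likewise an assumed reading of "run the scan all the time".

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical throttle parameters; defaults follow the PR description
// (10000 accounts per second, split into 10 batches per second).
struct backlog_throttle
{
	std::size_t accounts_per_second{ 10000 };
	std::size_t batches_per_second{ 10 };
};

// Accounts handled in one batch: 10000 / 10 = 1000 with the defaults.
inline std::size_t accounts_per_batch (backlog_throttle const & throttle)
{
	assert (throttle.batches_per_second > 0);
	return throttle.accounts_per_second / throttle.batches_per_second;
}

// Pause between the starts of two consecutive batches: 1000 ms / 10 = 100 ms with the defaults.
inline std::chrono::milliseconds batch_interval (backlog_throttle const & throttle)
{
	assert (throttle.batches_per_second > 0);
	return std::chrono::milliseconds{ 1000 / throttle.batches_per_second };
}

// Moves the scan position forward by one batch over `total` accounts and
// restarts from the beginning once the end is reached (assumed behaviour).
inline std::size_t advance (std::size_t position, std::size_t total, std::size_t batch)
{
	if (total == 0)
	{
		return 0;
	}
	std::size_t const next = position + batch;
	return next >= total ? 0 : next;
}
```

With these defaults a ledger of one million accounts would take about 100 seconds per full pass, which is the trade-off the description calls conservative: lower resource use in exchange for a slower sweep.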

The result of this PR is an AEC that stays full almost all the time (except during the initial phase of the bootstrap):

dsiganos · 2022-11-16T17:04:44Z

It looks like the unit test 'request_aggregator.cannot_vote' failed 4 times.

pwojcikdev · 2022-11-16T17:19:42Z

@dsiganos I see, there seem to be two tests that are still failing, I'm looking into that. It appears to break only on GH runners, so it's a bit annoying to debug.

dsiganos · 2022-11-16T17:32:55Z

It fails when the system is under heavy load. Starting a parallel build from scratch seems to make it crash.
I am currently fixing the qwahzi unit test and then I will come back to this.

I got this crash on my laptop:

[ RUN      ] request_aggregator.cannot_vote
/Users/ds/CLionProjects/nano-node/nano/core_test/request_aggregator.cpp:501: Failure
Value of: system.poll ()
  Actual: Deadline expired
Expected:
zsh: segmentation fault  ./core_test --gtest_repeat=1000 --gtest_filter=request_aggregator.cannot_vote

nano/node/backlog_population.hpp

nano/node/backlog_population.cpp

nano/core_test/backlog.cpp

nano/node/nodeconfig.cpp

dsiganos · 2022-11-18T01:02:01Z

I left a number of minor comments but it looks good to me overall.

nano/core_test/active_transactions.cpp

…xpects that it will not receive a vote for send1 because it has not made such a request, however, if the election is still or recently active, it may receive a broadcast vote before it makes a request. Check that the election has ended on node1 and allow some time for in-flight votes broadcasts to finish before starting node2.

dsiganos reviewed Nov 18, 2022

clemahieu force-pushed the continuous-backlog-3 branch from 64218b6 to ce52a8f Compare November 30, 2022 11:02

dsiganos reviewed Dec 2, 2022

nano/core_test/active_transactions.cpp (outdated)

dsiganos mentioned this pull request Jan 10, 2023

Backlog population is done by a linear scan separated by a 5 minute w… #3648

Closed

qwahzi mentioned this pull request Jan 10, 2023

Reduce Bandwidth Utilization by Reducing Fanout #993

Closed

pwojcikdev added 7 commits January 16, 2023 10:42

Continuous backlog population

b060837

Add backlog population stats

133f803

Invert dependencies

c907837

Add backlog population test

9252318

Rework backlog config

b9164bd

Add node config options

83b4b1e

Fix compilation

8f33766

clemahieu force-pushed the continuous-backlog-3 branch from 98227de to 3e60600 Compare January 16, 2023 11:16

clemahieu force-pushed the continuous-backlog-3 branch from 3e60600 to e848e09 Compare January 16, 2023 11:25

clemahieu added 2 commits January 16, 2023 11:42

Adding review edits.

2584098

Renaming config option per review comments.

55c6d1b

clemahieu merged commit 02bffc2 into nanocurrency:develop Jan 16, 2023

thsfs added the enhancement, documentation, and non-functional change labels Mar 1, 2023

qwahzi mentioned this pull request May 24, 2023

Update understanding-the-code.md w/ continuous backlog population info nanocurrency/nano-docs#678

Merged
