-
Notifications
You must be signed in to change notification settings - Fork 795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved removal of dead network channels #3993
Conversation
ef895f0
to
d225eb2
Compare
d225eb2
to
618052a
Compare
nano/node/network.cpp
Outdated
@@ -769,7 +769,7 @@ void nano::network::ongoing_cleanup () | |||
{ | |||
cleanup (std::chrono::steady_clock::now () - node.network_params.network.cleanup_cutoff ()); | |||
std::weak_ptr<nano::node> node_w (node.shared ()); | |||
node.workers.add_timed_task (std::chrono::steady_clock::now () + node.network_params.network.cleanup_period, [node_w] () { | |||
node.workers.add_timed_task (std::chrono::steady_clock::now () + std::chrono::seconds (1), [node_w] () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is happening to cleanup_period, are we dropping its use?
Isn't checking all sockets every second a bit excessive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cleanup period was set to 60 seconds by default and is still used in multiple different contexts, with interdependencies not being very clear. Here we just iterate all sockets (~1000 max by default) every second to see if any of them are dead. In the grand scheme of things this iteration is a small blip on the profiler chart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But 1 second might be a bit excessive, adjusting it to 5 seconds should not negatively impact the fixes in any way.
Use lower value for dev network to speed up timeout unit tests
This PR improves eviction of channels when their underlying socket becomes dead (due to graceful shutdown or network error). Before we relied only on timeout to purge those, which was suboptimal, because even when socket was known to be dead we still waited ~5 minutes. That prevented us from quickly reestablishing connections to peers when problems occurred.