Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved removal of dead network channels #3993

Merged
merged 6 commits into from
Nov 10, 2022

Conversation

pwojcikdev
Copy link
Contributor

This PR improves eviction of channels when their underlying socket becomes dead (due to graceful shutdown or network error). Before we relied only on timeout to purge those, which was suboptimal, because even when socket was known to be dead we still waited ~5 minutes. That prevented us from quickly reestablishing connections to peers when problems occurred.

@@ -769,7 +769,7 @@ void nano::network::ongoing_cleanup ()
{
cleanup (std::chrono::steady_clock::now () - node.network_params.network.cleanup_cutoff ());
std::weak_ptr<nano::node> node_w (node.shared ());
node.workers.add_timed_task (std::chrono::steady_clock::now () + node.network_params.network.cleanup_period, [node_w] () {
node.workers.add_timed_task (std::chrono::steady_clock::now () + std::chrono::seconds (1), [node_w] () {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is happening to cleanup_period, are we dropping its use?
Isn't checking all sockets every second a bit excessive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup period was set to 60 seconds by default and is still used in multiple different contexts, with interdependencies not being very clear. Here we just iterate all sockets (~1000 max by default) every second to see if any of them are dead. In the grand scheme of things this iteration is a small blip on the profiler chart.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But 1 second might be a bit excessive, adjusting it to 5 seconds should not negatively impact the fixes in any way.

dsiganos
dsiganos previously approved these changes Nov 10, 2022
Use lower value for dev network to speed up timeout unit tests
@pwojcikdev pwojcikdev merged commit eb8c1aa into nanocurrency:develop Nov 10, 2022
@qwahzi qwahzi added this to the V24.0 milestone Dec 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants