[Bug] Clients may disconnect from trusted validators #3479

Open
vicsn opened this issue Feb 10, 2025 · 0 comments · May be fixed by #3504
Labels
bug Incorrect or unexpected behavior

🐛 Bug Report

During stress tests, clients could end up with no connections to any validator, which caused them to halt.

Steps to Reproduce

Start a network with 16 validators and 2 clients per validator. Every node lists all other nodes as peers.

Across 4 test runs, clients reliably halted within 4 hours.

Note that clients only have 21 peer slots, so in the above test the 32 clients can fill up those slots instead of the 16 validators.
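
The slot arithmetic can be checked with a short sketch. The names `Candidates`, `client_candidates` and `clients_alone_fill_slots` are made up for illustration and are not taken from the codebase; the numbers are the ones from the test setup above.

```rust
/// Peers that a single client could connect to in the test network.
/// Hypothetical helper type, not part of the real code.
struct Candidates {
    validators: usize,
    clients: usize,
}

/// Counts the candidate peers of one client, assuming every node
/// knows about every other node (as in the test setup).
fn client_candidates(num_validators: usize, clients_per_validator: usize) -> Candidates {
    let total_clients = num_validators * clients_per_validator;
    Candidates {
        validators: num_validators,
        // A client does not peer with itself.
        clients: total_clients.saturating_sub(1),
    }
}

/// True if other clients alone are enough to occupy every peer slot,
/// leaving no slot for a validator.
fn clients_alone_fill_slots(candidates: &Candidates, peer_slots: usize) -> bool {
    candidates.clients >= peer_slots
}
```

With 16 validators and 2 clients per validator, each client sees 31 other clients, which is more than its 21 peer slots, so nothing in the numbers alone guarantees that a validator keeps a slot.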

heartbeat.rs should be fixed.

Expected Behavior

The affected logic all lives in files named heartbeat.rs.

Option 1

Clients should stay connected to nodes in their trusted peers list that advertise themselves as validators. In more detail:
Clients should not disconnect from nodes in their trusted peers list that are also validators. Given the current "core client" model, in production this should be only a single validator. However, other nodes can advertise themselves as validators, so there are edge cases to take into consideration.
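
A minimal sketch of what Option 1 could look like, assuming the heartbeat picks peers to drop from a list of connected peers. `NodeType`, `ConnectedPeer` and `removable_peers` are hypothetical names for illustration and do not reflect the actual heartbeat.rs code.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Hypothetical node type as advertised by the peer itself.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum NodeType {
    Client,
    Validator,
}

/// Hypothetical view of a connected peer.
#[derive(Clone, Debug)]
struct ConnectedPeer {
    addr: SocketAddr,
    node_type: NodeType,
}

/// Returns the peers the heartbeat may disconnect from.
/// A peer is protected only if it is BOTH in the trusted list AND
/// advertises itself as a validator; an untrusted node that merely
/// claims to be a validator stays removable.
fn removable_peers<'a>(
    connected: &'a [ConnectedPeer],
    trusted: &HashSet<SocketAddr>,
) -> Vec<&'a ConnectedPeer> {
    connected
        .iter()
        .filter(|p| !(p.node_type == NodeType::Validator && trusted.contains(&p.addr)))
        .collect()
}
```

Requiring both conditions is one way to handle the edge case mentioned above: self-advertised validators outside the trusted list get no special protection.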

Option 2

Another option is for peer rotation to continue when the clients are not synced.
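
A rough sketch of Option 2, under the assumption that peer rotation is decided by a single check that takes the sync state into account. `RotationPolicy` and `should_rotate` are hypothetical names, and the `slots_full` condition is an illustrative assumption rather than a description of the existing logic.

```rust
/// Hypothetical policy switch for when peer rotation is allowed.
#[derive(Clone, Copy, Debug)]
enum RotationPolicy {
    /// Rotate peers only while the client is synced.
    OnlyWhenSynced,
    /// Keep rotating peers even while the client is not synced (Option 2).
    Always,
}

/// Decides whether the heartbeat should rotate out a peer this round.
/// Assumption: rotation only matters once all peer slots are taken.
fn should_rotate(policy: RotationPolicy, is_synced: bool, slots_full: bool) -> bool {
    if !slots_full {
        // Free slots remain, so new peers can connect without dropping anyone.
        return false;
    }
    match policy {
        RotationPolicy::OnlyWhenSynced => is_synced,
        RotationPolicy::Always => true,
    }
}
```

Under the `Always` policy, an unsynced client whose slots are full still rotates peers, which gives it a chance to reach a validator again.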

@vicsn vicsn added the bug Incorrect or unexpected behavior label Feb 10, 2025