During stress tests, all clients can end up with no connection to any validator, which causes the clients to halt.
Steps to Reproduce
Start a network with 16 validators and 2 clients per validator. All nodes have all other nodes as peers.
Over 4 test runs, clients reliably halt within 4 hours.
Note that clients only have 21 peer slots, so in the test above those slots can fill up with the 32 clients instead of the 16 validators.
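To make the slot arithmetic concrete, here is a minimal sketch in Rust. The function name and parameters are hypothetical and are not taken from heartbeat.rs; it only checks whether there are enough other clients in the network to occupy every peer slot of a single client.

```rust
/// Hypothetical helper, not part of the codebase.
/// Returns true if a client's peer slots can be filled entirely by other
/// clients, leaving no slot for a validator.
pub fn slots_can_exclude_validators(
    validators: usize,
    clients_per_validator: usize,
    peer_slots: usize,
) -> bool {
    // Every client except the one we are looking at is a possible peer.
    let other_clients = (validators * clients_per_validator).saturating_sub(1);
    other_clients >= peer_slots
}
```

With the numbers from this report (16 validators, 2 clients per validator, 21 slots) there are 31 other clients, which is more than 21, so the function returns true. With only 1 client per validator there would be 15 other clients and at least 6 slots would remain for validators.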
heartbeat.rs should be fixed.
Expected Behavior
The affected logic is all in files called heartbeat.rs
Option 1
Clients should stay connected to nodes in their trusted peers list which advertise themselves as validators. In more detail:
Clients should not disconnect from nodes in their trusted peers list which are also validators. Given the current "core client" model, in production this should only be a single validator. However, other nodes can advertise themselves as validators, so there are edge cases to take into consideration.
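A rough sketch of what Option 1 could look like, assuming the heartbeat picks peers to disconnect from a list of connected peers. `PeerInfo` and `disconnect_candidates` are hypothetical names for illustration and do not reflect the actual types in heartbeat.rs.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Hypothetical view of a connected peer; not the project's actual type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PeerInfo {
    pub addr: SocketAddr,
    /// Whether the peer advertises itself as a validator.
    pub is_validator: bool,
}

/// Returns the peers a client may disconnect from during rotation.
/// A peer that is in the trusted list and advertises itself as a validator
/// is never returned, so it keeps its slot.
pub fn disconnect_candidates<'a>(
    connected: &'a [PeerInfo],
    trusted: &HashSet<SocketAddr>,
) -> Vec<&'a PeerInfo> {
    connected
        .iter()
        .filter(|peer| !(peer.is_validator && trusted.contains(&peer.addr)))
        .collect()
}
```

Requiring both conditions covers the edge case mentioned above: a node that merely advertises itself as a validator but is not in the trusted peers list remains eligible for disconnection.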
Option 2
Another option is for peer rotation to continue when the clients are not synced.
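Option 2 could be sketched as follows. This assumes that rotation is currently skipped while a client is not synced and that it is triggered when all peer slots are in use; both are assumptions for illustration, and `should_rotate` and `rotate_while_unsynced` are hypothetical names.

```rust
/// Hypothetical decision function, not taken from heartbeat.rs.
/// Decides whether the heartbeat should rotate peers on this tick.
pub fn should_rotate(
    is_synced: bool,
    connected_peers: usize,
    max_peers: usize,
    rotate_while_unsynced: bool,
) -> bool {
    // Assumption: rotation only matters once every slot is occupied.
    let slots_full = connected_peers >= max_peers;
    // With `rotate_while_unsynced` set, an unsynced client keeps rotating,
    // which frees slots that a validator could then take.
    slots_full && (is_synced || rotate_while_unsynced)
}
```

With `rotate_while_unsynced` set to false, a client whose 21 slots are all held by other clients never rotates once it falls out of sync, which matches the halt described in this report. Setting it to true lets the client keep freeing slots.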