-
Notifications
You must be signed in to change notification settings - Fork 4.7k
parallelizes gossip packets receiver with processing of requests #17647
Conversation
5065a58
to
9d49350
Compare
Codecov Report
@@ Coverage Diff @@
## master #17647 +/- ##
=========================================
- Coverage 82.8% 82.7% -0.1%
=========================================
Files 431 431
Lines 120590 120629 +39
=========================================
- Hits 99866 99862 -4
- Misses 20724 20767 +43 |
Gossip packet processing is composed of two stages: * The first is consuming packets from the socket, deserializing, sanitizing and verifying them: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2510-L2521 * The second is actually processing the requests/messages: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2585-L2605 The former does not acquire any locks and so can be parallelized with the later, allowing better pipelineing properties and smaller latency in responding to gossip requests or propagating messages.
9d49350
to
d53eaa1
Compare
gossip/src/cluster_info.rs
Outdated
@@ -2659,10 +2683,35 @@ impl ClusterInfo { | |||
Ok(()) | |||
} | |||
|
|||
pub fn listen( | |||
pub(crate) fn consume( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: functions in the code base that start threads generally start with start_*_thread
, how about start_socket_consume_thread()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
gossip/src/cluster_info.rs
Outdated
// Consumes packets received from the socket, deserializing, sanitizing and | ||
// verifying them and then sending them down the channel for the actual | ||
// handling of requests/messages. | ||
fn run_consume( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rename to run_socket_consume
to be more specific, as "consume" is pretty generic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
} | ||
}; | ||
let thread_name = String::from("gossip-consume"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "gossip-consume" -> "gossip-socket-consume"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thread names in linux are limited to 15 characters, and everything after the 15th character is trimmed when inspecting running processes from shell, which will make it confusing if I use a longer name here.
gossip/src/gossip_service.rs
Outdated
@@ -56,11 +56,15 @@ impl GossipService { | |||
1, | |||
); | |||
let (response_sender, response_receiver) = channel(); | |||
let t_responder = streamer::responder("gossip", gossip_socket, response_receiver); | |||
let (consume_sender, listen_receiver) = channel(); | |||
let t_consume = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit t_consume -> t_socket_consume
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
match self.run_consume(&receiver, &sender, &thread_pool) { | ||
Err(GossipError::RecvTimeoutError(RecvTimeoutError::Disconnected)) => break, | ||
Err(GossipError::RecvTimeoutError(RecvTimeoutError::Timeout)) => (), | ||
Err(err) => error!("gossip consume: {}", err), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, the only other potential error other than the Timeout is Ok(sender.send(packets)?)
, https://github.com/solana-labs/solana/pull/17647/files#diff-b07406ad1c913d21d39a4163cb34f48c55cef6d472c814419a732f3937ffceddR2631. Hopefully this one doesn't have a chance of being too spammy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I guess should be fine.
) Gossip packet processing is composed of two stages: * The first is consuming packets from the socket, deserializing, sanitizing and verifying them: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2510-L2521 * The second is actually processing the requests/messages: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2585-L2605 The former does not acquire any locks and so can be parallelized with the later, allowing better pipelineing properties and smaller latency in responding to gossip requests or propagating messages. (cherry picked from commit cab30e2)
) (#17807) Gossip packet processing is composed of two stages: * The first is consuming packets from the socket, deserializing, sanitizing and verifying them: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2510-L2521 * The second is actually processing the requests/messages: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2585-L2605 The former does not acquire any locks and so can be parallelized with the later, allowing better pipelineing properties and smaller latency in responding to gossip requests or propagating messages. (cherry picked from commit cab30e2) Co-authored-by: behzad nouri <[email protected]>
) Gossip packet processing is composed of two stages: * The first is consuming packets from the socket, deserializing, sanitizing and verifying them: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2510-L2521 * The second is actually processing the requests/messages: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2585-L2605 The former does not acquire any locks and so can be parallelized with the later, allowing better pipelineing properties and smaller latency in responding to gossip requests or propagating messages. (cherry picked from commit cab30e2) # Conflicts: # core/src/cluster_info.rs
…kport #17647) (#19474) * parallelizes gossip packets receiver with processing of requests (#17647) Gossip packet processing is composed of two stages: * The first is consuming packets from the socket, deserializing, sanitizing and verifying them: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2510-L2521 * The second is actually processing the requests/messages: https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2585-L2605 The former does not acquire any locks and so can be parallelized with the later, allowing better pipelineing properties and smaller latency in responding to gossip requests or propagating messages. (cherry picked from commit cab30e2) # Conflicts: # core/src/cluster_info.rs * removes backport merge conflicts Co-authored-by: behzad nouri <[email protected]>
Problem
Gossip packet processing is composed of two stages:
sanitizing and verifying them:
https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2510-L2521
https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2585-L2605
The former does not acquire any locks and so can be parallelized with
the later, allowing better pipelineing properties and smaller latency in
responding to gossip requests or propagating messages.
Summary of Changes
Added a new stage to gossip consuming packets received from the socket,
deserializing, sanitizing and verifying them and then sending them down the
channel for the actual handling of requests/messages.