
Fix #1680 #1682

Merged: 22 commits merged into main on May 2, 2022

Conversation

asonnino (Contributor):

More info: #1680

@asonnino asonnino self-assigned this Apr 30, 2022
@asonnino asonnino added labels Priority: Medium, Status: Needs Review, sui-node, Type: Bug on Apr 30, 2022
@asonnino asonnino added this to the Pre Testnet milestone Apr 30, 2022
@asonnino asonnino linked an issue Apr 30, 2022 that may be closed by this pull request
@asonnino asonnino requested a review from LefKok April 30, 2022 00:06
@asonnino asonnino enabled auto-merge (squash) April 30, 2022 00:07
gdanezis (Collaborator) left a comment:

Good stuff. Document the magic constant; it seems very low. Second, there seem to be complex separate tasks to maintain the hashmap: is all of this necessary as separate components?

rx_sui_to_consensus,
rx_consensus_to_sui,
/* max_pending_transactions */ 10_000,
Collaborator:

What is the implication of this? Make a const and document what it does?

asonnino (Author):

adding a comment

/// Receive consensus outputs.
rx_consensus_output: Receiver<ConsensusOutput>,
/// The maximum number of pending replies.
max_pending_transactions: usize,
/// Keep a map of all consensus inputs that are currently being sequenced.
pending: HashMap<ConsensusTransactionDigest, Vec<Replier>>,
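The reviewer's alternative of a named constant (rather than a documented constructor argument, which is what the PR ended up doing) might look like the following sketch; the name `MAX_PENDING_TRANSACTIONS` is hypothetical and not taken from the PR:

```rust
// Hypothetical named constant for the cap discussed above; the PR itself
// passes the value as a constructor argument and documents it on the field.
/// Maximum number of in-flight transactions whose repliers are tracked.
/// Transactions beyond this cap are still sequenced by consensus, but the
/// submitter is not notified when they complete.
pub const MAX_PENDING_TRANSACTIONS: usize = 10_000;
```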
Collaborator:

Should we keep the full certificate instead of just the digest? We might want to re-submit or share it at some point. Should this be persisted to disk, or is it OK if it goes away upon a crash?

asonnino (Author):

It will be the job of the client (and gateway) to handle resubmission in case of timeouts or crashes. It is OK if this goes away upon a crash; the client will simply have to re-submit the certificate.

}
}
}

/// Main loop receiving messages input to consensus and notifying the caller once the inputs
/// are sequenced (or if an error happened).
async fn run(&mut self) {
Collaborator:

Key question: why is all this happening through message passing and separate tasks, instead of the task that interacts with consensus keeping the hashmap and updating it when transactions go in and out? Is there more than one thread injecting and consuming messages to/from consensus?

asonnino (Author):

It is not clear to me how best to embed this in the current Sui logic. There is one task (ConsensusListener) that receives consensus outputs. The ConsensusAdapter is not a task; it simply provides helpers to interact with consensus that the AuthorityServer uses.

We have many tasks injecting transactions into consensus (the same tasks running the AuthorityServer, one per client). We do, however, have a single task handling the output of consensus (ConsensusListener).
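The arrangement described here, many submitting tasks feeding one listener that exclusively owns the `pending` map, can be sketched with a plain multi-producer, single-consumer channel. All names and types below are illustrative stand-ins, and std threads are used in place of the async tasks in the real code:

```rust
use std::collections::HashMap;
use std::sync::mpsc::{Receiver, Sender};

type Digest = u64; // stand-in for ConsensusTransactionDigest

enum ListenerMessage {
    New(Digest, Sender<()>), // a submitter registers a replier for this digest
    Output(Digest),          // consensus sequenced a transaction with this digest
}

fn listener(rx: Receiver<ListenerMessage>) {
    // The map is owned by this single task, so no locking is needed:
    // all updates arrive serialized over the channel.
    let mut pending: HashMap<Digest, Vec<Sender<()>>> = HashMap::new();
    for msg in rx {
        match msg {
            ListenerMessage::New(digest, replier) => {
                pending.entry(digest).or_default().push(replier);
            }
            ListenerMessage::Output(digest) => {
                // Notify every submitter waiting on this digest; ignore
                // repliers whose receiving end has gone away.
                for replier in pending.remove(&digest).unwrap_or_default() {
                    let _ = replier.send(());
                }
            }
        }
    }
}
```

The single-consumer design is what makes the lock-free `HashMap` ownership safe: the many producers only ever touch the channel's sending half.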

oxade (Contributor) commented May 2, 2022:

@asonnino
I thought we generally agreed to favor BTreeMaps over HashMaps in Sui, although it may not matter much here

asonnino (Author) commented May 2, 2022:

> @asonnino
> I thought we generally agreed to favor BTreeMaps over HashMaps in Sui, although it may not matter much here

It doesn't matter here; we don't need ordering of any kind.
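The trade-off behind this exchange is iteration order: `BTreeMap` iterates keys in sorted order, while `HashMap` guarantees no order at all but serves the pure point-lookups-by-digest that the `pending` map needs. A small illustration with stand-in types:

```rust
use std::collections::{BTreeMap, HashMap};

// BTreeMap: iteration follows key order, regardless of insertion order.
fn btree_key_order() -> Vec<u32> {
    let ordered: BTreeMap<u32, &str> = BTreeMap::from([(3, "c"), (1, "a"), (2, "b")]);
    ordered.keys().copied().collect()
}

// HashMap: no iteration-order guarantee, but point lookups by key are all
// the `pending` map needs.
fn hash_lookup(key: u32) -> Option<&'static str> {
    let unordered: HashMap<u32, &'static str> = HashMap::from([(3, "c"), (1, "a")]);
    unordered.get(&key).copied()
}
```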

@@ -838,9 +838,7 @@ impl<const ALL_OBJ_VER: bool, S: Eq + Serialize + for<'de> Deserialize<'de>>
// sequence number (`OBJECT_START_VERSION`). Otherwise use the `scheduled` map
// to assign the next sequence number.
let version = v.unwrap_or_else(|| OBJECT_START_VERSION);
let next_version = v
.map(|v| v.increment())
.unwrap_or_else(|| SequenceNumber::from(2));
Contributor:

Could you `unwrap_or_else` a const at `OBJECT_START_VERSION + 1`?

asonnino (Author):

We could, but this is even simpler: `let next_version = version.increment();`
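The two shapes are equivalent because the hard-coded fallback `SequenceNumber::from(2)` is exactly `OBJECT_START_VERSION` incremented. A sketch with stand-in types (the real `SequenceNumber` and `OBJECT_START_VERSION` live in Sui's types crate; the value 1 for the start version is inferred from the fallback of 2 in the diff):

```rust
// Stand-ins for Sui's `SequenceNumber` and `OBJECT_START_VERSION`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SequenceNumber(u64);

impl SequenceNumber {
    fn increment(self) -> SequenceNumber {
        SequenceNumber(self.0 + 1)
    }
}

const OBJECT_START_VERSION: SequenceNumber = SequenceNumber(1);

// Original shape: two separate unwraps over the same Option.
fn versions_original(v: Option<SequenceNumber>) -> (SequenceNumber, SequenceNumber) {
    let version = v.unwrap_or(OBJECT_START_VERSION);
    let next_version = v.map(|v| v.increment()).unwrap_or(SequenceNumber(2));
    (version, next_version)
}

// Suggested simplification: derive `next_version` from `version` directly.
fn versions_simplified(v: Option<SequenceNumber>) -> (SequenceNumber, SequenceNumber) {
    let version = v.unwrap_or(OBJECT_START_VERSION);
    (version, version.increment())
}
```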

Comment on lines 133 to 139
Ok(reply) => reply.expect("Channel with consensus listener dropped"),
Err(e) => {
let message = ConsensusListenerMessage::Cleanup(serialized);
self.tx_consensus_listener
.send(message)
.await
.expect("Channel with consensus listener dropped");
Contributor:

  1. Nit: we're reusing the same message in two places. Local var?
  2. Could we also use warn! logging under a suitable span?

asonnino (Author):

It is not the same message: one is `New`, the other is `Cleanup`.

Contributor:

Sorry, I meant error message: "Channel with consensus listener dropped"

asonnino (Author):

Ah, OK. I made the error messages clearer (they could be made more specific).

ConsensusListenerMessage::Cleanup(transaction) => {
let digest = Self::hash(&transaction);
let _ = self.pending.get_mut(&digest).and_then(|x| x.pop());
if self.pending.get(&digest).map_or_else(|| false, |x| x.is_empty()) {
Contributor:

I can't wait for inspect to stabilize.

Contributor:

Is this really necessary, or could we just periodically `self.pending.retain(|_, x| !x.is_empty())`?

asonnino (Author):

We could, but it's extra LOC.
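For reference, the periodic-sweep alternative is a one-liner over the whole map. A sketch with illustrative key and value types (note that `HashMap::retain` takes a two-argument closure over key and value):

```rust
use std::collections::HashMap;

// Drop every digest whose replier list has been emptied, in one sweep.
fn sweep_pending(pending: &mut HashMap<u64, Vec<u32>>) {
    pending.retain(|_digest, repliers| !repliers.is_empty());
}
```

The eager per-message cleanup chosen in the PR keeps the map tight at all times, whereas a periodic sweep trades a little transient memory for simpler hot-path code.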

@asonnino asonnino merged commit f9e4b5e into main May 2, 2022
@asonnino asonnino deleted the fix-1680 branch May 2, 2022 15:51
/// than this cap, the transactions will be handled by consensus as usual but this module won't
/// be keeping track of when they are sequenced. Its only purpose is to ensure the field called
/// `pending` has a maximum size.
max_pending_transactions: usize,
Collaborator:

So if this cap is exceeded, there can be clients waiting for an ack forever? Shouldn't we be rejecting instead?

asonnino (Author):

The channel drops, so the client will know automatically. Although you're right that this could be more explicit.

asonnino (Author):

Let me make a note of this.
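The behavior described in this reply, where a client over the cap learns of the rejection because its reply channel is dropped rather than answered, can be illustrated with a std channel; the real code uses async channels, but the drop semantics are analogous:

```rust
use std::sync::mpsc::{channel, RecvError};

// Simulate the listener being at capacity: instead of storing the reply
// sender in `pending`, it simply drops it.
fn reject_at_capacity() -> Result<(), RecvError> {
    let (reply_tx, reply_rx) = channel::<()>();
    drop(reply_tx); // listener refuses to track this submission
    // The client does not hang forever: recv() fails as soon as the
    // sending half is gone.
    reply_rx.recv()
}
```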

longbowlu pushed a commit referencing this pull request on May 12, 2022: Fix 1680
punwai pushed a commit referencing this pull request on Jul 27, 2022: Fix 1680
Labels: Priority: Medium, Status: Needs Review, sui-node, Type: Bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make ConsensusListener constant memory
5 participants