Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bundle Account Locker and other refactors #54

Merged
merged 14 commits into from
Jun 28, 2022
Merged

Bundle Account Locker and other refactors #54

merged 14 commits into from
Jun 28, 2022

Conversation

buffalu
Copy link
Contributor

@buffalu buffalu commented Jun 26, 2022

Problem

It's possible that BankingStage can execute transactions that interfere with transactions in a bundle that have already executed or not already executed. This would cause the validator to produce an invalid block or different blockhash than the rest of the network after replaying due to invalid account cache state.

Furthermore, it's possible that BundleStage may get blocked on AccountInUse from BankingStage when executing a bundle. In order to minimize this, we will pre-lock those accounts.

Summary of Changes

  • Add BundleAccountLocker and integrate into BundleStage.
  • Integrate BundleAccountLocker into the TransactionBatch returned by Bank::prepare_sanitized_batch_with_results
  • Massive cleanup on BundleStage to execute tip program txs as bundles.
  • Add unlock fix for BundleNotContinuous error

TODO

  • Add BundleAccountLocker tests

@buffalu buffalu changed the title Bundle Account Locker (DNR) Bundle Account Locker and other refactors (DNR) Jun 26, 2022
@buffalu buffalu changed the title Bundle Account Locker and other refactors (DNR) Bundle Account Locker and other refactors Jun 27, 2022
@@ -417,7 +418,8 @@ impl BankingStage {
gossip_vote_sender: ReplayVoteSender,
cost_model: Arc<RwLock<CostModel>>,
connection_cache: Arc<ConnectionCache>,
tip_manager: Arc<Mutex<TipManager>>,
tip_manager: TipManager,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why rm the Arc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use the lock() method to lock

@@ -1113,7 +1123,8 @@ impl BankingStage {
data_budget: &DataBudget,
cost_model: Arc<RwLock<CostModel>>,
connection_cache: Arc<ConnectionCache>,
tip_manager: Arc<Mutex<TipManager>>,
tip_manager: TipManager,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about just injecting with tip_program pubkey here? none of the other methods are used on this object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, will fix that

pub batch: PacketBatch,
pub uuid: Uuid,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to have this be a deterministic id? Something like hash of all tx signatures?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't matter too much i think, mainly just a way for tracking. imagine we can include this in metrics and cross-reference to our system at some point

num_bundle_batches_prelock: u64,
unlocked_bundles: VecDeque<PacketBundle>,
locked_bundles: VecDeque<LockedBundle>,
read_locks: HashMap<Pubkey, u64>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdyt about defining these as HashMap<Rc<Pubkey>, u64> since these are copies of LockedBundle locks

Copy link
Contributor

@whistlinwilly whistlinwilly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for group review today - love the conversation very helpful.

Long term comments:

  • Should we drain the locked bundle queue when we stop being leader?
  • Should we keep a counter for bundle execution errors and retry somehow?

}

pub struct BundleAccountLocker {
num_bundle_batches_prelock: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leaving the comment here instead of throughout - this is a magic number to the new function in many cases - lets make a constant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added constant in tpu.rs that gets passed in

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants