Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(proto, conductor, cd)!: changes for Forma migration #1843

Closed
wants to merge 35 commits into from

Conversation

ethanoroshiba
Copy link
Contributor

@ethanoroshiba ethanoroshiba commented Dec 3, 2024

Summary

Makes the changes necessary in the monorepo to migrate Forma to Mainnet.

Background

Migrating Forma to mainnet necessitates introducing stop and start logic into the Conductor. This makes the proto and Conductor changes necessary for that. In addition to this, this will allow for future needs to stop and start EVMs on our networks. These changes will also need to be reflected in the GenesisInfo sent by rollups to the conductor.

The shutdown and restart logic for the Conductor should be as follows:

  • Firm Only Mode: Once the stop height is reached for the firm block stream, the firm block at this height is executed (and commitment state updated) before restarting the internal conductor, prompting the rollup for a new GenesisInfo with new start/stop heights (and potentially chain IDs).
  • Soft and Firm Mode: Once the stop height is reached for the soft block stream, no more soft blocks are executed (including the one at the stop height). The firm block is still executed for this height, and then Conductor restarts.
  • Soft Only Mode: Once the stop height is reached for the soft block stream, no more soft blocks are executed (including the one at the stop height). Conductor is then restarted.

The aforementioned rollup changes to be made in astria-geth necessitate changing our current charts config for the following. "Forks" now describes a set of instructions for when and how to stop and restart the EVM at different heights, such that changes (such as network migration) can be made.

config:
        # < non astria-prefixed items... >
        AstriaOverrideGenesisExtraData:
        AstriaRollupName:
        AstriaForks: []
                # - < fork name >:
                #         height: ""
                #         halt: ""
                #         snapshotChecksum: ""
                #         extraDataOverride: ""
                #         feeCollector: ""
                #         EIP1559Params: {}
                #         sequencer:
                #                 chainId: ""
                #                 addressPrefix: ""
                #                 startHeight: ""
                #                 stopHeight: ""
                #         celestia:
                #                 chainId: ""
                #                 startHeight: ""
                #                 heightVariance: ""
                #         bridgeAddresses: []

Changes

  • Changed sequencer_genesis_block_height to sequencer_start_block_height in GenesisInfo (in the execution API).
  • Added sequencer_stop_block_height, rollup_start_block_height, sequencer_chain_id, and celestia_chain_id, and halt_at_stop_height to GenesisInfo.
  • Changed sequencer to rollup height calculation to start from rollup_start_block_height in GenesisInfo, so that Conductor doesn't start from rollup height 1 again after restarting.
  • Removed sequencer and celestia chain ID environment vars from Conductor, and changed chain ID checks to rely on obtaining these from the initialized executor state.
  • Added astriaForks to EVM rollup config, to reflect changes in astria-geth: https://github.com/astriaorg/astria-geth/pull/59/files.
  • Added restart and halt logic for when the stop height is reached, where Conductor will shut down at the stop height if halt_at_stop_height is true, and restart otherwise.

Testing

  • Added restart logic to check_for_restart_ok unit test.
  • Added blackbox tests to ensure conductor correctly restarts in each mode, and continues to execute at the correct height after fetching updated genesis info and commitment state.
  • Added blackbox test in SoftAndFirm mode to ensure the conductor shuts down at the stop height if GenesisInfo.halt_at_stop_height is true.
  • Added smoke test which restarts EVM and Conductor mid-way through the regular smoke test.

Changelogs

Changelogs updated

Breaking Changelist

  • Proto changes are not technically breaking, since fields were only added, but the old genesis info shape will be rejected since it does not contain the chain IDs to validate against the networks the conductor is connecting to (and would fail even if this was the case, due to sequencer_stop_block_height being read from wire format as 0). Thus, rollups will need to provide an updated GenesisInfo to conductor (includingastria-geth).
  • Broke domain type GenesisInfo

Related Issues

closes #1841

@github-actions github-actions bot added conductor pertaining to the astria-conductor crate proto pertaining to the Astria Protobuf spec cd labels Dec 3, 2024
@ethanoroshiba
Copy link
Contributor Author

ethanoroshiba commented Dec 9, 2024

Opening this for review despite failing status checks so that review isn't blocking on Geth changes. Once a Geth image can be pulled, we'll be able to run the smoke tests.
Smoke tests passing now.

@ethanoroshiba ethanoroshiba marked this pull request as ready for review December 9, 2024 21:36
@ethanoroshiba ethanoroshiba requested review from a team as code owners December 9, 2024 21:36
@ethanoroshiba ethanoroshiba requested a review from a team as a code owner January 3, 2025 14:58
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jan 3, 2025
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I thought about this code some more and I still think that regular operation should not be conflated with errors.

At the same time I think we are not being good citizens in that we keep hitting an endpoint (Sequencer) and then just throwing away the data as is currently the case in execute_soft. Instead, Conductor should actively shut down the channels to the sequencer and celestia tasks after having reached the stop-height.

Then you can have a fall-through case where the executor task just shuts down normally if both channels are closed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attempted to do so via dropping in 00b10a4. Let me know if this works!

self.state.sequencer_stop_block_height(),
))
.wrap_err(
"soft height met or exceeded sequencer stop height, attempting restart with new \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Executor does not get to decide this. The decision on whether or not to restart lies with the conductor task one level up.

@@ -457,19 +505,36 @@ impl Executor {
err,
))]
async fn execute_firm(&mut self, block: ReconstructedBlock) -> eyre::Result<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done in this PR. The current restart logic is with respect to an error encountered from an RPC. This new restart logic is due to regular operation.

I don't think a cancellation token is necessary here. I think it's sufficient to have the executor task actively close its channel to the celestia/sequencer tasks. Then you get those 2 tasks to shut down for free while maintaining a consistent view. Only need to remember to not kill the entire conductor if one of those goes down

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a field here to indicate to conductor whether or not it should stop or restart on having reached the stop height?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's strictly necessary, but I wonder if it may be better UX. I'll confer with @mycodecrafting to see what he thinks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surfacing offline discussion with @mycodecrafting that this would be better UX. Noting that the changes required in Geth to accomplish this involve peeking at the next fork in the Geth genesis to see whether halt: true, and then including that information in GenesisInfo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduced this field in 680d984 and implemented the logic for it in 02d04d7

&mut self,
block: ReconstructedBlock,
) -> eyre::Result<Option<StopHeightExceded>> {
if self.is_firm_block_height_exceded(&block) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also close the firm block channel here, similar to how we close the soft block channel?

@ethanoroshiba
Copy link
Contributor Author

Closing in favor of #1928.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
allow-breaking-proto cd ci issues that are related to ci and github workflows conductor pertaining to the astria-conductor crate documentation Improvements or additions to documentation override-freeze proto pertaining to the Astria Protobuf spec
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Forma migration
3 participants