test(kt-devnet): add batcher failover kurtosis test #33

Open
wants to merge 4 commits into
base: feat--op-batcher-altda-failover-to-ethda

samlaf
Collaborator

@samlaf samlaf commented Feb 26, 2025

This PR adds a golang failover test which uses the kurtosis devnet as backend.
This test is used to test the new batcher failover behavior from #34.
We will merge this test into that PR, and then that PR into eigenda-develop (our master branch).

Note to reviewers

Please be very harsh on the golang test, as this same framework will be reused to add new tests for future features.

The "proper" way to implement this would probably have been to reuse OP's op-e2e framework and add a way to populate its system (the equivalent of our harness in this PR) from a kurtosis backend. However, that would have taken me a lot more time to figure out, and I feel like it's something the OP team might build at some point. I'd prefer to let them put in the work and figure out how to do it properly; we can potentially move to using that later.

@samlaf samlaf marked this pull request as draft February 26, 2025 18:12
@samlaf samlaf force-pushed the feat--op-batcher-altda-failover-to-ethda branch from c548b50 to 2a78e16 Compare February 26, 2025 18:20
@samlaf samlaf force-pushed the test--add-batcher-failover-kurtosis-test branch from 360b1f6 to c52d9f2 Compare February 26, 2025 18:24
@samlaf samlaf marked this pull request as ready for review February 26, 2025 19:58
}

// Fetches all the batch-inbox posted commitments from blockNum (inclusive) to current block.
func fetchBatcherTxsSinceBlock(gethL1Endpoint string, batchInbox string, blockNum uint64) ([]BatcherTx, error) {
Collaborator

Are there no existing constructs in the state derivation pipeline that could be leveraged for reading batcher txs instead?

// We assume that this enclave is already running.
const enclaveName = "eigenda-memstore-devnet"

// TestFailover tests the failover behavior of the batcher, in response to the proxy returning 503 errors.
Collaborator

should we add a comment describing the parallelism limitations?

// The test then toggles the failover back off and checks that the batcher starts submitting EigenDA batches again.
// The batches inbox transactions are queried via geth's GraphQL API.
// TODO: We will also need to test the failover behavior of the node, which currently doesn't finalize after failover (fixed in https://github.com/Layr-Labs/optimism/pull/23)
func TestFailover(t *testing.T) {
Collaborator

Will we eventually want to add tests ensuring failover for these as well?

  • keccak commitment mode
  • 4844

Collaborator Author

keccak yes probably, I'll add a comment. 4844 I don't know... it's hard to implement and not a priority for me.

// 1. Check that the original commitments are EigenDA
harness.requireBatcherTxsToBeFromLayer(t, sinceBlock, DALayerEigenDA)

// 2. Failover and check that the commitments are now EthDA
Collaborator

EthDA is just calldata here?
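
For context on the question above, here is a minimal sketch of how a batcher tx's DA layer could be classified from its calldata prefix. It assumes the OP Stack convention that a leading 0x00 byte marks frames posted directly as calldata and a leading 0x01 byte marks an alt-DA commitment, followed by a commitment-type byte (0x00 keccak256, 0x01 generic) and, for generic commitments, a DA-layer byte (assumed to be 0x00 for EigenDA). The `DALayer` values and `classifyBatcherTx` are hypothetical names, not the ones used in this PR, and the byte values should be checked against the op-node and proxy versions in use.

```go
package main

import (
	"errors"
	"fmt"
)

// DALayer is a hypothetical label for where a batch was posted.
type DALayer string

const (
	DALayerEthCalldata DALayer = "EthDA (calldata)"
	DALayerKeccak      DALayer = "AltDA (keccak256)"
	DALayerEigenDA     DALayer = "AltDA (EigenDA)"
)

// classifyBatcherTx inspects the leading bytes of a batch-inbox tx's calldata.
// All byte values below are assumptions stated in the lead-in.
// EIP-4844 blob txs carry their data in blobs, not calldata, and are not handled here.
func classifyBatcherTx(calldata []byte) (DALayer, error) {
	if len(calldata) == 0 {
		return "", errors.New("empty calldata")
	}
	switch calldata[0] {
	case 0x00: // frames posted directly as calldata
		return DALayerEthCalldata, nil
	case 0x01: // alt-DA commitment
		if len(calldata) < 2 {
			return "", errors.New("truncated alt-DA commitment")
		}
		switch calldata[1] {
		case 0x00: // keccak256 commitment
			return DALayerKeccak, nil
		case 0x01: // generic commitment, next byte identifies the DA layer
			if len(calldata) < 3 {
				return "", errors.New("truncated generic commitment")
			}
			if calldata[2] == 0x00 {
				return DALayerEigenDA, nil
			}
			return "", fmt.Errorf("unknown DA layer byte 0x%02x", calldata[2])
		}
		return "", fmt.Errorf("unknown commitment type 0x%02x", calldata[1])
	}
	return "", fmt.Errorf("unknown version byte 0x%02x", calldata[0])
}

func main() {
	layer, err := classifyBatcherTx([]byte{0x01, 0x01, 0x00, 0xab})
	fmt.Println(layer, err)
}
```

Under these assumptions, "EthDA" in the test means frames carried in the transaction's calldata; a blob-based path would need a separate check on the transaction type.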

})

// assume kurtosis is running and is at least at block 10 (just deploying the contracts takes more than 10 blocks)
require.GreaterOrEqual(t, harness.testStartL1BlockNum, uint64(10), "Test started too early in the chain")

magic number 10 is used in a lot of places - consider using a variable
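
One way to address this, sketched under the assumption that every use of 10 means "the L1 block by which contract deployment has finished": hoist it into a single named constant and route the check through a helper. The names `minL1BlockAfterDeploy` and `checkStartBlock` are hypothetical and do not appear in the PR.

```go
package main

import "fmt"

// minL1BlockAfterDeploy is a hypothetical name for the value 10 used in the test:
// deploying the contracts alone is said to take more than 10 L1 blocks.
const minL1BlockAfterDeploy uint64 = 10

// checkStartBlock returns an error if the test started before the devnet
// could plausibly have finished deploying its contracts.
func checkStartBlock(testStartL1BlockNum uint64) error {
	if testStartL1BlockNum < minL1BlockAfterDeploy {
		return fmt.Errorf("test started too early in the chain: L1 block %d < %d",
			testStartL1BlockNum, minL1BlockAfterDeploy)
	}
	return nil
}

func main() {
	fmt.Println(checkStartBlock(3))
	fmt.Println(checkStartBlock(12))
}
```

In the test itself the same constant could be passed straight to `require.GreaterOrEqual`, so the threshold is defined in one place.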

// Test Harness, which contains all the state needed to run the tests.
// harness also defines some higher-level "require" methods that are used in the tests.
type harness struct {
enclaveCtx *enclaves.EnclaveContext

not currently used. is it planned for future use, or just there as a template for future tests that might need it?

if batcherTx.daLayer != expectedLayer {
wrongCommitmentsToDiscard++
}
// as soon as we see 3 ethDA commitments, or an EigenDA commitment, we stop

doc seems to describe a previous version of the code which wasn't generalized to expectedLayer

// Get the kurtosis tag
tag := field.Tag.Get("kurtosis")
if tag == "" {
continue // Skip fields without tags

What would be the reason for fields to exist without the tag? Is the idea that you'd add kurtosis fields to production structs?

// NewServiceEndpoint string `kurtosis:"new-service-name,port-name"`
}

func getEndpointsFromKurtosis(enclaveCtx *enclaves.EnclaveContext) (*EnclaveServiceEndpoints, error) {

A doc here would be helpful. It looks like this method replaces the service name in the EnclaveServiceEndpoints struct with a valid localhost endpoint?
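
To make the tag mechanism under discussion concrete, here is a minimal, self-contained sketch of populating a struct from `kurtosis:"service-name,port-name"` tags with reflection. It is not the PR's implementation: the `portLookup` callback is a hypothetical stand-in for the Kurtosis SDK calls that resolve a service's public port, and the service and port names in the tags are invented for illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// EnclaveServiceEndpoints mirrors the shape discussed in the review.
// The service and port names in the tags are invented for this sketch.
type EnclaveServiceEndpoints struct {
	GethL1Endpoint string `kurtosis:"l1-geth,rpc"`
	ProxyEndpoint  string `kurtosis:"da-proxy,http"`
	Untagged       string // no tag, so it is left untouched
}

// portLookup is a hypothetical stand-in for resolving a service's port to a URL.
type portLookup func(service, port string) (string, error)

// populateEndpoints fills every tagged string field of the struct pointed to by dst
// with the endpoint returned by lookup for the tag's "service,port" pair.
func populateEndpoints(dst any, lookup portLookup) error {
	v := reflect.ValueOf(dst)
	if v.Kind() != reflect.Pointer || v.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("dst must be a pointer to a struct, got %T", dst)
	}
	v = v.Elem()
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		field := t.Field(i)
		tag := field.Tag.Get("kurtosis")
		if tag == "" {
			continue // skip fields without tags
		}
		service, port, ok := strings.Cut(tag, ",")
		if !ok {
			return fmt.Errorf("field %s: tag %q is not of the form service,port", field.Name, tag)
		}
		if field.Type.Kind() != reflect.String || !v.Field(i).CanSet() {
			return fmt.Errorf("field %s: must be an exported string", field.Name)
		}
		endpoint, err := lookup(service, port)
		if err != nil {
			return fmt.Errorf("field %s: %w", field.Name, err)
		}
		v.Field(i).SetString(endpoint)
	}
	return nil
}

func main() {
	eps := &EnclaveServiceEndpoints{}
	err := populateEndpoints(eps, func(service, port string) (string, error) {
		return "http://127.0.0.1/" + service + "/" + port, nil
	})
	fmt.Println(eps.GethL1Endpoint, eps.ProxyEndpoint, err)
}
```

Skipping untagged fields lets the struct carry values that do not come from the enclave; returning an error instead would be the stricter choice if every field is expected to map to a service.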

// Update the proxy's memstore config to start returning 503 errors
// Note: we have to GetConfig, update it and then UpdateConfig because the client doesn't implement a "patch" method,
//
// even though the API does support it.

nit: weird formatting here, with newline and tab mid-sentence

3 participants