Configure timeout based on rollup configs.
This PR routes orchestrator enqueues through a serial queue, giving the event loop a chance to interleave other async work.
## Summary
Transactions whose gas limits exceed the block or checkpoint mana limit
are currently silently dropped during block building, causing users'
`.wait()` calls to hang indefinitely. This PR adds early rejection at
the gossip, RPC, and pending pool entry points by validating both L2 and
DA gas limits against protocol limits and operator-configured validator
block gas limits.
## Changes
### Promote `rollupManaLimit` to `L1RollupConstants`
- Added `rollupManaLimit: number` to the `L1RollupConstants` type,
`EmptyL1RollupConstants` (defaults to `Number.MAX_SAFE_INTEGER`), and
the Zod schema (see the sketch after this list)
- Removed the ad-hoc `& { rollupManaLimit?: number }` extensions from
the archiver, sequencer, and block-builder types — they now get it from
the base type
- Updated `EpochCache.create()` and
`RollupContract.getRollupConstants()` to fetch and include
`rollupManaLimit` from L1
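A minimal sketch of the type and schema change, assuming a simplified shape (the real `L1RollupConstants` carries many more fields than shown here):

```typescript
import { z } from 'zod';

// Sketch only: the real L1RollupConstants type has many more fields.
export type L1RollupConstants = {
  // ...existing constants elided...
  /** Mana (L2 gas) limit per block, read from the rollup contract on L1. */
  rollupManaLimit: number;
};

// "No limit" placeholder until the real value is fetched from L1.
export const EmptyL1RollupConstants: Pick<L1RollupConstants, 'rollupManaLimit'> = {
  rollupManaLimit: Number.MAX_SAFE_INTEGER,
};

// Corresponding Zod schema extension.
export const L1RollupConstantsSchema = z.object({
  // ...existing fields elided...
  rollupManaLimit: z.number(),
});
```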
### Validate L2 and DA gas limits at tx entry points
- `GasLimitsValidator` now accepts `{ rollupManaLimit?, maxBlockL2Gas?,
maxBlockDAGas?, bindings? }` (see the sketch after this list):
- Effective L2 limit = `min(MAX_PROCESSABLE_L2_GAS, rollupManaLimit,
maxBlockL2Gas)`
- Effective DA limit = `min(MAX_PROCESSABLE_DA_GAS_PER_CHECKPOINT,
maxBlockDAGas)`
- `rollupManaLimit` applies to L2 gas only (not DA)
- `GasTxValidator` forwards these options to its inner
`GasLimitsValidator`
- All factory functions
(`createFirstStageTxValidationsForGossipedTransactions`,
`createTxValidatorForAcceptingTxsOverRPC`,
`createTxValidatorForTransactionsEnteringPendingTxPool`) accept and pass
through the limits
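A minimal sketch of the effective-limit computation, using the option and constant names from this PR but a hypothetical helper around them:

```typescript
// Hypothetical helper illustrating the min-of-all-limits behavior described above.
// MAX_PROCESSABLE_L2_GAS and MAX_PROCESSABLE_DA_GAS_PER_CHECKPOINT are the protocol ceilings.
type GasLimitOptions = {
  rollupManaLimit?: number; // L1 rollup mana limit (applies to L2 gas only)
  maxBlockL2Gas?: number;   // operator-configured validator block L2 gas limit
  maxBlockDAGas?: number;   // operator-configured validator block DA gas limit
};

function effectiveLimits(opts: GasLimitOptions, maxProcessableL2Gas: number, maxProcessableDAGas: number) {
  const l2 = Math.min(maxProcessableL2Gas, opts.rollupManaLimit ?? Infinity, opts.maxBlockL2Gas ?? Infinity);
  const da = Math.min(maxProcessableDAGas, opts.maxBlockDAGas ?? Infinity);
  return { l2, da };
}

// A tx is rejected early when either of its gas limits exceeds the effective limit;
// a tx sitting exactly at the limit is still accepted.
function exceedsEffectiveLimits(
  txL2GasLimit: number,
  txDAGasLimit: number,
  opts: GasLimitOptions,
  maxL2: number,
  maxDA: number,
): boolean {
  const { l2, da } = effectiveLimits(opts, maxL2, maxDA);
  return txL2GasLimit > l2 || txDAGasLimit > da;
}
```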
### Use validator block gas limits for tx validation
The existing `VALIDATOR_MAX_L2_BLOCK_GAS` and
`VALIDATOR_MAX_DA_BLOCK_GAS` env vars (introduced in #21060 for block
proposal validation) are now also used for tx acceptance validation.
Derived block limits (from the sequencer timetable) are only used for
proposals — not for validation.
- **P2P config**: Added `validateMaxL2BlockGas` and
`validateMaxDABlockGas` fields reading the existing
`VALIDATOR_MAX_L2_BLOCK_GAS` / `VALIDATOR_MAX_DA_BLOCK_GAS` env vars (see the sketch after this list)
- **Gossip path** (`libp2p_service.ts`): Passes `rollupManaLimit` from
L1 constants and validator block gas limits from P2P config
- **RPC path** (`aztec-node/server.ts`): Passes `rollupManaLimit` from
L1 constants and validator block gas limits from node config
- **Pending pool migration** (`client/factory.ts`): Passes
`rollupManaLimit` and validator block gas limits from config
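A minimal sketch of how these config fields could be read from the env vars; the field names match the PR, while the parsing helper is an assumption:

```typescript
// Hypothetical config mapping: field names follow the PR, parsing details are assumed.
export type ValidatorBlockGasConfig = {
  validateMaxL2BlockGas?: number;
  validateMaxDABlockGas?: number;
};

export function validatorBlockGasConfigFromEnv(env: NodeJS.ProcessEnv = process.env): ValidatorBlockGasConfig {
  const parse = (value?: string) => (value === undefined || value === '' ? undefined : Number(value));
  return {
    validateMaxL2BlockGas: parse(env.VALIDATOR_MAX_L2_BLOCK_GAS),
    validateMaxDABlockGas: parse(env.VALIDATOR_MAX_DA_BLOCK_GAS),
  };
}
```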
### Unit tests
Tests in `gas_validator.test.ts` covering (a representative case is sketched after this list):
- Rejection when exceeding `rollupManaLimit` (L2), `maxBlockL2Gas`, or
`maxBlockDAGas`
- Min-of-all-limits behavior (L2)
- Acceptance at exactly the effective L2 and DA limits
- Fallback to `MAX_PROCESSABLE_L2_GAS` /
`MAX_PROCESSABLE_DA_GAS_PER_CHECKPOINT` when no additional limits are
set
- Forwarding L2 and DA limits through `GasTxValidator`
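A representative test case, sketched with hypothetical mock helpers and result shape:

```typescript
// Hypothetical helper (mockTxWithGasLimits) and result shape; the real tests use
// the package's own test utilities.
it('rejects a tx whose L2 gas limit exceeds rollupManaLimit', async () => {
  const validator = new GasLimitsValidator({ rollupManaLimit: 1_000_000 });
  const tx = mockTxWithGasLimits({ l2GasLimit: 1_000_001, daGasLimit: 0 });
  await expect(validator.validateTx(tx)).resolves.toMatchObject({ result: 'invalid' });
});
```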
## Notes
- When `VALIDATOR_MAX_L2_BLOCK_GAS` / `VALIDATOR_MAX_DA_BLOCK_GAS` are
not set, only the protocol-level limits (`MAX_PROCESSABLE_L2_GAS`,
`MAX_PROCESSABLE_DA_GAS_PER_CHECKPOINT`) and `rollupManaLimit` (L2 only)
are enforced
- No new env vars — reuses the existing `VALIDATOR_MAX_L2_BLOCK_GAS` and
`VALIDATOR_MAX_DA_BLOCK_GAS` from #21060
- ~20 test files updated to include `rollupManaLimit` in their
`L1RollupConstants` objects
Fixes A-68
Fixes A-639
```
/mnt/user-data/sean/docs/aztec-packages/yarn-project/ivc-integration/src/chonk_browser.test.ts
92:21 warning Caution: `puppeteer` also has a named export `launch`. Check if you meant to write `import {launch} from 'puppeteer'` instead import-x/no-named-as-default-member
```
Fixes this warning reported by `yarn lint`.
The default was changed in #21235 so this flag needs to be manually turned on.
🤖 Auto-merge enabled after 4 hours of inactivity. This PR will be merged automatically once all checks pass.
## Summary
- Fixes a bug in `AztecNodeService.findLeavesIndexes` where the block number lookup used the wrong array index whenever `findLeafIndices` returned `undefined` gaps, causing misaligned results (wrong block numbers/hashes mapped to wrong leaves).
- Replaces array-index-based lookups with `Map`s (`indexToBlockNumber`, `blockNumberToHash`) to avoid the misalignment entirely.
- Adds 7 unit tests covering: all found, none found, mixed found/not-found (the bug case), multiple leaves in same block, empty input, and error cases.

## Test plan
- [x] Unit tests pass: `yarn workspace @aztec/aztec-node test src/aztec-node/server.test.ts -t 'findLeavesIndexes'`
- [x] Build passes: `yarn build`
- [x] Lint passes: `yarn lint aztec-node`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
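A minimal sketch of the Map-based lookup pattern described above, with a hypothetical db interface standing in for the real archiver/world-state APIs:

```typescript
// Illustrative only: the db interface and helper names here are hypothetical stand-ins.
// The point is keyed lookups instead of positional alignment, since findLeafIndices
// returns undefined for leaves it cannot find.
type LeafDb = {
  findLeafIndices(treeId: number, leaves: Buffer[]): Promise<(bigint | undefined)[]>;
  getBlockNumberForLeafIndex(treeId: number, index: bigint): Promise<number>;
  getBlockHash(blockNumber: number): Promise<string>;
};

async function findLeavesWithBlockInfo(db: LeafDb, treeId: number, leaves: Buffer[]) {
  const indices = await db.findLeafIndices(treeId, leaves);

  const indexToBlockNumber = new Map<bigint, number>();
  const blockNumberToHash = new Map<number, string>();
  for (const index of indices) {
    if (index === undefined) continue;
    const blockNumber = await db.getBlockNumberForLeafIndex(treeId, index);
    indexToBlockNumber.set(index, blockNumber);
    if (!blockNumberToHash.has(blockNumber)) {
      blockNumberToHash.set(blockNumber, await db.getBlockHash(blockNumber));
    }
  }

  // Each result is keyed by its own leaf index, so undefined gaps cannot shift the mapping.
  return indices.map(index => {
    if (index === undefined) return undefined;
    const blockNumber = indexToBlockNumber.get(index)!;
    return { index, blockNumber, blockHash: blockNumberToHash.get(blockNumber) };
  });
}
```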
If we didn't register a block proposal handler on the libp2p service, we'd log that validation had failed. This commit increases the log level for an unregistered handler, which should not happen, and suppresses the validation-failed error in that case.
…21320) Drops the default MAX_PARALLEL_BLOCKS from 32 to 4 to make better use of available CPUs.
In #20846 we showed that hash.js's sha256 implementation is very slow on large buffers, and we know we're pushing hundreds of KB to megabytes to the broker every second, so this speeds up proving job ID generation a lot. I have not used the async variant through `crypto.subtle` because ID generation is synchronous. The sync API is much faster than the async one but won't be capable of as high throughput.
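A minimal sketch of the synchronous native hashing, using Node's `createHash` from `crypto`; the function name and hex encoding are illustrative rather than the exact implementation:

```typescript
import { createHash } from 'crypto';

// Synchronous native SHA-256: usable here because proving job ID generation is synchronous.
// (crypto.subtle.digest is async and was deliberately not used.)
export function generateProvingJobId(payload: Buffer): string {
  return createHash('sha256').update(payload).digest('hex');
}
```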
…21336) Adds a wait condition in the block-proposal handler so it waits for the archiver L1 synchronizer to have synced to slot `N-1` before processing a block proposal for slot `N`. This tackles race conditions where a checkpoint prune has yet to be processed by a validator, but the proposer was too fast and sent a block proposal for that same block number, which gets rejected by the validator with a block-number-already-exists error. In addition, it renames the archiver's `getL2SlotNumber`/`getL2EpochNumber` to `getSyncedL2SlotNumber`/`getSyncedL2EpochNumber` across the codebase, so that they return the last L2 slot/epoch that has been completely synced from L1.
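A minimal sketch of the wait condition as a polling helper; `getSyncedL2SlotNumber` is the renamed archiver method from this PR, everything else is illustrative:

```typescript
// Hypothetical polling loop: a proposal for slot N is only processed once the archiver
// reports it has synced L1 up to slot N - 1. Timeout and poll interval are illustrative.
async function waitForArchiverSync(
  archiver: { getSyncedL2SlotNumber(): Promise<bigint> },
  proposalSlot: bigint,
  timeoutMs = 5_000,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while ((await archiver.getSyncedL2SlotNumber()) < proposalSlot - 1n) {
    if (Date.now() > deadline) {
      throw new Error(`Archiver did not sync to slot ${proposalSlot - 1n} in time`);
    }
    await new Promise(resolve => setTimeout(resolve, 100));
  }
}
```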
…21279) Otherwise a tx could be sent with a max priority fee greater than its max fee, which cannot be realized (see `computeEffectiveGasFees`), and get a better priority in the mempool than it should. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
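A minimal sketch of the capping idea, under the assumption that mempool priority was previously derived from the raw max priority fee; the real logic lives in `computeEffectiveGasFees`:

```typescript
// Assumption: mempool ordering previously used the raw max priority fee. Capping it by
// the max fee ensures a tx cannot claim more priority than it can actually pay.
function priorityFeeForOrdering(maxPriorityFeePerGas: bigint, maxFeePerGas: bigint): bigint {
  return maxPriorityFeePerGas < maxFeePerGas ? maxPriorityFeePerGas : maxFeePerGas;
}
```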
Fixes [A-551](https://linear.app/aztec-labs/issue/A-551/properly-compute-finalized-block)

## Description
Replaces the heuristic finalized block computation (`provenBlock - 2 * epochDuration`) with L1 finality. On each archiver sync iteration, we now:
1. Fetch the finalized L1 block via `getBlock({ blockTag: 'finalized' })`
2. Query the rollup contract for the proven checkpoint number at that L1 block
3. Persist that as the finalized checkpoint, from which the finalized L2 block number is derived

Failures in this step are caught and logged as warnings so they don't disrupt the rest of the sync loop (e.g. if the RPC node can't serve state at the finalized block).

## Changes
- `RollupContract.getProvenCheckpointNumber` now accepts an optional `{ blockNumber }` to query historical contract state
- `BlockStore` stores a `lastFinalizedCheckpoint` singleton and derives `getFinalizedL2BlockNumber` from it instead of the old arithmetic heuristic
- `ArchiverL1Synchronizer` gains `updateFinalizedCheckpoint()`, called every sync iteration
- `KVArchiverDataStore` constructor no longer takes `l1Constants` (the `epochDuration` it was used for is no longer needed)
- `FakeL1State` updated to support `blockTag: 'finalized'` and `getProvenCheckpointNumber` with a block number, enabling new sync tests
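A minimal sketch of the per-iteration update, using viem's `getBlock({ blockTag: 'finalized' })`; the rollup, store, and logger interfaces shown are simplified stand-ins:

```typescript
import type { PublicClient } from 'viem';

// Runs once per archiver sync iteration; failures are logged and swallowed so they
// do not disrupt the rest of the sync loop.
async function updateFinalizedCheckpoint(
  publicClient: PublicClient,
  rollup: { getProvenCheckpointNumber(opts?: { blockNumber?: bigint }): Promise<bigint> },
  store: { setFinalizedCheckpoint(checkpoint: bigint): Promise<void> }, // hypothetical store method
  logger: { warn(msg: string): void },
): Promise<void> {
  try {
    const finalized = await publicClient.getBlock({ blockTag: 'finalized' });
    const checkpoint = await rollup.getProvenCheckpointNumber({ blockNumber: finalized.number });
    await store.setFinalizedCheckpoint(checkpoint);
  } catch (err) {
    logger.warn(`Failed to update finalized checkpoint: ${err}`);
  }
}
```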
…21361)
## Summary
- Fixes yarn build error (`TS2554: Expected 1-2 arguments, but got 3`) in `archiver-misc.test.ts`
- The `KVArchiverDataStore` constructor was changed to accept only `(db, logsMaxPageSize?)` but the test still passed a third `{ epochDuration }` argument

## Test plan
- CI build should pass with this fix

ClaudeBox log: https://claudebox.work/s/b321e7afd3132cf4?run=1
revert l2 slot time 72 -> 36.
…21367) The archiver block store checks that blocks added as proposed are not already checkpointed, and fails if so. But this can still happen if the processing of a block proposal is too slow and the checkpointed data from L1 comes in first. Still, if the proposed block matches the checkpointed one, we should not err. In a separate commit, simplifies the implementation of `addProposedBlock` so it takes only one block at a time, which is the only usage in the codebase.
Despite the request in claude.md, claude still tries to do this. It suggested adding a rule to stress this, so here it goes.
## Summary
- Fixed flaky test "should stop after max retry attempts" in `reqresp.test.ts`
- The test was timing-dependent: with `maxRetryAttempts=3` (default), rate-limited requests were retried on swapped peers after the GCRA rate limiter leaked tokens, causing all 12 requests to succeed instead of the expected 10
- Fix: set `maxRetryAttempts=1` so only one pass occurs, and filter the sparse response array to compare only successful entries

## Test plan
- [x] All 15 tests in `reqresp.test.ts` pass
- [x] Test is now deterministic, with no timing dependency on rate limiter token replenishment

Full analysis: https://gist.github.com/AztecBot/14459a6a0dea94a175c536f9ac3802b8
ClaudeBox log: https://claudebox.work/s/147f7fe7916437d0?run=2
…x bump loop truncation (#21323) The blob `validateBlobs` estimateGas call was intermittently failing with "max fee per blob gas less than block blob gas fee" because it computed a precise maxFeePerBlobGas that could go stale between the getBlobBaseFee RPC call and the estimateGas RPC call. Since gas estimation is read-only, we now use a 2x buffer to pass EIP-4844 validation. Also fixes integer truncation in the base fee bump loop, where a ceiling is needed to ensure fees increase at small values (e.g. 1 wei). Co-authored-by: danielntmd <danielntmd@nethermind.io>
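A minimal sketch of both fixes, with illustrative helper names and an illustrative bump percentage:

```typescript
// For gas estimation only: over-provision maxFeePerBlobGas so EIP-4844 validation passes
// even if the blob base fee moves between the getBlobBaseFee and estimateGas RPC calls.
function maxFeePerBlobGasForEstimation(blobBaseFee: bigint): bigint {
  return blobBaseFee * 2n;
}

// Ceiling division so that bumping a tiny fee (e.g. 1 wei) still produces an increase.
// The 12.5% numerator/denominator here are placeholders, not the real bump factor.
function bumpFee(fee: bigint, numerator = 1125n, denominator = 1000n): bigint {
  return (fee * numerator + denominator - 1n) / denominator;
}
```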
## Summary
Fixes flaky `deploy_method.test.ts` failure on `merge-train/spartan` caused by interaction with PR #21279 (priority fee capping). The test used `GasFees(1n, 0n)` to give the deploy tx higher priority, but DA gas fees are zero in the test environment. Since priority fees are now capped by `maxFeesPerGas`, `min(0, 1) = 0` made both txs have equal priority, so the deploy tx was no longer guaranteed to be ordered first. Switched to `GasFees(0n, 1n)` so the L2 priority fee is effective (L2 gas fees are non-zero).

Full analysis: https://gist.github.com/AztecBot/6ac6f06f68d7507d726c596a67ae350b

## Test plan
- All 11 tests in `deploy_method.test.ts` pass locally (ran twice)

ClaudeBox log: https://claudebox.work/s/e9857814f97604f8?run=3
Ref: A-634

## The test is set up this way:
- Block proposers are connected via gossip between each other.
- All other nodes drop all gossip messages.
- Each proposer gets a part of the entire tx set. In a normal scenario they exchange these txs via gossip, so every proposer has every tx.

## What happened:
- P2 sends Tx2 to P1; this send succeeds.
- P1 sends Tx1 to P2; this send fails.
- Tx2 gets mined by P1.
- P2 proposes an empty block because the only tx it had was mined by P1.
- This is where the test ends and we start waiting for all txs to get mined (including Tx1).

## Changes
I didn't find any evidence for why the gossip failure happened. This PR adds more logs to the gossip pipeline to try to find why this failure occurred. The flake could not be reproduced after hundreds of test runs.

Also found an unrelated flake: after transactions are mined, we query checkpoints, but the checkpoint might not yet be published to L1 and synced by the archiver, causing the assertion to fail.
BEGIN_COMMIT_OVERRIDE
fix: (A-623) increase committee timeout in scenario smoke test (#21193)
feat: orchestrator enqueues via serial queue (#21247)
feat: rollup mana limit gas validation (#21219)
fix: make e2e HA test more deterministic (#21199)
chore: fix chonk_browser lint warning (#21265)
chore: deploy SPONSORED_FPC in test networks (#21254)
fix: (A-635) e2e bot flake on nonce mismatch (#21288)
chore: deflake duplicate attestations and proposals slash tests (#21294)
fix(sequencer): fix log when not enough txs (#21297)
chore: send env var to pods (#21307)
fix: Simulate gas in n tps test. Set min txs per block to 1 (#21312)
fix: update dependabot dependencies (#21238)
test: run nightly bench of block capacity (#20726)
fix: update block_capacity test to use new send() result types (#21345)
fix(node): fix index misalignment in findLeavesIndexes (#21327)
fix(log): do not log validation error if unregistered handler (#21111)
fix: limit parallel blocks in prover to max AVM parallel simulations (#21320)
fix: use native sha256 to speed up proving job id generation (#21292)
chore: remove v4-devnet-1 (#21044)
fix(validator): wait for l1 sync before processing block proposals (#21336)
fix(txpool): cap priority fee with max fees when computing priority (#21279)
chore: Properly compute finalized block (#21156)
fix: remove extra argument in KVArchiverDataStore constructor call (#21361)
chore: revert l2 slot time 72 -> 36 on scenario network (#21291)
fix(archiver): do not error if proposed block matches checkpointed (#21367)
fix(claude): rule to not append echo exit (#21368)
chore: reduce severity of errors due to HA node not acquiring signature (#21311)
fix: make reqresp batch retry test deterministic (#21322)
fix: (A-643) add buffer to maxFeePerBlobGas for gas estimation and fix bump loop truncation (#21323)
fix(e2e): use L2 priority fee in deploy_method same-block test (#21373)
fix: reqresp flake & add logging (#21334)
END_COMMIT_OVERRIDE