feat: merge-train/spartan#22235

Open
AztecBot wants to merge 19 commits into next from merge-train/spartan

Conversation

@AztecBot AztecBot commented Apr 1, 2026

BEGIN_COMMIT_OVERRIDE
fix: deflake HA governance voting test by polling for L1/DB convergence (#22220)
feat(world-state): add genesis timestamp support and GenesisData type (#22201)
chore: revert: feat(world-state): add genesis timestamp support and GenesisData type (#22201) (#22255)
fix(archiver): handle duplicate checkpoint from L1 reorg (#22252)
chore: update dashboard (#22260)
fix: remove detailed revert codes (#22274)
chore: use ESO in grafana (#22271)
chore: (A-751) robust response error handling in json-rpc client (#22246)
fix: separate fisherman StatefulSet from rpc-node and stop archiver pollution (#22183)
fix: restore mainnet prover agents to 4 replicas (#22305)
END_COMMIT_OVERRIDE

…ce (#22220)

## Summary

- Fixes a race condition in the HA governance voting test where the DB
records a duty as "signed" (crypto signing done) before the L1
transaction mines, causing `l1VoteCount` to be less than
`uniqueSlots.size`
- Merges the two-step approach (poll for votes > 0, then one-shot
snapshot) into a single convergence loop that retries until L1 vote
count matches DB duties
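
The single convergence loop described above can be sketched roughly as follows. This is an illustration only, not the test's actual code: `pollUntilConverged`, `VoteSnapshot`, and the timeout and interval parameters are hypothetical names; only `l1VoteCount` and `uniqueSlots` come from the description.

```typescript
// Hypothetical snapshot of the two values the test compares.
type VoteSnapshot = { l1VoteCount: number; uniqueSlots: Set<number> };

// Hypothetical helper: re-read both sides until L1 has caught up with the DB,
// instead of polling for "votes > 0" and then taking a one-shot snapshot.
async function pollUntilConverged(
  takeSnapshot: () => Promise<VoteSnapshot>,
  timeoutMs: number,
  intervalMs: number,
): Promise<VoteSnapshot> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const snapshot = await takeSnapshot();
    // Converged once at least one vote exists and every signed duty has mined on L1.
    if (snapshot.l1VoteCount > 0 && snapshot.l1VoteCount === snapshot.uniqueSlots.size) {
      return snapshot;
    }
    if (Date.now() >= deadline) {
      throw new Error(
        `L1 vote count ${snapshot.l1VoteCount} never matched ${snapshot.uniqueSlots.size} DB duties`,
      );
    }
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
}
```

Because both values are read again on every iteration, a duty that is signed in the DB but not yet mined on L1 only delays the loop rather than failing the assertion.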

Fixes [A-891](https://linear.app/aztec-labs/issue/A-891)
spalladino and others added 3 commits April 1, 2026 17:19
…#22201)

## Motivation

The genesis block has `timestamp=0`, which forces a block 1 special case
for transaction expiration validation. Transactions anchored to genesis
get an expiration clamped to `0 + MAX_TX_LIFETIME` (~86400 seconds, i.e. Jan 2
1970), making them impossible to include after block 1. This complicates
e2e test setup by requiring empty block 1 mining and `minTxsPerBlock`
manipulation.
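
To make the arithmetic concrete, here is a minimal sketch of the clamp. It assumes `MAX_TX_LIFETIME` is exactly 86400 seconds (the description only says "~86400"), and `clampExpiration` is a hypothetical helper, not a function from the codebase.

```typescript
// Assumed value; the PR description only gives "~86400".
const MAX_TX_LIFETIME = 86_400;

// Hypothetical helper: a tx may not expire later than its anchor block's
// timestamp plus the maximum lifetime.
function clampExpiration(requestedExpiration: number, anchorTimestamp: number): number {
  return Math.min(requestedExpiration, anchorTimestamp + MAX_TX_LIFETIME);
}

// A tx anchored to a genesis block with timestamp 0, asking for a present-day expiration.
const requested = 1_775_000_000; // arbitrary timestamp in 2026
const clamped = clampExpiration(requested, 0);
const clampedDate = new Date(clamped * 1000).toISOString();
// clamped is 86400 and clampedDate is "1970-01-02T00:00:00.000Z"
```

Any block produced at a real wall-clock time is already past that expiration, which is why a non-zero genesis timestamp removes the special case.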

## Approach

Adds a `genesisTimestamp` parameter that flows through the full world
state stack (C++ → NAPI → TypeScript), allowing the genesis block header
to have a non-zero timestamp. Introduces a `GenesisData` type that
bundles `prefilledPublicData` and `genesisTimestamp`, replacing the two
separate parameters that were threaded everywhere. The e2e setup
automatically passes the current time as the genesis timestamp.

## Changes

- **New `GenesisData` type** (`stdlib/src/world-state/genesis_data.ts`)
— bundles `prefilledPublicData` and `genesisTimestamp` into a single
type
- **C++ world state** — accepts `genesis_timestamp` parameter, uses it
in the genesis block header hash
- **TS world state stack** — `NativeWorldState`,
`NativeWorldStateService`, factory, and synchronizer all take
`GenesisData`
- **Node/CLI** — `AztecNodeService.createAndSync`, `createAztecNode`,
`start_node.ts`, `standby.ts` all use `GenesisData`
- **E2e setup** — passes `genesisTimestamp: Date.now()` to genesis
values; block 1 wait logic preserved
- **~30 e2e/p2p test files** — `prefilledPublicData` references replaced
with `genesis`
- **New e2e test** — `e2e_genesis_timestamp.test.ts` verifies
genesis-anchored txs work after block 1
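
As a rough illustration of the parameter bundling, the sketch below shows two separately threaded values collapsed into one object. The field types and the `PublicDataEntry` element type are assumptions; the description only states that `GenesisData` bundles `prefilledPublicData` and `genesisTimestamp`.

```typescript
// Hypothetical stand-in for the element type of the prefilled public data.
type PublicDataEntry = { slot: string; value: string };

// Assumed shape: only the two field names are taken from the PR description.
type GenesisData = {
  prefilledPublicData: PublicDataEntry[];
  genesisTimestamp: number;
};

// Before: two values threaded separately through every layer (hypothetical function).
function describeGenesisSeparate(
  prefilledPublicData: PublicDataEntry[],
  genesisTimestamp: number,
): string {
  return `${prefilledPublicData.length} prefilled entries, timestamp ${genesisTimestamp}`;
}

// After: callers pass a single object, so adding a field does not change every signature.
function describeGenesis(genesis: GenesisData): string {
  return describeGenesisSeparate(genesis.prefilledPublicData, genesis.genesisTimestamp);
}
```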

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@ludamad ludamad left a comment

🤖 Auto-approved

@AztecBot AztecBot enabled auto-merge April 1, 2026 21:13

AztecBot commented Apr 1, 2026

🤖 Auto-merge enabled after 4 hours of inactivity. This PR will be merged automatically once all checks pass.

…enesisData type (#22201) (#22255)

## Summary

Reverts #22201 which
broke merge-train/spartan.

This cleanly reverts all 48 files changed by the original PR, removing
the `GenesisData` type and `genesisTimestamp` parameter that was
threaded through the world state stack.

ClaudeBox log: https://claudebox.work/s/d0871478ee07a313?run=1
@AztecBot AztecBot added this pull request to the merge queue Apr 2, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 2, 2026
AztecBot and others added 3 commits April 2, 2026 10:13
## Summary

Fixes a mainnet issue where an L1 reorg moved a checkpoint to a
different L1 block, causing the archiver to re-discover it and crash
with `InitialCheckpointNumberNotSequentialError` in an infinite loop.

When `addCheckpoints` receives a checkpoint that's already stored, it
now:
- **Accepts it** if the archive root matches (same content, just
different L1 block)
- **Updates the L1 metadata** (block number, timestamp, hash) and
attestations
- **Throws** if the archive root doesn't match (content mismatch —
genuine conflict)
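
The accept, update, or throw rule above can be sketched as follows. This is a simplified in-memory model under stated assumptions, not the code in `block_store.ts`: the `Checkpoint` and `L1Info` shapes, the `Map`-based store, and the function name `skipOrUpdateStored` are all hypothetical.

```typescript
// Hypothetical shapes for illustration only.
type L1Info = { blockNumber: number; timestamp: number; blockHash: string };
type Checkpoint = { number: number; archiveRoot: string; l1: L1Info; attestations: string[] };

// Handles already-stored checkpoints at the start of a batch and returns
// the remaining checkpoints that still need to be inserted.
function skipOrUpdateStored(store: Map<number, Checkpoint>, batch: Checkpoint[]): Checkpoint[] {
  let i = 0;
  while (i < batch.length) {
    const incoming = batch[i];
    const stored = store.get(incoming.number);
    if (stored === undefined) {
      break; // first checkpoint we have not seen yet
    }
    if (stored.archiveRoot !== incoming.archiveRoot) {
      // Same checkpoint number but different content: a genuine conflict.
      throw new Error(`Checkpoint ${incoming.number} does not match the stored archive root`);
    }
    // Same content that moved to a different L1 block: refresh metadata only.
    stored.l1 = incoming.l1;
    stored.attestations = incoming.attestations;
    i++;
  }
  return batch.slice(i);
}
```

Comparing archive roots is what distinguishes a checkpoint that merely moved to another L1 block from one whose content actually changed.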

## Changes

- **`block_store.ts`**: Added `skipOrUpdateAlreadyStoredCheckpoints`
method that handles duplicate checkpoints at the start of a batch.
Verifies archive roots match and updates L1 info.
- **`fake_l1_state.ts`**: Added `moveCheckpointToL1Block` helper for
simulating L1 reorgs that move checkpoints.
- **`archiver-sync.test.ts`**: Added e2e test `handles L1 reorg that
moves a checkpoint to a later L1 block` in the reorg handling suite.
- **`kv_archiver_store.test.ts`**: Added unit tests for accepting
matching duplicates with updated L1 info, accepting fully-duplicate
batches, and rejecting mismatching duplicates.

## Test plan

- [x] Unit tests: 207 passed in `kv_archiver_store.test.ts`
- [x] Sync tests: 37 passed in `archiver-sync.test.ts`
- [x] Build, format, lint all pass

ClaudeBox log: https://claudebox.work/s/e5247344b8df94ca?run=2

Co-authored-by: PhilWindle <60546371+PhilWindle@users.noreply.github.com>
Adds a panel to track tx mining latency
AztecBot and others added 10 commits April 2, 2026 16:22
)

Improves error handling in the json-rpc client for cases where a reverse
proxy returns a non-standard error response.
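
One way to picture the problem: a reverse proxy may answer with an HTML error page or a bare JSON value instead of a JSON-RPC error object. The sketch below is a hypothetical illustration of defensive response handling, not the client's actual implementation; `interpretResponse` and `JsonRpcOutcome` are invented names.

```typescript
type JsonRpcOutcome =
  | { ok: true; result: unknown }
  | { ok: false; message: string };

// Hypothetical helper: turn an HTTP status and raw body into a result or a readable error.
function interpretResponse(status: number, bodyText: string): JsonRpcOutcome {
  let parsed: unknown;
  try {
    parsed = JSON.parse(bodyText);
  } catch {
    // For example an HTML "502 Bad Gateway" page produced by a proxy.
    return { ok: false, message: `HTTP ${status}: non-JSON response: ${bodyText.slice(0, 80)}` };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, message: `HTTP ${status}: unexpected JSON-RPC payload` };
  }
  const payload = parsed as { result?: unknown; error?: unknown };
  if (payload.error !== undefined) {
    const err = payload.error;
    // The error may be a standard { code, message } object or an arbitrary value.
    const message =
      typeof err === "object" && err !== null && typeof (err as { message?: unknown }).message === "string"
        ? (err as { message: string }).message
        : JSON.stringify(err);
    return { ok: false, message };
  }
  if (status < 200 || status >= 300) {
    return { ok: false, message: `HTTP ${status}` };
  }
  return { ok: true, result: payload.result };
}
```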

Co-authored-by: danielntmd <danielntmd@nethermind.io>
…ollution (#22183)

The mainnet `rpc-node` pod runs in fisherman mode, causing it to push
locally-built blocks into the archiver every slot. This triggers a
conflict-prune-reorg cascade on every block (~every 72 seconds), leaving
the node in a perpetually unstable state and exposing fake blocks to
connected RPC clients — which invalidates any transactions anchored
against them.

**Part 1 — Code fix** (`checkpoint_proposal_job.ts`):
`syncProposedBlockToArchiver` now skips the archiver push when
`fishermanMode` is `true`. The fisherman continues building blocks for
fee analysis and validation, but those blocks no longer pollute the
local archiver.
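
In outline, the guard behaves like the sketch below. This is a simplified stand-in rather than the code in `checkpoint_proposal_job.ts`: the `Block` and `ArchiverSink` types, the function signature, and the boolean return value are assumptions made for illustration.

```typescript
// Hypothetical minimal types for illustration.
type Block = { number: number };
interface ArchiverSink {
  addBlock(block: Block): void;
}

// Returns true if the block was pushed to the archiver, false if it was skipped.
function syncProposedBlockToArchiver(
  block: Block,
  archiver: ArchiverSink,
  fishermanMode: boolean,
): boolean {
  if (fishermanMode) {
    // The fisherman still builds the block for analysis, but never stores it locally.
    return false;
  }
  archiver.addBlock(block);
  return true;
}
```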

**Part 2 — Infrastructure split** (`mainnet.env`, `main.tf`,
`variables.tf`, `deploy_network.sh`): Adds a dedicated
`FISHERMAN_REPLICAS` variable and a new `fisherman` Helm release
(separate StatefulSet). `mainnet.env` now uses `FISHERMAN_REPLICAS=1`
instead of `FISHERMAN_MODE=true` on the rpc-node, so the rpc-node
becomes a clean archiving/RPC node and the fisherman runs as
`mainnet-fisherman-aztec-node-0`.

Fixes
[A-889](https://linear.app/aztec-labs/issue/A-889/separate-fisherman-node-from-rpc-node-and-prevent-archiver-pollution)

AztecBot commented Apr 4, 2026

Flaky Tests

🤖 says: This CI run detected 1 test that failed, but was tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/27b967790da99264): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_p2p/rediscovery.test.ts (132s) (code: 0) group:e2e-p2p-epoch-flakes
