Skip to content

rpc/jsonrpc: filter redundant collapse-sibling nodes from debug_executionWitness#21484

Closed
awskii wants to merge 9 commits into
mainfrom
awskii/witness-collapse-sibling-filter
Closed

rpc/jsonrpc: filter redundant collapse-sibling nodes from debug_executionWitness#21484
awskii wants to merge 9 commits into
mainfrom
awskii/witness-collapse-sibling-filter

Conversation

@awskii
Copy link
Copy Markdown
Member

@awskii awskii commented May 28, 2026

Fixes the "more nodes than Geth" side of #21312.

Problem

debug_executionWitness emits collapse-sibling nodes that Geth keeps inline as a 32-byte hash inside the parent branch. Erigon's HPH witness encoding emits the sibling as an explicit node so the parent branch hashes correctly; standard MPT does not. Result: +1..+18 nodes per block on the issue's test table.

Fix

Filter at the output boundary inside buildWitnessTrie (rpc/jsonrpc/debug_execution_witness.go). A collapse-sibling node — identified via siblingPaths recorded by detectCollapseSiblings — is dropped iff its branch has no expanded child (trie.HasExpandedChild == false). Branches on the path to an independently-accessed key have an expanded child and are kept, so load-bearing siblings the verifier must descend into stay in the witness.

Two helpers in execution/commitment/trie/trie.go:

  • HasExpandedChild(n Node) bool — branch-collapse classifier.
  • (*Trie).NodeHash(n Node) (common.Hash, error) — mirrors RLPEncode's dedup identity (keccak of hashChildren(n,0) with the same valueNodesRLPEncoded flag), so drop-set keys match result.State's identity exactly.

verifyWitnessStateless runs on the filtered result.State as a hard error: any wrongly dropped node fails verification. No silent fallback. toWitnessTrie / HPH conversion is untouched. Pre-Amsterdam re-execution path unchanged.

DFS order preserved; root (encoded[0]) is never dropped.

Tests

rpc/jsonrpc/debug_witness_collapse_test.go:

  • TestExecutionWitnessCollapseSiblingAccount — deterministic 2→1 account collapse, redundant sibling absent, verify passes.
  • TestExecutionWitnessCollapseSiblingStorage — same shape for storage trie.
  • TestExecutionWitnessCollapseSiblingAccessedSubtree — sibling whose subtree is independently accessed is retained.
  • TestExecutionWitnessCollapseSiblingLoadBearing — sibling required for re-rooting is retained (over-filter guard).
  • TestExecutionWitnessNoCollapseUnchanged — block with no collapses, filter is a strict no-op.

Out of scope

  • A zkevm-corpus run shows ~4,128 fixtures where Erigon returns FEWER nodes than EEST. That's a separate bug — a filter can only remove, not add — not addressed here.
  • Exact node-set parity on Amsterdam blocks additionally needs a BAL-driven access-set provider; separate PR.
  • The zkevm test suite is a separate branch; used here as a local cross-check, not a dependency.

awskii added 9 commits May 27, 2026 22:03
… probe

Task 1 of the collapse-sibling filter plan. Authors bespoke chains via
execmoduletester + blockgen with deterministic shared-prefix key search
(self-validating; addresses start above the precompile range). Adds:

- TestExecutionWitnessCollapseSiblingAccount: account-trie 2-child branch
  collapsing onto a 2-account subtree (redundant surviving sibling).
- TestExecutionWitnessCollapseSiblingStorage: storage-trie analogue.
- TestExecutionWitnessNoCollapseUnchanged: no-op (already minimal).

Minimality is asserted criterion-free via probeRedundantNodes (remove each
non-root node, re-run execBlockStatelessly; a removal that still reproduces
the root means the node was redundant). Collapse cases are RED for the right
reason (account removable node [5], storage [6]); no-collapse passes. The
2-leaf collapse is load-bearing, not redundant -> reserved for Task 4.
…tness

Drop collapse-sibling witness nodes that resolve to a branch
(*FullNode/*DuoNode): the stateless verifier folds such a sibling inline
by hash on re-root (trie.convertToShortNode), so its preimage is never
read. Leaf/extension/account siblings stay (load-bearing). Adds
(*trie.Trie).NodeHash so the drop-set keys match RLPEncode's dedup hash;
filter preserves DFS order and never drops the root (index 0).

Account collapse 6->5 nodes, storage 8->7; verify guardrail unchanged.
Task 4: a genuine 2-leaf 2->1 collapse whose surviving sibling is a
ShortNode leaf the verifier must read to promote. The filter must keep it;
the test decodes result.State and asserts the sibling path resolves to a
concrete leaf node (not a bare HashNode), pins the node count, and probes
for minimality with the verify guardrail on.
…Witness

The collapse-sibling filter dropped any sibling resolving to a branch node on
the theory the verifier always folds it inline by hash. That is unsound when a
key below the sibling subtree is independently accessed in the same block: the
verifier must descend through the branch to reach the accessed leaf, so its
preimage is load-bearing. Dropping it produced a witness that panics during
stateless re-root (unresolved HashNode), regressing valid blocks that succeed
on main.

Only drop a collapse-sibling branch whose subtree is entirely unexpanded (all
children are bare hashes); a branch with any expanded child is kept. Adds a
trie.HasExpandedChild helper (DuoNode internals are unexported) and a
reproduction test for the accessed-subtree shape.
…s with shipped HasExpandedChild guard

Tighten HasExpandedChild docstring to the conservative invariant (drop the
scenario-specific over-claim), and correct the archived plan's DECISION/Proven-scope
to document the expanded-child guard that the final fix added after archival.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces debug_executionWitness’s state output to better match Geth by filtering redundant “collapse-sibling” trie nodes at the witness output boundary, while keeping stateless verification as a hard guardrail.

Changes:

  • Add an output filter in buildWitnessTrie that drops collapse-sibling branch nodes whose subtree is entirely unexpanded.
  • Introduce trie helpers to (a) classify whether a branch has any expanded child and (b) hash nodes with the same identity used by RLPEncode deduplication.
  • Add focused regression tests covering account/storage collapse, accessed-subtree retention, load-bearing sibling retention, and strict no-op behavior when no collapses occur.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
rpc/jsonrpc/debug_witness_collapse_test.go Adds deterministic tests to assert minimal witness output and guard against over-filtering.
rpc/jsonrpc/debug_execution_witness.go Filters redundant collapse-sibling nodes from the encoded witness node list before returning.
execution/commitment/trie/trie.go Adds HasExpandedChild and (*Trie).NodeHash helpers used by the witness filter.
docs/plans/completed/20260527-witness-collapse-sibling-filter.md Documents the rationale, criterion, and test coverage for the filter.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1823 to +1829
expanded := func(c Node) bool {
switch c.(type) {
case nil, *HashNode:
return false
default:
return true
}
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment on lines +1823 to +1829
expanded := func(c Node) bool {
switch c.(type) {
case nil, *HashNode:
return false
default:
return true
}
Copy link
Copy Markdown
Member

@yperbasis yperbasis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Addresses the "+1..+18 nodes vs Geth" side of #21312 by filtering collapse-sibling branch nodes from the witness output. Architecturally clean:

  • Filter at the output boundary in buildWitnessTrie — no changes to HPH conversion (toWitnessTrie).
  • Drop criterion: a collapse-sibling that resolves to a *FullNode/*DuoNode with no expanded child (HasExpandedChild == false) — i.e. an entirely-unfolded subtree the standard verifier folds inline by hash.
  • Hard safety net: verifyWitnessStateless runs on result.State after the filter assigns it (rpc/jsonrpc/debug_execution_witness.go:695,704), so any wrongly-dropped node fails stateless re-execution.
  • NodeHash identity: uses hashChildren(n, 0) + Keccak256Hash with the same valueNodesRLPEncoded flag as RLPEncode (trie.go:1457,1468,1480,1495), so drop-set keys match encoded[i]'s dedup identity exactly.
  • Tests pass locally (5/5 in rpc/jsonrpc, ~1.2s).

Findings

F1 — HasExpandedChild only recognizes *HashNode, not value-typed HashNode (Copilot already flagged)

execution/commitment/trie/trie.go:1820-1842:

case nil, *HashNode:
    return false

The trie package uses both pointer and value forms — proof.go:142,244, witness_builder.go:101,103, hack.go:49,61 produce HashNode{...} (value), and resolveHashNodes at trie.go:1774,1783 explicitly handles both. The PR's specific caller is safe today: sdCtx.Witness()toWitnessTrie constructs HashNodes only via trie.NewHashNode(...) (pointer) — see hex_patricia_hashed.go:1435,1542,1609,1620,1639,1645,1697 — and decodeTrieRef returns &HashNode{...} (pointer). So the filter does fire in practice.

But HasExpandedChild is exported from the trie package; defensive programming says match the established two-form pattern. Trivial:

case nil, HashNode, *HashNode:
    return false

One-token change, no downside. Worth taking before merge.

F2 — Failing CI mainnet-rpc-integ-tests is unrelated

The failures are in erigon_getLogsByHash, eth_getProof, eth_simulateV1, eth_getBlockReceipts — nothing under debug_*. The same workflow failed on main at bea27b3 with the same set, so it's a pre-existing red on the merge target. Informational.

F3 — Test setup mutates package-global statecfg.Schema

debug_witness_collapse_test.go:190-192 saves/restores statecfg.Schema around statecfg.EnableHistoricalCommitment(). Works because none of the new tests call t.Parallel(); if anyone adds t.Parallel() later, the four tests will race on the global. Project-wide pattern, calling it out for awareness.

F4 — Archived plan doc carries forensic detail (nit)

docs/plans/completed/20260527-witness-collapse-sibling-filter.md includes a hardcoded user path (/Users/awskii/org/wrk/erigon-exec-witness) and branch name references. CLAUDE.md discourages forensic detail in source; the docs/plans/completed/ archive seems to be an established exception. Not a blocker.

F5 — Tests pin exact witness node counts (informational)

require.Equal(t, 5, len(result.State), ...) etc. — intentional given the tests are named "minimal"; just note these will need adjustment if commitment internals legitimately change the encoded set.

Approve

The design is sound, the safety net is genuine, and the test coverage is honest (positive case + over-filter regression + no-op guard + load-bearing leaf + storage variant). F1 is the only thing I'd want before merging — one-line change to make the public helper robust to other call sites. Everything else is informational.

@awskii
Copy link
Copy Markdown
Member Author

awskii commented May 28, 2026

Closed in favor of #21491 - better zkEVM passing rate (89% here vs 94% there)

@awskii awskii closed this May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants