execution: trace EIP-7702 state-gas refund per auth and incarnation #21490
yperbasis wants to merge 6 commits into
…fund per auth and incarnation
Adds a new ERIGON_TRACE_AUTH_REFUND debug flag that, when set, logs:
- one line per authorization processed in verifyAuthorities, with the
(block, txIdx, incarnation), recovered authority, auth.address,
auth.nonce vs state-side nonce, codeEmpty, hasDelegation, exists, and
the per-auth +sgna / +sgab contributions (or the skip reason)
- a TOTAL line per tx with the resulting stateIgasRefund
- a FINALIZE line in the parallel apply loop with which incarnation's
TxResult reached the receipt and its GasUsed / CumulativeGasUsed
Motivation: the eest-spec-blocktests-devnet shard's
test_pointer_resets_an_empty_code_account_with_storage[fork_Amsterdam]
flakes under ERIGON_EXEC3_PARALLEL=true. Local instrumentation confirms
the race shape — tx 2's first incarnation captures wrong stateIgasRefund
(missing StateGasNewAccount = 183_600) because pointer/sender appear
non-existent before tx 0/tx 1 flush their writes, while BAL gives that
incarnation a non-stale NoncePath read — but couldn't be triggered on
non-bare-metal hardware (380+ local iterations, full Amsterdam shard,
Docker linux/amd64 emulated all green). With the flag set on the
failing CI shard, the next flake will record exactly which incarnation
of the receipt slipped past the validator and what its per-auth refund
was, narrowing the residual investigation to ≤1 hour.
No behaviour change when the flag is unset. Adding the flag itself is
observability code, matching the existing TraceXxx / LogHashMismatchReason
pattern (no dedicated tests for any of those either, per established
project convention for env-gated debug flags).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
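The gating described in this commit can be sketched as follows. This is a minimal standalone illustration, not the code in the PR: `envBool`, `authTrace`, `formatAuthTrace`, `traceAuth`, and the exact line format are hypothetical names chosen for the example, and only a few of the traced fields are shown. The flag is read once into a package-level `bool`, so an unset flag costs a single branch per authorization.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strconv"
)

// envBool reads a boolean environment variable, falling back to def when the
// variable is unset or unparsable. Hypothetical helper for this sketch.
func envBool(name string, def bool) bool {
	v, ok := os.LookupEnv(name)
	if !ok {
		return def
	}
	b, err := strconv.ParseBool(v)
	if err != nil {
		return def
	}
	return b
}

// traceAuthRefund is evaluated once at process start.
var traceAuthRefund = envBool("ERIGON_TRACE_AUTH_REFUND", false)

// authTrace carries a subset of the per-authorization fields (illustrative).
type authTrace struct {
	Block       uint64
	TxIdx       int
	Incarnation int
	AuthNonce   uint64
	StateNonce  uint64
	Exists      bool
	Refund      uint64
}

// formatAuthTrace renders one trace line; the layout is an assumption.
func formatAuthTrace(t authTrace) string {
	return fmt.Sprintf("[auth-refund] block=%d tx=%d inc=%d authNonce=%d stateNonce=%d exists=%t refund=%d",
		t.Block, t.TxIdx, t.Incarnation, t.AuthNonce, t.StateNonce, t.Exists, t.Refund)
}

// traceAuth writes the line only when tracing is enabled.
func traceAuth(w io.Writer, enabled bool, t authTrace) {
	if !enabled {
		return
	}
	fmt.Fprintln(w, formatAuthTrace(t))
}

func main() {
	traceAuth(os.Stderr, traceAuthRefund, authTrace{Block: 1, TxIdx: 2, Incarnation: 1, Exists: true, Refund: 183600})
}
```

Running the program with `ERIGON_TRACE_AUTH_REFUND=true` prints one line to stderr; without it, nothing is printed.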
…ests-devnet

Sets ERIGON_TRACE_AUTH_REFUND=true and ERIGON_LOG_HASH_MISMATCH_REASON=true for the blocktests-devnet shard only, the one that has flaked on test_pointer_resets_an_empty_code_account_with_storage[fork_Amsterdam] under parallel exec. The next mismatch fires the per-auth refund trace and the full receipts-on-failure dump, turning the residual diagnosis into a single log read.

Scoped to this one shard because TRACE_AUTH_REFUND emits a line per EIP-7702 auth across all txs; spreading it across every shard would add unnecessary log volume to runs that don't need it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Adds a targeted debug/trace facility to help diagnose flaky receipt-hash mismatches in parallel execution for EIP-7702 SET_CODE_TX scenarios (especially around Amsterdam state-gas refund differences between incarnations), and wires it into CI for the affected EEST shard.
Changes:
- Introduces `TRACE_AUTH_REFUND` (`ERIGON_TRACE_AUTH_REFUND`) debug flag in `common/dbg` and emits per-authorization / per-tx refund traces from `TxnExecutor.verifyAuthorities`.
- Emits a parallel-exec “FINALIZE” trace line indicating which incarnation’s `TxResult` produced the receipt used for the block result.
- Updates the `test-eest-spec` GitHub Actions workflow to enable the new diagnostic flag (and `ERIGON_LOG_HASH_MISMATCH_REASON`) only for the `blocktests-devnet` shard.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| execution/stagedsync/exec3_parallel.go | Adds FINALIZE receipt/incarnation trace line when the auth-refund tracing flag is enabled. |
| execution/protocol/txn_executor.go | Adds per-auth and per-tx totals tracing for Amsterdam state-gas refund contributions in verifyAuthorities. |
| common/dbg/experiments.go | Registers the new TraceAuthRefund env flag. |
| .github/workflows/test-eest-spec.yml | Enables the new tracing flag (and hash mismatch diagnostics) only on the blocktests-devnet shard. |
Splits the misleading single `verified=N` count in the per-tx TOTAL line
into two:
- `recovered=N` — auths past signer recovery (= `len(verifiedAuthorities)`,
which is appended right after step 2 and includes auths that later
skip on the code-empty/delegation check or the nonce check, so its
address still goes into the BAL access list)
- `applied=N` — auths that reached step 6 and contributed to the
refund computation (or to the legacy pre-Amsterdam refund)
For the eest-spec-blocktests-devnet flake the trace is meant to triage,
`applied` is the load-bearing number — it directly answers "how many
auths actually paid into stateIgasRefund," which is the value that
diverges between incarnations. `recovered` is kept because the access-
list-side count is independently useful when reconciling against the
per-auth lines.
Flagged by Copilot review on PR #21490.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
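The distinction between the two counters can be sketched as below. This is a simplified model under stated assumptions, not the actual `verifyAuthorities` logic: the `auth` struct, its boolean fields, and `countAuths` are hypothetical stand-ins for the signer-recovery, code-empty/delegation, and nonce checks named in the commit message.

```go
package main

import "fmt"

// auth models the outcome of each check for one authorization (hypothetical).
type auth struct {
	SignerRecovered bool // signer recovery succeeded
	CodeCheckPassed bool // account code is empty or already a delegation
	NonceMatches    bool // auth nonce equals the state-side nonce
}

// countAuths returns how many auths got past signer recovery (recovered) and
// how many passed every later check and would contribute to the refund (applied).
func countAuths(auths []auth) (recovered, applied int) {
	for _, a := range auths {
		if !a.SignerRecovered {
			continue // skipped before being counted at all
		}
		recovered++
		if !a.CodeCheckPassed || !a.NonceMatches {
			continue // recovered, but skipped before the refund step
		}
		applied++
	}
	return recovered, applied
}

func main() {
	auths := []auth{
		{SignerRecovered: true, CodeCheckPassed: true, NonceMatches: true},
		{SignerRecovered: true, CodeCheckPassed: true, NonceMatches: false},
		{SignerRecovered: false},
	}
	recovered, applied := countAuths(auths)
	fmt.Printf("recovered=%d applied=%d\n", recovered, applied)
}
```

In this model `applied` can never exceed `recovered`, which is a cheap sanity check when reading a TOTAL line against its per-auth lines.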
…FINALIZE trace
`txResult.Version().Incarnation` is dereferenced unconditionally on the
same `txResult` elsewhere in this function (e.g. resultIncarnation at
line ~2540, and the txVersion := res.Version() pattern at lines 2207 /
2276), so the `if rv.TxIndex >= 0 { ... } else { -1 sentinel }` guard
in the FINALIZE trace was inconsistent with the surrounding invariant
and a little misleading to a reader (suggests a real failure mode
worth handling, when none exists at this point in the flow).
Inline the Incarnation read; if TxIndex is ever observed negative
here, every nearby unconditional dereference would fault first and
that's the bug to fix, not a guard to add.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…jsonout

The eest-spec runner pipes cmd/evm's stdout into a temp file, strips [UPPERCASE] log lines via grep, then jq-parses the rest as the JSON test-results array. The previous fmt.Printf(...) trace lines went to stdout: [auth-refund] is lowercase + hyphen, so it survives the grep filter and lands in the JSON payload, breaking jq with:

    jq: parse error: Invalid numeric literal at line 1, column 13

(column 13 is the closing ']' of '[auth-refund]'.)

This made the eest-spec-blocktests-devnet shard fail deterministically as soon as the workflow turned the flag on; see CI Gate run 26596131682, job 78367787351 on this branch.

Switch every TraceAuthRefund Fprintf target to os.Stderr; this matches where Erigon's normal log output already goes, which is exactly why the [EROR]/[INFO] lines never broke the runner. No format change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
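The failure mode above can be reproduced in miniature. The runner's actual grep pattern is not quoted in this PR, so the regular expression below (`^\[[A-Z]+\]`) is an assumption that merely matches the description "strips [UPPERCASE] log lines"; `survivesFilter` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// upperTag approximates the runner's filter: a line that starts with an
// all-uppercase bracketed tag such as [INFO] or [EROR]. Assumed pattern.
var upperTag = regexp.MustCompile(`^\[[A-Z]+\]`)

// survivesFilter reports whether a stdout line would be left in the payload
// that is later handed to jq.
func survivesFilter(line string) bool {
	return !upperTag.MatchString(line)
}

func main() {
	lines := []string{
		"[INFO] starting block test",
		"[auth-refund] tx=2 inc=0",
		`[{"name":"t","pass":true}]`,
	}
	for _, l := range lines {
		fmt.Printf("%q survives=%t\n", l, survivesFilter(l))
	}
}
```

Under this assumed pattern the lowercase, hyphenated tag is kept alongside the JSON array, which is why moving the trace to stderr, rather than changing the tag, leaves the runner's parsing untouched.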
Dispatched 3 extra runs of
Plus the auto-CI Gate from the push: https://github.com/erigontech/erigon/actions/runs/26598227996 (4 total attempts). If all four pass cleanly, the trace is now CI-safe, but the flake is rarer than the original failed run suggested and we may need more attempts. If any fire the receipt-hash mismatch, the diagnostic will be in the failing job's stderr (now visible in the GH log); search for
10 more dispatches fired on
Summary
Adds a new `ERIGON_TRACE_AUTH_REFUND` debug flag to diagnose flaky receipt-hash mismatches under parallel exec on EIP-7702 SET_CODE_TX block tests (e.g. the `test_pointer_resets_an_empty_code_account_with_storage[fork_Amsterdam]` flake on `eest-spec-blocktests-devnet`), and enables the flag (plus the existing `ERIGON_LOG_HASH_MISMATCH_REASON`) on that one shard via the workflow.

Original flaky CI run: erigontech/erigon#26571263592 job 78278757674 (PR #21485): `receiptHash mismatch: fea59da…40c66 != dd4f3a7d…64e8be5` on block 1.

When the flag is set, three new traces are emitted:
- Per-auth line (in `verifyAuthorities`): authority, auth.address, auth.nonce vs state-side nonce, codeEmpty, hasDelegation, exists, and the resulting `+sgna` / `+sgab` contributions (or the skip reason).
- Per-tx TOTAL line: the resulting `stateIgasRefund` total and verified-auth count.
- FINALIZE line (parallel apply loop): which incarnation's `TxResult` reached the receipt, plus its `GasUsed` / `CumulativeGasUsed`.

Sample output on the flaky test (local, parallel exec, passing run; tx 2 had to re-execute once):

The 183,600 delta between incarnations 0 and 1 is exactly `StateGasNewAccount`, so the trace cleanly identifies the per-auth refund divergence. The local pass means inc=1 won; the CI flake is when inc=0's receipt slips through instead.

Motivation
Local repro of the actual receipt-hash mismatch was not reachable (380+ macOS-arm64 iterations, 6× GOMAXPROCS settings, full Amsterdam shard on Docker linux/amd64 emulated under qemu, all green), so a code-level fix without a real divergence trace is guessing. The flake's signal on CI is just `receiptHash mismatch: <a> != <b>` with no per-tx breakdown; this PR is the minimum diagnostic surface needed to turn that into "incarnation X of tx Y had refund Z, expected W" on the next CI fire.

Workflow wiring
The second commit (`.github: enable …`) sets both `ERIGON_TRACE_AUTH_REFUND=true` and `ERIGON_LOG_HASH_MISMATCH_REASON=true` on the `eest-spec-blocktests-devnet` matrix entry only, so the next time that specific shard flakes, the run's logs already carry the diagnostic and no manual re-trigger is needed. Scoped to one shard because `TRACE_AUTH_REFUND` emits a line per EIP-7702 auth across all txs; spreading it everywhere would add log volume to runs that don't need it.

What stays free when the flag is off
- `if dbg.TraceAuthRefund` (which is a package-level `bool` initialised once at process start)

The flag follows the existing `TraceXxx` / `LogHashMismatchReason` env-flag pattern in `common/dbg`. Per the established convention for those flags (`TraceTransactionIO`, `TraceDomainIO`, `TraceGas`, etc.), no dedicated unit test is added; the value is only realized when the flag is set in a CI re-run.

How to use manually
Set `ERIGON_TRACE_AUTH_REFUND=true` (and optionally `ERIGON_LOG_HASH_MISMATCH_REASON=true`) when running any blocktest that might surface receipt-hash mismatches on EIP-7702 SET_CODE_TX paths. The CI shard now does this automatically.

Test plan
- `make lint` clean (0 issues, run multiple times due to non-determinism)
- `go test -short ./common/dbg/ ./execution/state/ ./execution/protocol/` green
- `go test -short ./execution/stagedsync/...` green
- `yq`; new step is correctly scoped via `if: matrix.shard == 'blocktests-devnet'`

🤖 Generated with Claude Code