
execution: trace EIP-7702 state-gas refund per auth and incarnation #21490

Draft
yperbasis wants to merge 6 commits into main from yperbasis/trace-auth-refund

@yperbasis
Member

@yperbasis yperbasis commented May 28, 2026

Summary

Adds a new ERIGON_TRACE_AUTH_REFUND debug flag to diagnose flaky receipt-hash mismatches under parallel exec on EIP-7702 SET_CODE_TX block tests (e.g. the test_pointer_resets_an_empty_code_account_with_storage[fork_Amsterdam] flake on eest-spec-blocktests-devnet), and enables the flag (plus the existing ERIGON_LOG_HASH_MISMATCH_REASON) on that one shard via the workflow.

Original flaky CI run: erigontech/erigon#26571263592 job 78278757674 (PR #21485), which failed with receiptHash mismatch: fea59da…40c66 != dd4f3a7d…64e8be5 on block 1.

When the flag is set, three new traces are emitted:

  1. Per-auth, per-incarnation (from verifyAuthorities): authority, auth.address, auth.nonce vs state-side nonce, codeEmpty, hasDelegation, exists, and the resulting +sgna / +sgab contributions (or the skip reason).
  2. Per-tx total: stateIgasRefund total and verified-auth count.
  3. Apply-loop FINALIZE: which incarnation's TxResult reached the receipt, plus its GasUsed / CumulativeGasUsed.

Sample output on the flaky test (local, parallel exec, passing run — tx 2 had to re-execute once):

[auth-refund] block=1 txIdx=2 inc=0 i=0 authority=c0f6... auth.addr=0000... auth.nonce=1 stateNonce=1 codeEmpty=true  hasDelegation=false exists=false +sgna=0      +sgab=35190
[auth-refund] block=1 txIdx=2 inc=0 i=1 authority=1ad9... auth.addr=0000... auth.nonce=4 stateNonce=4 codeEmpty=false hasDelegation=true  exists=true  +sgna=183600 +sgab=35190
[auth-refund] block=1 txIdx=2 inc=0 TOTAL refund=253980 verified=2 auths=2

[auth-refund] block=1 txIdx=2 inc=1 i=0 authority=c0f6... auth.addr=0000... auth.nonce=1 stateNonce=1 codeEmpty=false hasDelegation=true  exists=true  +sgna=183600 +sgab=35190
[auth-refund] block=1 txIdx=2 inc=1 i=1 authority=1ad9... auth.addr=0000... auth.nonce=4 stateNonce=4 codeEmpty=false hasDelegation=true  exists=true  +sgna=183600 +sgab=35190
[auth-refund] block=1 txIdx=2 inc=1 TOTAL refund=437580 verified=2 auths=2

[auth-refund] block=1 txIdx=2 FINALIZE resultInc=1 receiptGas=36000 cumGas=758684

The 183,600 delta between incarnations 0 and 1 is exactly StateGasNewAccount — the trace cleanly identifies the per-auth refund divergence. The local pass means inc=1 won; the CI flake is when inc=0's receipt slips through instead.
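
The arithmetic behind that delta can be checked mechanically. The following is a minimal sketch, not part of this PR: it assumes the trace keeps the whitespace-separated `key=value` layout shown in the sample above, and `sumPerIncarnation` is a hypothetical helper name. It adds up the `+sgna` and `+sgab` fields of each per-auth line, grouped by `inc`.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sampleTrace is the sample output quoted in the PR description (block 1, tx 2).
const sampleTrace = `[auth-refund] block=1 txIdx=2 inc=0 i=0 authority=c0f6... auth.addr=0000... auth.nonce=1 stateNonce=1 codeEmpty=true  hasDelegation=false exists=false +sgna=0      +sgab=35190
[auth-refund] block=1 txIdx=2 inc=0 i=1 authority=1ad9... auth.addr=0000... auth.nonce=4 stateNonce=4 codeEmpty=false hasDelegation=true  exists=true  +sgna=183600 +sgab=35190
[auth-refund] block=1 txIdx=2 inc=0 TOTAL refund=253980 verified=2 auths=2
[auth-refund] block=1 txIdx=2 inc=1 i=0 authority=c0f6... auth.addr=0000... auth.nonce=1 stateNonce=1 codeEmpty=false hasDelegation=true  exists=true  +sgna=183600 +sgab=35190
[auth-refund] block=1 txIdx=2 inc=1 i=1 authority=1ad9... auth.addr=0000... auth.nonce=4 stateNonce=4 codeEmpty=false hasDelegation=true  exists=true  +sgna=183600 +sgab=35190
[auth-refund] block=1 txIdx=2 inc=1 TOTAL refund=437580 verified=2 auths=2
[auth-refund] block=1 txIdx=2 FINALIZE resultInc=1 receiptGas=36000 cumGas=758684`

// sumPerIncarnation (hypothetical helper) sums the +sgna and +sgab fields of
// every per-auth line, keyed by the inc= value. TOTAL and FINALIZE lines carry
// neither field, so they contribute nothing.
func sumPerIncarnation(log string) (map[int]uint64, error) {
	sums := make(map[int]uint64)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 || fields[0] != "[auth-refund]" {
			continue
		}
		inc, found := -1, false
		var refund uint64
		for _, f := range fields[1:] {
			key, val, ok := strings.Cut(f, "=")
			if !ok {
				continue
			}
			switch key {
			case "inc":
				n, err := strconv.Atoi(val)
				if err != nil {
					return nil, fmt.Errorf("bad inc %q: %w", val, err)
				}
				inc = n
			case "+sgna", "+sgab":
				n, err := strconv.ParseUint(val, 10, 64)
				if err != nil {
					return nil, fmt.Errorf("bad %s %q: %w", key, val, err)
				}
				refund += n
				found = true
			}
		}
		if inc >= 0 && found {
			sums[inc] += refund
		}
	}
	return sums, sc.Err()
}

func main() {
	sums, err := sumPerIncarnation(sampleTrace)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("inc=0 refund:", sums[0])
	fmt.Println("inc=1 refund:", sums[1])
	fmt.Println("delta:", sums[1]-sums[0])
}
```

On the sample above the per-auth sums come to 253980 for inc=0 and 437580 for inc=1, matching the two TOTAL lines, and their difference is 183600.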

Motivation

The actual receipt-hash mismatch could not be reproduced locally: 380+ macOS-arm64 iterations, 6× GOMAXPROCS settings, and the full Amsterdam shard on Docker linux/amd64 emulated under qemu were all green. A code-level fix without a real divergence trace would therefore be guesswork. The flake's only signal on CI is receiptHash mismatch: <a> != <b> with no per-tx breakdown; this PR is the minimum diagnostic surface needed to turn that into "incarnation X of tx Y had refund Z, expected W" the next time the flake fires on CI.

Workflow wiring

The second commit (.github: enable …) sets both ERIGON_TRACE_AUTH_REFUND=true and ERIGON_LOG_HASH_MISMATCH_REASON=true on the eest-spec-blocktests-devnet matrix entry only — so the next time that specific shard flakes, the run's logs already carry the diagnostic, no manual re-trigger needed. Scoped to one shard because TRACE_AUTH_REFUND emits a line per EIP-7702 auth across all txs; spreading it everywhere would add log volume to runs that don't need it.

What stays free when the flag is off

  • No allocations
  • No branches in hot paths beyond a single if dbg.TraceAuthRefund (which is a package-level bool initialised once at process start)
  • No behavior change

The flag follows the existing TraceXxx / LogHashMismatchReason env-flag pattern in common/dbg. Per the established convention for those flags (TraceTransactionIO, TraceDomainIO, TraceGas, etc.), no dedicated unit test is added — the value is only realized when the flag is set in a CI re-run.

How to use manually

Set ERIGON_TRACE_AUTH_REFUND=true (and optionally ERIGON_LOG_HASH_MISMATCH_REASON=true) when running any blocktest that might surface receipt-hash mismatches on EIP-7702 SET_CODE_TX paths. The CI shard now does this automatically.

Test plan

  • make lint clean (0 issues, run multiple times due to non-determinism)
  • go test -short ./common/dbg/ ./execution/state/ ./execution/protocol/ green
  • go test -short ./execution/stagedsync/... green
  • Smoke test: flag on, blocktest produces the documented log lines and the test still passes (no behavior change)
  • Smoke test: flag off, no output produced (default state matches main)
  • Workflow YAML parses cleanly via yq; new step is correctly scoped via if: matrix.shard == 'blocktests-devnet'

🤖 Generated with Claude Code

…fund per auth and incarnation

Adds a new ERIGON_TRACE_AUTH_REFUND debug flag that, when set, logs:

  - one line per authorization processed in verifyAuthorities, with the
    (block, txIdx, incarnation), recovered authority, auth.address,
    auth.nonce vs state-side nonce, codeEmpty, hasDelegation, exists, and
    the per-auth +sgna / +sgab contributions (or the skip reason)
  - a TOTAL line per tx with the resulting stateIgasRefund
  - a FINALIZE line in the parallel apply loop with which incarnation's
    TxResult reached the receipt and its GasUsed / CumulativeGasUsed

Motivation: the eest-spec-blocktests-devnet shard's
test_pointer_resets_an_empty_code_account_with_storage[fork_Amsterdam]
flakes under ERIGON_EXEC3_PARALLEL=true. Local instrumentation confirms
the race shape — tx 2's first incarnation captures wrong stateIgasRefund
(missing StateGasNewAccount = 183_600) because pointer/sender appear
non-existent before tx 0/tx 1 flush their writes, while BAL gives that
incarnation a non-stale NoncePath read — but couldn't be triggered on
non-bare-metal hardware (380+ local iterations, full Amsterdam shard,
Docker linux/amd64 emulated all green). With the flag set on the
failing CI shard, the next flake will record exactly which incarnation
of the receipt slipped past the validator and what its per-auth refund
was, narrowing the residual investigation to ≤1 hour.

No behaviour change when the flag is unset. Adding the flag itself is
observability code, matching the existing TraceXxx / LogHashMismatchReason
pattern (no dedicated tests for any of those either, per established
project convention for env-gated debug flags).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ests-devnet

Sets ERIGON_TRACE_AUTH_REFUND=true and ERIGON_LOG_HASH_MISMATCH_REASON=true
for the blocktests-devnet shard only — the one that has flaked on
test_pointer_resets_an_empty_code_account_with_storage[fork_Amsterdam]
under parallel exec. The next mismatch fires the per-auth refund trace
and the full receipts-on-failure dump, turning the residual diagnosis
into a single log read.

Scoped to this one shard because TRACE_AUTH_REFUND emits a line per
EIP-7702 auth across all txs; spreading it across every shard would add
unnecessary log volume to runs that don't need it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Adds a targeted debug/trace facility to help diagnose flaky receipt-hash mismatches in parallel execution for EIP-7702 SET_CODE_TX scenarios (especially around Amsterdam state-gas refund differences between incarnations), and wires it into CI for the affected EEST shard.

Changes:

  • Introduces TRACE_AUTH_REFUND (ERIGON_TRACE_AUTH_REFUND) debug flag in common/dbg and emits per-authorization / per-tx refund traces from TxnExecutor.verifyAuthorities.
  • Emits a parallel-exec “FINALIZE” trace line indicating which incarnation’s TxResult produced the receipt used for the block result.
  • Updates the test-eest-spec GitHub Actions workflow to enable the new diagnostic flag (and ERIGON_LOG_HASH_MISMATCH_REASON) only for the blocktests-devnet shard.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| execution/stagedsync/exec3_parallel.go | Adds FINALIZE receipt/incarnation trace line when the auth-refund tracing flag is enabled. |
| execution/protocol/txn_executor.go | Adds per-auth and per-tx totals tracing for Amsterdam state-gas refund contributions in verifyAuthorities. |
| common/dbg/experiments.go | Registers the new TraceAuthRefund env flag. |
| .github/workflows/test-eest-spec.yml | Enables the new tracing flag (and hash mismatch diagnostics) only on the blocktests-devnet shard. |

Comment thread execution/protocol/txn_executor.go
info@weblogix.biz and others added 4 commits May 28, 2026 18:07
Splits the misleading single `verified=N` count in the per-tx TOTAL line
into two:

  - `recovered=N`: auths past signer recovery (= `len(verifiedAuthorities)`,
    which is appended right after step 2 and includes auths that later
    skip on the code-empty/delegation check or the nonce check, so their
    addresses still go into the BAL access list)
  - `applied=N`: auths that reached step 6 and contributed to the
    refund computation (or to the legacy pre-Amsterdam refund)

For the eest-spec-blocktests-devnet flake the trace is meant to triage,
`applied` is the load-bearing number — it directly answers "how many
auths actually paid into stateIgasRefund," which is the value that
diverges between incarnations. `recovered` is kept because the access-
list-side count is independently useful when reconciling against the
per-auth lines.
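
The difference between the two counters can be sketched as follows. This is a simplified model under stated assumptions, not the verifyAuthorities code: `authOutcome` and `countAuths` are hypothetical names, and each authorization is reduced to three pre-computed check results.

```go
package main

import "fmt"

// authOutcome (hypothetical type) records which checks one authorization
// passed. The real code evaluates these checks in sequence against state.
type authOutcome struct {
	SignerRecovered      bool // signature recovery succeeded
	CodeEmptyOrDelegated bool // authority has empty code or an existing delegation
	NonceMatches         bool // auth nonce equals the state-side nonce
}

// countAuths returns how many auths got past signer recovery (recovered) and
// how many of those also passed the later checks (applied).
func countAuths(outcomes []authOutcome) (recovered, applied int) {
	for _, o := range outcomes {
		if !o.SignerRecovered {
			continue
		}
		recovered++ // counted even if a later check skips this auth
		if !o.CodeEmptyOrDelegated || !o.NonceMatches {
			continue
		}
		applied++ // only these contribute to the refund computation
	}
	return recovered, applied
}

func main() {
	outcomes := []authOutcome{
		{SignerRecovered: true, CodeEmptyOrDelegated: true, NonceMatches: true},
		{SignerRecovered: true, CodeEmptyOrDelegated: true, NonceMatches: false},
		{SignerRecovered: false},
	}
	recovered, applied := countAuths(outcomes)
	fmt.Printf("recovered=%d applied=%d auths=%d\n", recovered, applied, len(outcomes))
}
```

In this model an auth that fails the nonce check still counts toward `recovered` but not toward `applied`, which is the gap a single `verified=N` figure hid.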

Flagged by Copilot review on PR #21490.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…FINALIZE trace

`txResult.Version().Incarnation` is dereferenced unconditionally on the
same `txResult` elsewhere in this function (e.g. resultIncarnation at
line ~2540, and the txVersion := res.Version() pattern at lines 2207 /
2276), so the `if rv.TxIndex >= 0 { ... } else { -1 sentinel }` guard
in the FINALIZE trace was inconsistent with the surrounding invariant
and a little misleading to a reader (suggests a real failure mode
worth handling, when none exists at this point in the flow).

Inline the Incarnation read; if TxIndex is ever observed negative
here, every nearby unconditional dereference would fault first and
that's the bug to fix, not a guard to add.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…jsonout

The eest-spec runner pipes cmd/evm's stdout into a temp file, strips
[UPPERCASE] log lines via grep, then jq-parses the rest as the JSON
test-results array. The previous fmt.Printf(...) trace lines went to
stdout — [auth-refund] is lowercase + hyphen so it survives the grep
filter and lands in the JSON payload, breaking jq with:

  jq: parse error: Invalid numeric literal at line 1, column 13

(column 13 is the closing ']' of '[auth-refund]'.) This made the
eest-spec-blocktests-devnet shard fail deterministically as soon as
the workflow turned the flag on — see CI Gate run 26596131682, job
78367787351 on this branch.

Switch every TraceAuthRefund Fprintf target to os.Stderr; this matches
where Erigon's normal log output already goes, which is exactly why
the [EROR]/[INFO] lines never broke the runner. No format change.
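
The failure mode can be reproduced in isolation. The sketch below is an approximation under assumptions: the runner's actual grep pattern is not quoted here, so `upperTag` is a stand-in that matches lines starting with an all-uppercase bracketed tag, and `stripLogLines` / `parsesAsJSON` are hypothetical helpers standing in for the grep and jq stages.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// upperTag is an ASSUMED approximation of the runner's log filter: it matches
// lines that begin with a tag such as [INFO] or [EROR].
var upperTag = regexp.MustCompile(`^\[[A-Z]+\]`)

// stripLogLines drops every line that starts with an uppercase tag and
// returns the remaining lines joined back together.
func stripLogLines(out string) string {
	var kept []string
	for _, line := range strings.Split(out, "\n") {
		if upperTag.MatchString(line) {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

// parsesAsJSON reports whether s decodes as a JSON array of objects.
func parsesAsJSON(s string) bool {
	var v []map[string]any
	return json.Unmarshal([]byte(s), &v) == nil
}

func main() {
	results := `[{"name":"t","pass":true}]`

	// An uppercase-tagged log line is filtered out, so the JSON survives.
	withInfo := "[INFO] starting\n" + results
	fmt.Println("with [INFO] line:", parsesAsJSON(stripLogLines(withInfo)))

	// A lowercase, hyphenated tag slips through the filter and breaks parsing.
	withTrace := "[auth-refund] block=1 txIdx=2 inc=0 TOTAL refund=253980\n" + results
	fmt.Println("with [auth-refund] line:", parsesAsJSON(stripLogLines(withTrace)))
}
```

Writing the trace to stderr sidesteps the filter entirely, since only stdout feeds the JSON stage.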

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@yperbasis
Member Author

Dispatched 3 extra runs of test-eest-spec.yml on this branch (yperbasis/trace-auth-refund @ 0506770512) to (a) verify the stderr-redirect fix unblocks the JSON pipeline on the blocktests-devnet shard, and (b) try to flush out the original receiptHash mismatch flake under the trace so we get a real per-incarnation refund dump:

Plus the auto-CI Gate from the push: https://github.com/erigontech/erigon/actions/runs/26598227996 (4 total attempts).

If all four pass cleanly, the trace is now CI-safe but the flake is rarer than the original failed run suggested and we may need more attempts. If any fire the receipt-hash mismatch, the diagnostic will be in the failing job's stderr (now visible in the GH log) — search for [auth-refund].

@yperbasis yperbasis marked this pull request as draft May 28, 2026 20:01