Skip to content

feat(tests): EIP-7778 admission gate uses pre-refund gas#2932

Draft
danceratopz wants to merge 1 commit into
ethereum:forks/amsterdamfrom
danceratopz:eip7778-gas-fix
Draft

feat(tests): EIP-7778 admission gate uses pre-refund gas#2932
danceratopz wants to merge 1 commit into
ethereum:forks/amsterdamfrom
danceratopz:eip7778-gas-fix

Conversation

@danceratopz
Copy link
Copy Markdown
Member

@danceratopz danceratopz commented May 28, 2026

🗒️ Description

Add test_extra_tx_admission_uses_pre_refund_gas: a 2-tx regression covering the case where a refund-bearing first tx is followed by a tx whose gas_limit exceeds its actual usage. The existing test_multi_transaction_gas_accounting parametrisations pin extra_tx.gas_limit to the intrinsic cost, so a post-refund admission gate is masked by the subsequent header.gas_used > gas_limit check. The added slack makes the bypass observable: a buggy implementation admits the trailing tx and computes a header.gas_used that no longer matches the fixture.

TODO revisit one 8037 hits forks/amsterdam

This captures the bug when 8037 is not enabled. Verified by disabling 8037 on top of NethermindEth/nethermind@82590cbb10 and rebuilding nethtest. I.e.,

diff --git a/src/Nethermind/Nethermind.Specs/Forks/25_Amsterdam.cs b/src/Nethermind/Nethermind.Specs/Forks/25_Amsterdam.cs
index 11fafa37ba..3e30278883 100644
--- a/src/Nethermind/Nethermind.Specs/Forks/25_Amsterdam.cs
+++ b/src/Nethermind/Nethermind.Specs/Forks/25_Amsterdam.cs
@@ -20,7 +20,6 @@ public class Amsterdam() : NamedReleaseSpec<Amsterdam>(BPO2.Instance)
         spec.IsEip7954Enabled = true;
         spec.MaxCodeSize = CodeSizeConstants.MaxCodeSizeEip7954;
         spec.IsEip8024Enabled = true;
-        spec.IsEip8037Enabled = true;
     }
 
     public static IReleaseSpec NoEip8037Instance { get; } = new Amsterdam { IsEip8037Enabled = false };

The behavior can be verified using nethtest and manually verifying that the expected exception is produced with/without the bug. However, when 8037 is enabled, it masks the bug and it appears not to be reproducible. Will investigate further once 8037.

image

fixtures.tar.gz

🔗 Related Issues or PRs

This coverage gap was discovered by:

✅ Checklist

  • All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    just static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry the list.
  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

Put a link to a cute animal picture inside the parenthesis-->

Add `test_extra_tx_admission_uses_pre_refund_gas`: a 2-tx regression
covering the case where a refund-bearing first tx is followed by a tx
whose `gas_limit` exceeds its actual usage. The existing
`test_multi_transaction_gas_accounting` parametrisations pin
`extra_tx.gas_limit` to the intrinsic cost, so a post-refund admission
gate is masked by the subsequent `header.gas_used > gas_limit` check.
The added slack makes the bypass observable: a buggy implementation
admits the trailing tx and computes a `header.gas_used` that no longer
matches the fixture.

Regression scenario from
NethermindEth/nethermind#11794.
@danceratopz danceratopz self-assigned this May 28, 2026
@danceratopz danceratopz added A-tests Area: Consensus tests. C-feat Category: an improvement or new feature labels May 28, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.43%. Comparing base (0b18525) to head (735c03e).
⚠️ Report is 1 commits behind head on forks/amsterdam.

Additional details and impacted files
@@               Coverage Diff                @@
##           forks/amsterdam    #2932   +/-   ##
================================================
  Coverage            90.43%   90.43%           
================================================
  Files                  535      535           
  Lines                32430    32430           
  Branches              3012     3012           
================================================
  Hits                 29329    29329           
  Misses                2573     2573           
  Partials               528      528           
Flag Coverage Δ
unittests 90.43% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@danceratopz
Copy link
Copy Markdown
Member Author

This a tricky one, so adding Claude's explanation for dummies 😆

Setup

Two transactions in one block.

Tx 1 — the "refund tx"

Hits a contract that performs 10 SSTORE clears (slots from 10).
Two things result:

  • It charges roughly P = 71,050 gas (the "pre-refund" amount — the
    work actually done).
  • It earns a refund of R gas (~12,000) because clearing storage
    gets a partial rebate.

The receipt for this tx shows P − R gas used (what the sender pays
for). But the block counts the full P, because EIP-7778 says
block-level accounting ignores refunds.

After tx 1 the block has two "running totals":

  • Pre-refund running total: P ← this is what header.gas_used will be.
  • Post-refund running total: P − R ← this is what receipts add up to.

Tx 2 — the "trailing tx"

Just calls a contract that runs STOP. Almost nothing happens — its
actual gas use is the intrinsic 21,000.

The trick: we give tx 2 a gas_limit of 42,000 (twice what it'll
actually use). That extra headroom is the whole point.

Block

block.gas_limit = P + 42,000 − 1 = 113,049. Just one gas short of
fitting tx 2's declared gas limit on top of tx 1's pre-refund cost.

What the two clients do

Step Correct client (pre-refund gate) Buggy client (post-refund gate)
At tx 2's admission check, what does "used so far" mean? P (= 71,050) P − R (= ~59,050)
Available room = gas_limit − used so far 41,999 ~53,999
Tx 2's declared gas (42,000) vs available 42,000 > 41,999reject 42,000 ≤ 53,999admit
What happens next? Block invalid — bad tx in it. Done. Executes tx 2. It uses only 21,000, so block's pre-refund total becomes 92,050.
End-of-block: compare computed header.gas_used to what's in the block header (71,050). (Never reached.) 92,050 ≠ 71,050block invalid, but for the wrong reason.

The point

Both clients reject the block — but for different reasons, at
different points in the pipeline.

  • Correct: "tx 2 doesn't fit, reject at admission."
  • Buggy: "I admitted tx 2 (oops), and now my totals don't match the
    header at the end."

The got 92,050, expected 71,050 mismatch the buggy binary prints is
the smoking gun: that extra 21,000 is exactly tx 2's STOP, which proves
it was admitted when it shouldn't have been.

The whole trick of the test is the slack on tx 2's gas_limit.
Without it, the buggy code's "wrong" admission still ends up overflowing
the block gas limit at the end, which masks the bug as a
different-but-similar error. With the slack, the buggy code admits and
stays under the block limit, so the only thing that catches it is the
header-vs-computed-gas-used comparison — a distinct, distinguishable
error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-tests Area: Consensus tests. C-feat Category: an improvement or new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants