feat(ci): overhaul fixture releases#2888
Conversation
abb005f to
8bd6b77
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## forks/amsterdam #2888 +/- ##
===================================================
+ Coverage 87.16% 90.44% +3.27%
===================================================
Files 586 535 -51
Lines 35791 32439 -3352
Branches 3364 3012 -352
===================================================
- Hits 31198 29338 -1860
+ Misses 3943 2573 -1370
+ Partials 650 528 -122
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
7eac7b3 to
d558bc7
Compare
|
This PR needs to be merged to fully test out the new workflow, but I did some smaller smoke tests on my fork:
To keep the workflow runnable in minutes on a fork (no Smoke test releases can be found here: https://github.com/spencer-tb/execution-specs/releases |
d558bc7 to
b2ac847
Compare
|
| evm-type: benchmark | ||
| fill-params: --fork=Osaka --generate-all-formats --gas-benchmark-values 100 ./tests/benchmark/compute | ||
| feature_only: true | ||
| fill-params: --fork=Amsterdam --gas-benchmark-values 1,10,30,60,100,150 ./tests/benchmark |
There was a problem hiding this comment.
We're using geth to fill the tests, but as we discussed, I don't think it has t8n tool support yet.
I did a partial review focusing mostly on benchmarking, and I don't see any other issues from my side.
There was a problem hiding this comment.
Geth should have t8n support after this was merged, curious if there any issues
There was a problem hiding this comment.
Geth should have t8n support after this was merged, curious if there any issues
Oh thanks, that fixes something I pointed out in ethereum/go-ethereum#34972 (comment) :)
|
|
This is great, thanks. Just one question about benchmark releases. Does it make sense for those to also have benchmark-consensus and benchmark-devnet variants? For example, recently ive been hitting a slight complication which ive had to work around - the benchmark fixtures dont have BALs in them so I couldnt use them locally for optimisation work on top of the devnet changes. I had to synthesise my own fixtures essentially. If we had benchmark-devnet variant for those I wouldve not had to do this workaround |
|
I support @taratorio 's idea if this does not complicate the workflow too much |
danceratopz
left a comment
There was a problem hiding this comment.
Thanks so much for this effort! Looking forward to getting all releases in execution-specs!
I feel a little bit weird about hijacking the MAJOR version for fork/devnet version (fork version is MINOR in EELS releases, so another slight inconsistency). We do have to hope PandaOps never launches feat-devnet-alphaone or similar to avoid invalid versions 😆
On face value, it seems to fit well. And i think one of your major motivations is to avoid hard-coding releases in the hive-tests runner config repos...
https://github.com/ethpandaops/hive-tests/blob/52cf2fcfa6bb698c11344994d8e223e69dd3a969/.github/workflows/hive-devnet-7.yaml#L161-L162
But it might be worth thinking through a little more. If we have two devnet versions (bal-devnet-3,bal-devnet-7) running at the same time, does it still simplify artifact download? I guess we never have hive runner configs for two devnets simultaneously?
As-is we could already do:
consume cache --input=bal-devnet-7@latest
and if we aim to deprecate consume cache then both versions are equally convenient with gh release download, I think?
Versioning scheme in this PR:
#!/usr/bin/env bash
TAG=$(gh release list --repo ethereum/execution-specs --limit 100 --json tagName \
--jq 'map(select(.tagName | startswith("tests-bal-devnet@v7."))) | .[0].tagName')
gh release download "$TAG" --repo ethereum/execution-specs --pattern '*.tar.gz'Versioning scheme currently:
#!/usr/bin/env bash
TAG=$(gh release list --repo ethereum/execution-specs --limit 100 --json tagName \
--jq 'map(select(.tagName | startswith("tests-bal-devnet-7@v"))) | .[0].tagName')
gh release download "$TAG" --repo ethereum/execution-specs --pattern '*.tar.gz'If this does not greatly help downstream convenience I would opt for explicit hard-coded release names here bal-devnet-7) and not add our own custom convention to versioning.
If we keep it, I wonder, at the risk of complicating the workflows, we should automate/hard-code the major version to avoid user error (see the docs corrections below) as suggested here. This would avoid duplicating this in the case of the devnet branch and put it close to the fill --until=<fork> config in the case of a fork branch. The comments on the invalid tag/version highlight that this is error prone.
Could you restructure the docs to keep all (or most of) the "Formats and Release Layout" section, but move it below the "Release Tracks" (or "Test Release Types" if we rename)?
I think removing blockchain_test_engine from the benchmark spec deserves its own PR as it gets a bit lost here.
| consensus: | ||
| evm-type: eels | ||
| fill-params: --until=BPO2 --generate-all-formats | ||
| fill-params: --until=BPO4 |
There was a problem hiding this comment.
Can we keep --generate-all-formats please? :)
| evm-type: benchmark | ||
| fill-params: --fork=Osaka --generate-all-formats --gas-benchmark-values 100 ./tests/benchmark/compute | ||
| feature_only: true | ||
| fill-params: --fork=Amsterdam --gas-benchmark-values 1,10,30,60,100,150 ./tests/benchmark |
There was a problem hiding this comment.
Should we append 200M here to reflect the minimum glammie target value from the interop? Or other values @LouisTsai-Csie
Can also be a follow-up!
There was a problem hiding this comment.
The fact that we configure a fill flag here is initially surprizing, a boolean supports-xdist instead of x-dist would show clearer intent. But probably not worth the overhead.
There was a problem hiding this comment.
Nit: Can we rename x-dist to xdist? I've never seen it hyphenated anywhere else )
| # Client aliases | ||
| benchmark: | ||
| impl: geth | ||
| repo: ethereum/go-ethereum | ||
| ref: master | ||
| evm-bin: evm | ||
| x-dist: auto | ||
| consensus: | ||
| impl: eels | ||
| repo: null | ||
| ref: null | ||
| evm-bin: null | ||
| x-dist: auto | ||
|
|
There was a problem hiding this comment.
For consensus this seems like an unnecessary redirct? I'd skip this obfuscation and just use eels, respectively geth in .github/configs/feature.yaml.
And, same for benchmark, assuming we only have one type of benchmark feature (as is the case in this PR - it removes benchmark_fast).
| | Example dispatch | Git tag | Release title | Artifact | | ||
| | ---------------- | ------- | ------------- | -------- | | ||
| | `feature=consensus version=v1.2.3` | `tests-consensus@v1.2.3` | `consensus@v1.2.3` | `fixtures_consensus.tar.gz` | | ||
| | `feature=bal-devnet version=v1.0.0 branch=bal-devnet-7` | `tests-bal-devnet@v1.0.0` | `bal-devnet@v1.0.0` | `fixtures_bal-devnet.tar.gz` | |
There was a problem hiding this comment.
Invalid version/tag I believe? It must be compatible with the target devnet version.
| | `feature=bal-devnet version=v1.0.0 branch=bal-devnet-7` | `tests-bal-devnet@v1.0.0` | `bal-devnet@v1.0.0` | `fixtures_bal-devnet.tar.gz` | | |
| | `feature=bal-devnet version=v7.0.0 branch=bal-devnet-7` | `tests-bal-devnet@v7.0.0` | `bal-devnet@v7.0.0` | `fixtures_bal-devnet.tar.gz` | |
| To cut a new release: | ||
|
|
||
| These releases are tagged using the format `<pre_release_name>@vX.Y.Z`. | ||
| 1. **Pick the next version** per the [Versioning Scheme](#versioning-scheme) for the track you're releasing on (e.g. the next consensus release after `consensus@v20.3.0` is `consensus@v20.3.1` for a tests-only bump, or `consensus@v20.4.0` for a behaviour change). |
There was a problem hiding this comment.
I think this could be less ambiguous:
- tests-only = new tests?
- behavior change = fixed tests/specs?
| ```bash | ||
| gh workflow run release_fixtures.yaml -f feature=consensus -f version=v20.3.1 | ||
| # devnet releases additionally require the branch to release from: | ||
| gh workflow run release_fixtures.yaml -f feature=bal-devnet -f version=v1.0.0 -f branch=bal-devnet-7 |
There was a problem hiding this comment.
Perhaps this should also be tested by the workflow that the branch "suffix" matches the target release major version?
| gh workflow run release_fixtures.yaml -f feature=bal-devnet -f version=v1.0.0 -f branch=bal-devnet-7 | |
| gh workflow run release_fixtures.yaml -f feature=bal-devnet -f version=v7.0.0 -f branch=bal-devnet-7 |
| version: | ||
| description: "Release version, e.g. v20.0.0" | ||
| required: true | ||
| type: string |
There was a problem hiding this comment.
How about making this Y.Z and:
- Devnet release type: Extracting
Xfromdevnets/bal/X(or in potentially in the futuredevnets-bal-X. - Consensus release type: Hard-coding in
.github/configs/feature.yaml(close to where--until=<fork>is defined). - Benchmark release type: Hard-code in
.github/configs/feature.yaml.
| env: | ||
| INPUT_FEATURE: ${{ inputs.feature }} | ||
| INPUT_VERSION: ${{ inputs.version }} | ||
| INPUT_BRANCH: ${{ inputs.branch }} |
There was a problem hiding this comment.
This is somewhat of an edge case, but it's valid and there's an easy fix that's probably worth adding.
From codex:
There's a potential race condition when INPUT_BRANCH is set, e.g. devnets/bal/7):
- setup, build (matrix), combine, release each call actions/checkout with:
ref: ${{ inputs.branch }}→ resolves the current branch tip independently at each checkout (lines 42, 97, 118, 174). - Release step:
TARGET="${INPUT_BRANCH:-$GITHUB_SHA}"→INPUT_BRANCH(a branch name string) →gh release create --target bal-devnet-7→ resolves the current branch tip again at tag creation time (line 221). - That's 5+ independent snapshots of the branch tip, any of which can disagree if the branch advances during the (potentially 24h =job timeout) build.
GITHUB_SHAdoesn't enter into this path at all — it's whatever the dispatch ref's tip was at dispatch time, which is typically a different ref (e.g.forks/amsterdam) and never gets consulted in this path.
No problem when INPUT_BRANCH is empty:
- Every actions/checkout is called with
ref: ${{ inputs.branch }}→ empty → checkout falls back toGITHUB_SHA. - Release step:
TARGET="${INPUT_BRANCH:-$GITHUB_SHA}"→GITHUB_SHA. GITHUB_SHAis immutable for the duration of the run.- Race-free. All jobs see the same SHA, tag lands on the same SHA. Done.
Suggested fix:
# in `setup` job, after checkout:
- name: Resolve target SHA
id: target_sha
run: echo "sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
Add target_sha to setup.outputs, then replace every ref: ${{ inputs.branch }} with ref: ${{ needs.setup.outputs.target_sha }} (in build, combine, release), and --target "$TARGET" with --target "${{ needs.setup.outputs.target_sha }}". Now there's exactly one tip resolution; everything downstream is SHA-pinned.
Hey @taratorio, thanks for asking and the feedback! Yes, this def makes sense in general and we can add these as required.
Were you filling these fixtures yourself? The latest release https://github.com/ethereum/execution-specs/releases/tag/tests-benchmark%40v0.0.9 is only filled for Osaka. There might have been a bit of flux here due to incompatible/incomplete t8n tools for EELS and geth (benchmark releases have been using geth), but as-is today 😆 on |
yes, I filled a few that I was interested in with erigon but not via t8n but my own quick and hacky way |
|
Just discussed with @spencer-tb and @LouisTsai-Csie, this is our suggestion going forward:
|
I mostly agree with this comment except for points 3 and 4, since I think the MAJOR should refer to the devnet/fork: My suggestion is as follows: For
For
I.e. if a client is targeting to join fork/devnet X, it should target MAJOR equal to X, must take the latest MINOR, and should ideally take the latest PATCH. With fork/devnet as MAJOR, the version number alone tells a client whether a release is relevant to them. Under the alternative (MAJOR tracks any spec change), you'd have to combine the version and the -N suffix in the release name to figure out whether a spec change targets your devnet or a different one. Devnet compatibility shouldn't require parsing two fields IMO. Concretely, the mindset just from looking at the version (ignoring the name) should be: I'm a client dev targeting devnet 7, and I was passing tests contained in On the topic of parallel maintenance of two different devnets, this scheme handles this naturally IMO. E.g. if we are maintaining devnets 3 & 7 for the same feature, releasing v3.0.1 after or alongside v7.0.0 is not a problem (think Python 2.x vs 3.x). It is a well-known Semver pattern, and Semver is good at this. We should simply make this rule clear and follow it so we are predictable. On On benchmarking, major and minor should mirror the feature they are targeting, while the patch moves freely at its own pace. |
EL Client Dev Reviewers
Summary
Large heads up on how fixture releases will work going forward, we would appreciate your feedback on the consensus/devnet features before this lands.
The aim here is to consolidate to two release tracks that you'll consume:
tests-consensus@vX.Y.Z) is the mainnet "must pass" set, all tests for all forks, up to the last mainnet fork, cut frequently (weekly) from the latest development branch. It replaces the oldstable/developsplit, and passing it means you won't break mainnet. Over time it'll absorb all ofethereum/legacy-tests, so the only tests you'll need will be from one repo: EELS.tests-<feat>-devnet@vX.Y.Z, e.g.tests-bal-devnet@v7.0.0) covers the fork/feature under active development, it fills all forks up to the development fork and is meant as an advisory/non-blocking gate on your devnet branch, passing it is the bar for inclusion in that devnet.Both will be downloadable via
gh release download <tag>orconsume cache --input=<feature>@<version|latest>. For consensus, X = fork number (e.g. BPO2 = 20), Y = a behaviour/spec change, Z = new tests only; for devnet X = devnet number.Below are some items we have discussed, and will not add or change:
Questions
A few specific things we'd like your feedback on:
EELS Reviewers
🗒️ Description
This PR outlines a detailed description and updates our test fixture release process and strategy following alignment with @danceratopz and other EL clients.
Key Changes
evm-impl.yamlis merged intoevm.yaml; each entry now carries itsimpl,repo,ref,evm-binandx-dist. Client aliases (consensus,benchmark) are added sofeature.yaml/workflows can reference a client by name, andbuild-evm-baseresolves which builder to run via theimplfield.consensus,devnetandbenchmarkfeatures (inforks/amsterdam).devnetis specifically chosen so we don't ever need to updatefeature.yamlfor future devnet features; thedevnetkeyword is used as a substring match on release tagging, and<feat>-devnetkeeps its friendly name.release_fixture_feature.yamlis replaced withrelease_fixtures.yaml:workflow_dispatchwith explicitfeature+versioninputs; the git tag istests-<feature>@<version>and the release title is<feature>@<version>.versionmust matchvX.Y.Z,featuremust be non-empty,*-devnetrequires abranch, and anevmoverride must be a key inevm.yaml.evm/evm_repo/evm_refinputs override the client impl and the t8n tool repo/ref for a one-off release.osakarange (no standalonebposplit).blockchain+blockchain_engine_xfixtures only (the defaultblockchain_enginefixture is dropped). The autobenchmark_fastpush artifact is removed frombenchmark.yaml.Tagging A Release
To tag releases we must now use the github cli. This has the benefit of only creating tags in EELS if the fixture building process is successful. No more tag deletion and recreation, only workflow triggers.
The following can be ran locally or optionally triggered with the github actions website UI.
Downloading A Release
Fixtures can be downloaded by 2 seperate methods.
gh release downloadfor the raw tarball, orconsume cacheif you want tag resolution (@latest), local caching, and--inputintegration with consume subcommands.Fixed Release Types
tests-consensus@vX.Y.ZIn the past (pre-Weld) we had the following release features:
stable&develop, wherestablewas a subset fordevelop. To converge on these 2 features we now define theconsensusfeature. This is the invariant to thebenchmarkfeature.The
consensusfeature acts as our mainnet set of tests. That is for nowfill --until BPO4, all tests for all forks until last mainnet fork. This will be released weekly on any change or addition to the consensus tests always from the latest development branch:forks/amsterdamcurrently. Clients will use this release on their main/master branches in CI, and eventually this release will contain all tests fromethereum/testsðereum/legacy-tests(allowing us to archive both of these repos). TLDR; one tag type to verify you will not break mainnet. Additionally this will be ran in our Hive CI (under what is currently labelled generic).Here we define a new semantic versioning type for our consensus test releases:
The first release of this type in EELS will be
tests-consensus@v20.0.0, to catch up from the last fork. For the fork under development (Amsterdam) the firstconsensusrelease will be tagged once all CFI'd EIPs are deemed successful in a devnet (purposely ambiguous here, things change in ethereum). Typically this will resolve to the last devnet before the first testnet; this first release can be viewed as the testnet release. For Amsterdam we will tagtests-consensus@v24.0.0likely afterglamsterdam-devnet-6is deemed successful. Spec changes can still occur and that is why we have Y in the new fixture versioning scheme.tests-<feat>-devnet@vX.Y.ZThis release type will follow on from the current devnet release process but more explicitly. Today
<feat>isbaland soon it will beglamsterdam.<feat>can essentially be any keyword but typically is the fork name or the headliner feature.The
devnetfeature is entirely for test releases during the fork development process for the upcoming fork. Here we still fill for all forks so clients can make sure they do not introduce any regressions, currentlyfill --until Amsterdam, all tests for all forks until the development fork. Clients will use this release on theirdevnetbranches in CI; they must pass all of these tests before being included in the devnet that the release is tagged for. This release will be ran in Hive CI under the same naming scheme as we do currently.For the
baldevnets today we use the tagtests-bal@vX.Y.Z. This PR will change the process totests-bal-devnet@vX.Y.Z. Asbal-devnet-7is the last of thebal's we will start the new tagging scheme forglamsterdam-devnet-5. The devnet releases will additionally follow a new versioning scheme:glamsterdam-devnet-5this would be 5,consensustest releases.Following the latter, the first
glamsterdam-devnet-5release will be tagged astests-glamsterdam-devnet@v5.0.0. All devnet releases must be tagged from a devnet branch in EELS, in this caseglamsterdam-devnet-5. Here we are specifically choosing to diverge from the EELS/branch naming scheme to align every repo under the same devnet name.tests-benchmark@vX.Y.Z/tests-zkevm@vX.Y.ZBenchmark/zkevm releases are self explanatory here, and released only when required.
Here we only choose to change the versioning scheme to align with that of the
consensusfeature:The
benchmark/zkevmfeature can be tagged from the relatedforks/amsterdamor the latestdevnetbranch in EELS. The nextbenchmark/zkevmrelease shall be tagged astests-benchmark@v24.0.0/tests-zkevm@v24.0.0, cc @LouisTsai-Csie / @jsign.🔗 Related Issues or PRs
✅ Checklist
just statictype(scope):.Cute Animal Picture