
[WIP] Start phase1 benchmarks#3081

Draft
alpe wants to merge 1 commit into alex/bench_e2e_start from alex/bench_phase1_start

Conversation


@alpe commented Feb 18, 2026

Overview

@github-actions

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build      Format      Lint       Breaking   Updated (UTC)
✅ passed  ⏩ skipped  ✅ passed  ✅ passed  Feb 18, 2026, 2:29 PM


claude bot commented Feb 18, 2026

Claude finished @alpe's task in 1m 9s


Code Review

  • Gather context and read changed files
  • Review benchmark test file
  • Review workflow file
  • Post final review

Overall Assessment

A solid start for phase1 benchmarks. The BenchmarkProduceBlock test is well-structured with proper use of b.Helper(), b.ReportAllocs(), and b.ResetTimer(). A few issues worth addressing before merging.


block/internal/executing/executor_benchmark_test.go

Issues

  1. BatchData duplicates Batch.Transactions — GetNextBatchResponse is returned with both Batch.Transactions and BatchData set to the same txs slice (lines 85–88). This appears redundant and may skew benchmark results by inflating the apparent work per iteration. Confirm whether both fields are intentionally populated by the sequencer in production.

  2. exec.ctx / exec.cancel fields are set directly (lines 80–81) — This accesses unexported struct fields from within the same package (package executing), which is acceptable here, but it introduces coupling to the internal lifecycle of Executor. The benchmark bypasses Run() / Start() lifecycle methods. This is fine for microbenchmarks, but worth a brief comment explaining that the executor is being driven manually to avoid goroutine overhead.

  3. StartTime is set to the past (line 53) — time.Now().Add(-time.Hour) is intentional to ensure the genesis block is "old enough," but there's no comment explaining why. Add one to aid future readers.

  4. newBenchExecutor only creates a fresh executor per benchmark subcase, not per b.N iteration — For most benchmarks this is fine, but since ProduceBlock is stateful (it increments block height and writes to the store), the store accumulates state across iterations. This is by design in standard Go benchmarks (setup is amortized), but it means later iterations may behave differently (e.g., store lookups grow). Not necessarily a bug, but worth a comment documenting this deliberate choice.

  5. Missing larger-payload test cases — The two specs ("empty batch" and "single tx") are minimal. To identify scaling behavior, consider adding a case with many small transactions (e.g., 100×32-byte txs) and one with large payloads.

Minor / Nits

  • cfg.Node.MaxPendingHeadersAndData = 100000 — Very large value set to prevent back-pressure. A short comment explaining the intent would help.
  • The map iteration order for specs is non-deterministic. Go's b.Run subtests handle this fine, but a slice of named structs preserves a deterministic run order, which makes output diffs easier to read when comparing benchmark runs.

.github/workflows/benchmark.yml

Issues

  1. Both benchmark steps share a single job — evm-benchmark runs the EVM benchmarks and the Block Executor benchmarks in the same job. If the EVM benchmark step fails (e.g., a build failure), the Block Executor benchmark never runs. Consider splitting into separate jobs, or at minimum using continue-on-error: true per step if isolation is desired.

  2. block_executor_output.txt is written to the repo root — The go test output is redirected to block_executor_output.txt (line 47) which resolves relative to the runner's working directory (repo root), while the EVM benchmark output goes into test/e2e/output.txt. These should be co-located or use a consistent naming scheme under a dedicated benchmark-results/ path.

  3. go test output includes timing metadata that may vary — The > block_executor_output.txt redirect captures full go test output including build logs. If any === RUN lines appear (e.g., from test setup), benchmark-action may fail to parse. Confirm this is clean with -run='^$' (it should be, just verify locally).

  4. No -count= flag on the Block Executor benchmark — The EVM benchmark doesn't use one either, but for stateful benchmarks like ProduceBlock, running with -count=3 would give more statistically stable results and help benchmark-action detect regressions more reliably.

  5. permissions: {} at top level, but contents: write at job level — This is correct practice (deny-by-default, grant per job). Good.

  6. auto-push: true commits benchmark results to the repo on every push to main — Confirm a dedicated gh-pages branch or similar is configured as the data store for benchmark-action. If not configured, it will push to the current branch, which is undesirable.


Summary

Area                     Status
Go benchmark structure   Good
Mock setup               Good
Test case coverage       Needs expansion
CI workflow isolation    Improve
Output file placement    Minor inconsistency
Documentation/comments   Add a few clarifying comments

@alpe changed the base branch from main to alex/bench_e2e_start on February 18, 2026, 14:29
@gemini-code-assist

Summary of Changes

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request initiates the first phase of performance benchmarking for EVM contract operations within the ev-node system. It establishes a framework for measuring transaction roundtrip latency and integrates OpenTelemetry for granular performance analysis. The changes also involve making existing test utilities more versatile for use in both testing and benchmarking contexts, alongside necessary dependency updates.

Highlights

  • New EVM Contract Benchmark: Introduced a new benchmark, BenchmarkEvmContractRoundtrip, to measure the store-retrieve latency of EVM contracts against a real Reth node, providing performance insights into contract interactions.
  • OpenTelemetry Tracing Integration: Integrated OpenTelemetry tracing into the new benchmark to collect detailed performance data, allowing for a hierarchical timing report of ev-node components like Engine API calls, executor, and sequencer.
  • In-process OTLP/HTTP Collector: Implemented a lightweight, in-process OTLP/HTTP receiver (otlpCollector) to efficiently collect trace spans from ev-node during benchmark execution.
  • Refactored Test Helpers for Benchmarks: Updated numerous test helper functions across execution/evm/test, test/e2e, and test/e2e/sut_helper.go to accept testing.TB instead of *testing.T, making them reusable for both tests and benchmarks.
  • Dependency Updates: Adjusted go.mod files to explicitly include go.uber.org/zap and go.opentelemetry.io/proto/otlp as direct dependencies, supporting new logging and tracing functionalities.


Changelog
  • block/internal/executing/executor_benchmark_test.go
    • Added newBenchExecutor function to create a mocked Executor for benchmarking.
    • Added BenchmarkProduceBlock function to benchmark block production with varying transaction batches.
  • execution/evm/test/go.mod
    • Added go.uber.org/zap as a direct dependency.
    • Removed go.uber.org/zap from indirect dependencies.
  • execution/evm/test/test_helpers.go
    • Imported go.uber.org/zap and go.uber.org/zap/zaptest.
    • Updated getTestScopedDockerSetup to accept testing.TB.
    • Updated SetupTestRethNode to accept testing.TB and integrated zap logger for Reth node setup.
    • Updated waitForRethContainer to accept testing.TB.
  • execution/evm/test_helpers.go
    • Updated GetRandomTransaction to accept testing.TB.
  • test/e2e/evm_contract_bench_test.go
    • Added new file for BenchmarkEvmContractRoundtrip to measure contract store/retrieve latency.
    • Implemented otlpCollector for in-process OpenTelemetry trace collection.
    • Added printCollectedTraceReport to analyze and log trace breakdowns.
    • Added waitForReceipt helper function.
  • test/e2e/evm_contract_e2e_test.go
    • Updated setupTestSequencer to accept testing.TB and extraArgs.
    • Updated deployContract to accept testing.TB.
  • test/e2e/evm_test_common.go
    • Updated mustGetAvailablePort to accept testing.TB.
    • Updated createPassphraseFile to accept testing.TB.
    • Updated createJWTSecretFile to accept testing.TB.
    • Updated getNodeP2PAddress to accept testing.TB.
    • Updated setupSequencerNode to accept testing.TB and extraArgs.
    • Updated setupSequencerNodeLazy to accept testing.TB.
    • Updated setupFullNode to accept testing.TB.
    • Updated submitTransactionAndGetBlockNumber to accept testing.TB.
    • Updated setupCommonEVMTest to accept testing.TB.
    • Updated checkBlockInfoAt to accept testing.TB.
    • Updated setupSequencerOnlyTest to accept testing.TB and extraArgs.
    • Updated restartDAAndSequencer to accept testing.TB.
    • Updated restartDAAndSequencerLazy to accept testing.TB.
    • Updated restartSequencerNode to accept testing.TB.
    • Updated verifyNoBlockProduction to accept testing.TB.
  • test/e2e/go.mod
    • Added go.opentelemetry.io/proto/otlp as a direct dependency.
    • Removed go.opentelemetry.io/proto/otlp from indirect dependencies.
  • test/e2e/sut_helper.go
    • Updated SystemUnderTest struct field t type to testing.TB.
    • Updated NewSystemUnderTest constructor to accept testing.TB.
    • Updated AwaitNodeUp to accept testing.TB.
    • Updated AwaitNodeLive to accept testing.TB.
    • Updated AwaitNBlocks to accept testing.TB.
    • Updated MustCopyFile to accept testing.TB.
    • Updated NodeID to accept testing.TB.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/benchmark.yml
Activity
  • The pull request is currently marked as [WIP], indicating ongoing development.
  • The description body is empty, suggesting that detailed context and rationale are yet to be provided by the author.

@codecov

codecov bot commented Feb 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.13%. Comparing base (b3b0881) to head (7150135).

Additional details and impacted files
@@                   Coverage Diff                    @@
##           alex/bench_e2e_start    #3081      +/-   ##
========================================================
- Coverage                 61.14%   61.13%   -0.02%     
========================================================
  Files                       113      113              
  Lines                     11444    11444              
========================================================
- Hits                       6998     6996       -2     
- Misses                     3657     3658       +1     
- Partials                    789      790       +1     
Flag      Coverage Δ
combined  61.13% <ø> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown.


@gemini-code-assist bot left a comment

Code Review

This pull request introduces benchmarks for the ProduceBlock function. The overall structure is good, but there are a couple of areas for improvement to ensure the benchmarks are robust and deterministic. I've pointed out that using time.Now() can lead to non-deterministic tests and should be replaced with a fixed time. More critically, the mock setup for GetNextBatch uses a static timestamp, which could cause benchmark failures if timestamp monotonicity is required for blocks. I've suggested a fix to generate an increasing timestamp for each block produced in the benchmark loop.

Comment on lines +84 to +89
mockSeq.EXPECT().GetNextBatch(mock.Anything, mock.AnythingOfType("sequencer.GetNextBatchRequest")).
	Return(&coreseq.GetNextBatchResponse{
		Batch:     &coreseq.Batch{Transactions: txs},
		Timestamp: time.Now(),
		BatchData: txs,
	}, nil)

Severity: high

The mock for GetNextBatch is configured to return a response with Timestamp: time.Now(). This timestamp is evaluated only once when the mock expectation is set up. Consequently, every call to ProduceBlock within the benchmark loop will receive the same timestamp. If there is a requirement for block timestamps to be monotonically increasing, this will cause ProduceBlock to fail after the first iteration.

To fix this, use RunAndReturn to generate a new, increasing timestamp for each call. This ensures the benchmark correctly simulates the production of a sequence of valid blocks.

Suggested change

-	mockSeq.EXPECT().GetNextBatch(mock.Anything, mock.AnythingOfType("sequencer.GetNextBatchRequest")).
-		Return(&coreseq.GetNextBatchResponse{
-			Batch:     &coreseq.Batch{Transactions: txs},
-			Timestamp: time.Now(),
-			BatchData: txs,
-		}, nil)
+	var blockNum uint64
+	mockSeq.EXPECT().GetNextBatch(mock.Anything, mock.AnythingOfType("sequencer.GetNextBatchRequest")).
+		RunAndReturn(func(_ context.Context, _ coreseq.GetNextBatchRequest) (*coreseq.GetNextBatchResponse, error) {
+			blockNum++
+			timestamp := gen.StartTime.Add(time.Duration(blockNum) * cfg.Node.BlockTime.Duration)
+			return &coreseq.GetNextBatchResponse{
+				Batch:     &coreseq.Batch{Transactions: txs},
+				Timestamp: timestamp,
+				BatchData: txs,
+			}, nil
+		})

Comment on the genesis setup:

gen := genesis.Genesis{
	ChainID:       "bench-chain",
	InitialHeight: 1,
	StartTime:     time.Now().Add(-time.Hour),

Severity: medium

Using time.Now() in tests can lead to non-deterministic behavior. It's better to use a fixed timestamp to ensure the benchmark is repeatable and not affected by the time it is run.

Suggested change

-	StartTime: time.Now().Add(-time.Hour),
+	StartTime: time.Date(2023, 1, 1, 12, 0, 0, 0, time.UTC),
