Skip to content

Conversation

@julienrbrt
Copy link
Member

@julienrbrt julienrbrt commented Jan 16, 2026

Attempt to embed geth in evnode.
90% AI.

friday fun project. not meant to be merged at any time.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 16, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed⏩ skipped✅ passed✅ passedJan 16, 2026, 11:51 PM

@claude
Copy link
Contributor

claude bot commented Jan 16, 2026

Claude finished @julienrbrt's task in 2m 22s —— View job


Code Review: In-Process Geth Integration

I've completed a comprehensive review of this PR, which introduces in-process Geth execution capabilities. This is an ambitious feature that simplifies deployment by embedding Geth directly in the node. While the implementation demonstrates solid understanding of both Geth internals and the Engine API, there are several critical issues that need attention before this can be considered production-ready.


Critical Issues (Must Fix)

1. Datastore Performance Bottlenecks 🔴 (HIGH PRIORITY)

File: execution/evm/engine_geth_db.go

The datastore wrapper has serious performance issues that will impact production use:

DeleteRange O(N) complexity (lines 49-71):

func (w *wrapper) DeleteRange(start []byte, end []byte) error {
    // Query all keys and delete those in range
    q := query.Query{KeysOnly: true}
    results, err := w.ds.Query(context.Background(), q)
    // ...iterates through ALL keys in the datastore

This queries all keys in the datastore and filters in memory. With millions of keys, this will be extremely slow. Geth uses range deletions heavily during state pruning and ancient data cleanup.

Iterator inefficiency (lines 298-333):
The iterator filters keys by comparing to start after retrieving them, meaning it may iterate over many keys before the start key. A proper iterator should seek directly to the start position.

Impact: These will cause severe performance degradation during:

  • State pruning (DeleteRange calls)
  • Blockchain reorganizations
  • Historical data queries
  • Block sync operations

Recommendation: The go-datastore interface may not be suitable for this use case. Consider:

  1. Using a datastore implementation with native range operation support (e.g., pebbledb, badger)
  2. Adding range query capabilities to the datastore interface
  3. Documenting these limitations clearly if keeping the current implementation

Fix datastore performance →

2. Missing Error Handling in Flag Parsing 🔴

File: apps/evm/cmd/run.go:63,75,275,290

useGeth, _ := cmd.Flags().GetBool(evm.FlagEVMInProcessGeth)  // Line 63
genesisPath, _ := cmd.Flags().GetString(evm.FlagEVMGenesisPath)  // Line 275
rpcAddress, _ := cmd.Flags().GetString(evm.FlagEVMRPCAddress)  // Line 290

Errors from flag parsing are silently ignored. If flags are misconfigured (typo, wrong type), the program will proceed with default/zero values, causing confusing runtime failures.

Impact:

  • useGeth defaulting to false means user thinks they're using in-process Geth but actually gets RPC client
  • Empty genesisPath causes immediate panic on file read
  • Wrong rpcAddress means RPC server silently doesn't start

Fix:

useGeth, err := cmd.Flags().GetBool(evm.FlagEVMInProcessGeth)
if err != nil {
    return nil, fmt.Errorf("failed to get '%s' flag: %w", evm.FlagEVMInProcessGeth, err)
}

Fix flag error handling →

3. Transaction Pool Price Limit Set to 0 ⚠️ (MEDIUM PRIORITY)

File: execution/evm/engine_geth.go:215

txPool, err := txpool.New(0, blockchain, []txpool.SubPool{legacyPool})

The priceLimit of 0 disables minimum gas price enforcement, allowing zero-price transactions to flood the mempool.

Impact:

  • Mempool spam with zero-fee transactions
  • Resource exhaustion (memory, CPU for tx validation)
  • No economic spam protection

Consideration: This may be intentional for a rollup where gas pricing is handled differently (e.g., fees collected on L1, or free transactions for testing). However:

  1. It should be documented why this is safe
  2. Consider making it configurable via flag
  3. Add DoS protections if keeping zero price (rate limiting, IP-based, etc.)

Make price limit configurable →


Security Issues

4. Unrestricted CORS Headers ⚠️

File: execution/evm/engine_geth_rpc.go:133-138

w.Header().Set("Access-Control-Allow-Origin", "*")
w.Header().Set("Access-Control-Allow-Methods", "POST, GET, OPTIONS")
w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization")

Wildcard CORS allows any website to make requests to the RPC server, enabling cross-site attacks.

Impact:

  • Malicious websites can call RPC methods from user browsers
  • Potential for CSRF attacks if methods have state-changing effects
  • Exposure of potentially sensitive data (balances, transactions)

Recommendation:

  • Make CORS origins configurable via flag
  • Default to restrictive origins (localhost only)
  • Document security implications clearly

Fix CORS configuration →

5. Timeout Configuration Too Low ⚠️

File: execution/evm/engine_geth_rpc.go:119-120

ReadTimeout:  30 * time.Second,
WriteTimeout: 30 * time.Second,

30-second timeouts may be too short for:

  • Large eth_getLogs queries
  • eth_call on complex contracts
  • Block sync operations
  • Historical data queries

Recommendation: Make timeouts configurable or increase to 60s+


Architecture & Code Quality

6. Subsecond Block Support Incomplete ℹ️

File: execution/evm/engine_geth.go:491-494

// Validate timestamp allow equal time to allow sub seconds blocks
if ps.timestamp < parent.Time() {
    return nil, fmt.Errorf("invalid timestamp: %d must be greater than parent timestamp %d", ps.timestamp, parent.Time())
}

Comment says "allow equal time to allow sub seconds blocks", but the validation uses < not <=, which does allow equal timestamps. However:

  • Geth's timestamp is uint64 (seconds precision)
  • Two blocks with the same second-level timestamp but different sub-second timestamps will both have the same ps.timestamp
  • This creates ambiguity and potential hash collisions

Questions:

  • Is subsecond block production actually supported?
  • If yes, how is subsecond precision stored/handled?
  • Related to commit "allow subseconds blocks (does not work yet)"

Recommendation: Either implement subsecond support properly (custom time encoding) or enforce strictly increasing second-level timestamps.

7. Mutex Usage in Batch Operations ℹ️

File: execution/evm/engine_geth_db.go

The batchWrapper uses a mutex for all operations:

func (b *batchWrapper) Put(key []byte, value []byte) error {
    b.mu.Lock()
    defer b.mu.Unlock()
    // ...
}

Observation: Batches are typically used in single-threaded context (batch.Put(), batch.Write()). The mutex may be unnecessary and adds overhead. If concurrent access is possible, this needs documentation.

8. ContractAddress Calculation Incorrect 🔴

File: execution/evm/engine_geth.go:988-990

// Set contract address if this was a contract creation
if msg.To == nil {
    receipt.ContractAddress = evmInstance.Origin
}

Contract addresses are not the transaction origin (From address). They're computed as:

receipt.ContractAddress = crypto.CreateAddress(msg.From, tx.Nonce())

Impact: Contract creation receipts will have wrong contractAddress, breaking:

  • Event indexers
  • Block explorers
  • Smart contract deployment tools
  • Contract verification services

Fix contract address calculation →

9. Receipt Logs Parameters Incorrect ⚠️

File: execution/evm/engine_geth.go:974

Logs: stateDB.GetLogs(tx.Hash(), header.Number.Uint64(), common.Hash{}, header.Number.Uint64()),

The GetLogs parameters look suspicious:

  • blockHash parameter is common.Hash{} (zero hash) - should be block hash
  • blockNumber is duplicated - one appears to be wrong parameter

Impact: Logs may have incorrect BlockHash field, breaking log queries.


Testing & Reliability

10. Insufficient Test Coverage ℹ️

File: execution/evm/engine_geth_test.go

Tests cover basic happy paths but miss critical error scenarios:

Missing tests:

  • Transaction execution failures (out of gas, revert)
  • Invalid block scenarios (bad timestamp, nonce, hash mismatch)
  • Concurrent block production
  • Database errors during execution
  • Payload building with large transaction sets
  • State pruning and compaction
  • Recovery from crash during payload build
  • Iterator edge cases (empty datastore, single entry, start > all keys)

Recommendation: Add integration tests that:

  • Test against real Geth contracts (ERC20, Uniswap, etc.)
  • Stress test with high transaction volume
  • Simulate crash scenarios
  • Test datastore wrapper with various backends

11. No Resource Cleanup on Error Paths ⚠️

File: execution/evm/engine_geth.go:107-118

backend, err := newGethBackend(genesis, db, logger)
if err != nil {
    return nil, fmt.Errorf("failed to create geth backend: %w", err)
}

// Start RPC server if address is provided
if rpcAddress != "" {
    if err := backend.StartRPCServer(rpcAddress); err != nil {
        backend.Close()  // ✅ Good!
        return nil, fmt.Errorf("failed to start RPC server: %w", err)
    }
}

Good cleanup here, but check other error paths in newGethBackend and StartRPCServer for resource leaks (goroutines, file handles, network listeners).


Best Practices & Style

12. Magic Numbers Should Be Constants ℹ️

File: execution/evm/engine_geth_db.go

if len(payloadIDBytes) != 8 {  // Why 8?

Document why payloadID is 8 bytes (matches engine.PayloadID definition).

File: execution/evm/engine_geth_rpc.go:635

var gas uint64 = 50000000  // Why 50M?

Document why 50M gas limit for eth_call.

13. Error Messages Could Be More Informative ℹ️

File: execution/evm/engine_geth.go:484-486

if parent == nil {
    return nil, fmt.Errorf("parent block not found: %s", ps.parentHash.Hex())
}

Good! Include context like block height for debugging:

return nil, fmt.Errorf("parent block not found: hash=%s height=%d", ps.parentHash.Hex(), ps.parentHash)

Documentation

14. Missing Package-Level Documentation ℹ️

The new files lack package-level documentation explaining:

  • Purpose of in-process Geth vs RPC client
  • When to use which approach
  • Performance implications
  • Security considerations
  • Known limitations (datastore performance)

Recommendation: Add comprehensive package docs:

// Package evm provides Ethereum Virtual Machine execution for the ev-node rollup.
//
// It supports two execution backends:
// 1. RPC Client: Connects to external Geth/Reth via Engine API
// 2. In-Process Geth: Embeds Geth directly (this file)
//
// In-process Geth simplifies deployment but has performance tradeoffs...

Summary

This PR implements a complex feature with generally clean code structure and good test coverage for happy paths. However, there are critical issues that must be addressed:

Blocking Issues:

  1. ❌ Datastore wrapper O(N) performance (DeleteRange, Iterator) - will cause severe production issues
  2. ❌ Contract address calculation is incorrect - breaks compatibility
  3. ❌ Missing flag error handling - causes confusing failures

Important Issues:
4. ⚠️ Transaction pool spam protection (price limit = 0)
5. ⚠️ CORS security (wildcard origin)
6. ⚠️ Receipt logs have wrong parameters

For Production Readiness:

  • Add comprehensive integration tests with real contracts
  • Load testing with high transaction volume
  • Security audit of RPC endpoints
  • Documentation of limitations and operational considerations
  • Consider datastore abstraction redesign

Given this is marked as "Friday fun project, not meant to be merged", the issues are expected. If you do plan to eventually merge this, I'd be happy to help with fixes or provide more detailed implementation guidance.

Note: Since this PR is experimental (90% AI, not for merging), these findings serve as a roadmap for what would need to be addressed before production use.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @julienrbrt, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the EVM execution capabilities by introducing an in-process Geth client. This architectural shift allows the application to perform Ethereum Virtual Machine operations directly, bypassing the need for an external execution client via RPC. The change aims to streamline development, simplify deployment, and potentially improve performance by reducing inter-process communication overhead. It includes the necessary data storage adaptations and new configuration options to enable this embedded functionality.

Highlights

  • In-Process Geth Integration: Introduced an in-process Geth execution client, allowing direct EVM execution within the application without requiring an external RPC connection to a Geth or Reth node.
  • Flexible Execution Client Creation: Modified the execution client creation logic to conditionally instantiate either the existing RPC-based client or the new in-process Geth client based on command-line flags.
  • Datastore Abstraction for Geth: Implemented a wrapper to adapt the go-datastore.Batching interface to go-ethereum's ethdb.KeyValueStore, enabling the in-process Geth to utilize the application's existing data storage.
  • Logger Standardization: Refactored the EngineClient constructor to directly accept a zerolog.Logger, removing the need for a separate SetLogger method and ensuring consistent logging practices.
  • New Configuration Flags: Added new command-line flags (--evm.geth and --evm.geth.genesis-path) to control the activation and configuration of the in-process Geth client.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@julienrbrt julienrbrt changed the title geth. feat(evm): support in-process evm via geth Jan 16, 2026
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant new feature: the ability to run an in-process Geth instance as the execution client. This is a great addition that will simplify setup, testing, and deployment for users by removing the need for an external execution client process. The implementation is extensive, including a new ethdb wrapper for go-datastore and the core logic for the in-process Geth backend. My review focuses on some critical areas for improvement, particularly around error handling in flag parsing and potential performance bottlenecks in the new database wrapper. I've also included a comment on the transaction pool configuration for consideration.

Comment on lines 213 to 218
useGeth, _ := cmd.Flags().GetBool(evm.FlagEVMInProcessGeth)
if useGeth {
genesisPath, _ := cmd.Flags().GetString(evm.FlagEVMGenesisPath)
if len(genesisPath) == 0 {
return nil, fmt.Errorf("genesis path must be provided when using in-process Geth")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The errors returned by cmd.Flags().GetBool() and cmd.Flags().GetString() are being ignored. This can lead to silent failures and unexpected behavior if the flags are not parsed correctly (e.g., due to a typo or incorrect type). The program might proceed with default values (e.g., an empty genesisPath), causing issues later on. It's crucial to handle these errors to make the command-line interface more robust.

Suggested change
useGeth, _ := cmd.Flags().GetBool(evm.FlagEVMInProcessGeth)
if useGeth {
genesisPath, _ := cmd.Flags().GetString(evm.FlagEVMGenesisPath)
if len(genesisPath) == 0 {
return nil, fmt.Errorf("genesis path must be provided when using in-process Geth")
}
useGeth, err := cmd.Flags().GetBool(evm.FlagEVMInProcessGeth)
if err != nil {
return nil, fmt.Errorf("failed to get '%s' flag: %w", evm.FlagEVMInProcessGeth, err)
}
if useGeth {
genesisPath, err := cmd.Flags().GetString(evm.FlagEVMGenesisPath)
if err != nil {
return nil, fmt.Errorf("failed to get '%s' flag: %w", evm.FlagEVMGenesisPath, err)
}
if len(genesisPath) == 0 {
return nil, fmt.Errorf("genesis path must be provided when using in-process Geth")
}

Comment on lines +48 to +70
// DeleteRange implements ethdb.KeyValueStore.
func (w *wrapper) DeleteRange(start []byte, end []byte) error {
// Query all keys and delete those in range
q := query.Query{KeysOnly: true}
results, err := w.ds.Query(context.Background(), q)
if err != nil {
return err
}
defer results.Close()

for result := range results.Next() {
if result.Error != nil {
return result.Error
}
keyBytes := datastoreKeyToBytes(result.Entry.Key)
if bytes.Compare(keyBytes, start) >= 0 && bytes.Compare(keyBytes, end) < 0 {
if err := w.ds.Delete(context.Background(), datastore.NewKey(result.Entry.Key)); err != nil {
return err
}
}
}
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The DeleteRange implementation queries all keys from the datastore and then iterates through them to find keys within the specified range. This approach has a time complexity of O(N), where N is the total number of keys in the datastore. This can be very inefficient and cause significant performance degradation, especially with a large database, as Geth may use range deletions frequently. A more efficient implementation would leverage the underlying database's ability to iterate over a specific range, which is typically O(M) where M is the number of keys in the range. While the go-datastore interface is limited, this implementation could become a serious performance bottleneck.

Comment on lines +323 to +326
// Check if key is >= start (if start is set)
if len(it.start) > 0 && bytes.Compare(keyBytes, it.start) < 0 {
continue
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The iterator's Next method filters keys by comparing them to the start key after retrieving them. This means it may iterate over many keys before the start key, which is inefficient. A more performant iterator would seek to the start key directly. This implementation could lead to performance issues when iterating over large datasets with a start key set far from the beginning. Given that database iterators are a core part of Geth's performance, this could be a significant bottleneck.

txPoolConfig.NoLocals = true

legacyPool := legacypool.New(txPoolConfig, blockchain)
txPool, err := txpool.New(0, blockchain, []txpool.SubPool{legacyPool})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The transaction pool is initialized with a priceLimit of 0. This disables the minimum gas price enforcement in the transaction pool, which could potentially allow for spamming the mempool with zero-price transactions. While this might be intentional for a rollup environment where gas pricing is handled differently, it's important to ensure this is the desired behavior. Consider making this configurable if there are scenarios where a price limit is needed.

@codecov
Copy link

codecov bot commented Jan 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.98%. Comparing base (73297c1) to head (8df657c).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2986   +/-   ##
=======================================
  Coverage   58.98%   58.98%           
=======================================
  Files         103      103           
  Lines        9902     9902           
=======================================
  Hits         5841     5841           
  Misses       3434     3434           
  Partials      627      627           
Flag Coverage Δ
combined 58.98% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants