
Conversation

@julienrbrt (Member)
ref: #2906


github-actions bot commented Jan 5, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build: ✅ passed | Format: ⏩ skipped | Lint: ✅ passed | Breaking: ✅ passed | Updated (UTC): Jan 9, 2026, 3:37 PM


claude bot commented Jan 5, 2026

Claude finished @julienrbrt's task in 4m 34s.


Comprehensive Code Review

I've completed a thorough review of the async epoch fetching implementation. Here's my analysis across all key areas:

Review Summary

This PR introduces a well-designed async prefetching mechanism for DA blocks that significantly improves performance by caching upcoming epoch data. The implementation is solid overall with good concurrency patterns and proper resource cleanup. However, there are several areas that need attention.


1. Code Quality & Architecture

✅ Strengths

  • Clean separation of concerns: AsyncBlockRetriever handles prefetching, ForcedInclusionRetriever orchestrates DA fetching
  • Good use of interfaces and dependency injection
  • Proper context handling throughout
  • Idiomatic Go code style

⚠️ Issues Found

CRITICAL: Fragile Key Parsing (async_block_retriever.go:318)

var height uint64
_, err := fmt.Sscanf(key.String(), "/block/%d", &height)

This parsing is fragile and error-prone. The recommended approach is to use key.Name() with strconv.ParseUint:

heightStr := key.Name()
height, err := strconv.ParseUint(heightStr, 10, 64)

This is more robust and follows datastore key conventions.
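The pattern can be exercised without the go-datastore dependency: a stand-in that mirrors what key.Name() returns (the last path segment) followed by strconv.ParseUint. Here parseHeightKey is a hypothetical helper for illustration, not code from the PR:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseHeightKey mirrors the suggested pattern: take the last path
// segment of a datastore key (what key.Name() returns) and parse it
// as an unsigned decimal height.
func parseHeightKey(key string) (uint64, error) {
	segments := strings.Split(strings.Trim(key, "/"), "/")
	name := segments[len(segments)-1] // equivalent of key.Name()
	return strconv.ParseUint(name, 10, 64)
}

func main() {
	h, err := parseHeightKey("/block/12345")
	fmt.Println(h, err)
}
```

Unlike Sscanf, this keeps working if the key namespace ever changes, since only the final segment is inspected.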

Missing Documentation (async_block_retriever.go:22-28)
The AsyncBlockRetriever interface lacks godoc comments. All exported interfaces should be documented:

// AsyncBlockRetriever provides background prefetching of DA blocks with caching.
// It maintains an in-memory cache and automatically prefetches blocks within a
// configurable window ahead of the current DA height.
type AsyncBlockRetriever interface { ... }

Package Organization (forced_inclusion_retriever.go:36-44)
The NewForcedInclusionRetriever function creates and starts the async fetcher internally. Consider whether this is the right level of control - callers have no visibility into or control over the async fetcher's lifecycle independently.


2. Error Handling & Resilience

✅ Strengths

  • Proper error wrapping with context
  • Graceful handling of StatusHeightFromFuture for retry
  • Good use of errors.Is() for error type checking

⚠️ Issues Found

Silent Error in Cleanup (async_block_retriever.go:305-306)

if err != nil {
    f.logger.Debug().Err(err).Msg("failed to query cache for cleanup")
    return
}

This returns early on any query error, so a persistent query failure silently blocks all cleanup at Debug verbosity. Cleanup genuinely cannot proceed without the query results, but the failure should at least be visible at Warn level:

if err != nil {
    f.logger.Warn().Err(err).Msg("failed to query cache for cleanup - skipping this cycle")
    return
}

Context Cancellation in Prefetch (async_block_retriever.go:223)
The fetchAndCacheBlock method uses f.ctx which may be cancelled during shutdown. While this is intentional, consider whether partial writes to cache during shutdown could cause issues.

Error Accumulation in RetrieveForcedIncludedTxs (forced_inclusion_retriever.go:144-170)
Errors are accumulated with errors.Join() but the function continues processing, which could lead to partial data being returned. The current behavior of returning an empty event on any error is correct, but the logic could be clearer.


3. Security

✅ Strengths

  • No credential handling
  • Input validation for DA heights
  • Bounds checking on array access

⚠️ Issues Found

Potential Memory Exhaustion (async_block_retriever.go:243-248)

for _, blob := range result.Data {
    if len(blob) > 0 {
        block.Blobs = append(block.Blobs, blob)
    }
}

There's no validation on the total size or count of blobs returned from DA. A malicious or buggy DA layer could return massive blobs leading to memory exhaustion. Consider adding:

  • Maximum blob size validation
  • Maximum number of blobs per height validation
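A sketch of such validation, assuming illustrative limits and a hypothetical validateBlobs helper (neither the limits nor the function exist in the PR):

```go
package main

import "fmt"

const (
	maxBlobSize       = 2 * 1024 * 1024 // illustrative 2 MiB per-blob cap
	maxBlobsPerHeight = 1024            // illustrative per-height count cap
)

// validateBlobs rejects oversized responses before anything is
// appended to the in-memory cache, bounding worst-case memory use.
func validateBlobs(blobs [][]byte) error {
	if len(blobs) > maxBlobsPerHeight {
		return fmt.Errorf("too many blobs: %d > %d", len(blobs), maxBlobsPerHeight)
	}
	for i, b := range blobs {
		if len(b) > maxBlobSize {
			return fmt.Errorf("blob %d too large: %d bytes", i, len(b))
		}
	}
	return nil
}

func main() {
	err := validateBlobs([][]byte{make([]byte, 10)})
	fmt.Println(err == nil)
}
```

Running the check before the append loop means a bad DA response is rejected wholesale rather than partially cached.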

DoS Risk in Cache (async_block_retriever.go:277)
The cache uses an in-memory datastore with no size limits beyond prefetch window cleanup. If cleanup fails or DA heights contain very large blobs, memory could grow unbounded. Consider:

  • Setting a maximum cache size
  • Implementing LRU eviction
  • Adding memory pressure monitoring
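To sketch the size-cap idea: since the prefetcher only reads forward, the lowest cached height is the least useful entry, so a toy bounded cache can evict it when over capacity. boundedCache is hypothetical; the PR uses an uncapped MapDatastore:

```go
package main

import "fmt"

// boundedCache is a toy size-capped cache keyed by DA height.
// When over capacity it evicts the lowest height first, since a
// forward-reading prefetcher will not ask for it again.
type boundedCache struct {
	maxEntries int
	entries    map[uint64][]byte
}

func newBoundedCache(max int) *boundedCache {
	return &boundedCache{maxEntries: max, entries: make(map[uint64][]byte)}
}

func (c *boundedCache) Put(height uint64, data []byte) {
	c.entries[height] = data
	for len(c.entries) > c.maxEntries {
		lowest, first := uint64(0), true
		for h := range c.entries {
			if first || h < lowest {
				lowest, first = h, false
			}
		}
		delete(c.entries, lowest)
	}
}

func (c *boundedCache) Get(height uint64) ([]byte, bool) {
	d, ok := c.entries[height]
	return d, ok
}

func main() {
	c := newBoundedCache(2)
	c.Put(100, []byte("a"))
	c.Put(101, []byte("b"))
	c.Put(102, []byte("c")) // evicts height 100
	_, ok := c.Get(100)
	fmt.Println(ok)
}
```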

4. Performance & Resource Efficiency

✅ Strengths

  • Excellent use of background prefetching to reduce latency
  • Proper use of atomic operations for currentDAHeight
  • Good cache hit optimization in RetrieveForcedIncludedTxs

⚠️ Issues Found

Inefficient Cleanup Query (async_block_retriever.go:302-331)
The cleanup iterates all cache keys on every prefetch cycle. With a large prefetch window, this could become expensive. Consider:

  • Only running cleanup periodically (e.g., every 10 prefetch cycles)
  • Tracking oldest key to avoid full scans
  • Using a TTL-based cache instead

Redundant Cache Checks (async_block_retriever.go:202-207)

_, err := f.cache.Get(f.ctx, key)
if err == nil {
    continue
}

This performs a full cache Get just to check existence. Consider tracking prefetched heights in a map or bloom filter for O(1) existence checks.
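A minimal sketch of the map-based tracker idea (prefetchTracker is a hypothetical type, not part of the PR):

```go
package main

import (
	"fmt"
	"sync"
)

// prefetchTracker records which heights have been cached, giving an
// O(1), allocation-free existence check instead of a full cache Get
// that deserializes the stored value.
type prefetchTracker struct {
	mu      sync.RWMutex
	heights map[uint64]struct{}
}

func newPrefetchTracker() *prefetchTracker {
	return &prefetchTracker{heights: make(map[uint64]struct{})}
}

func (t *prefetchTracker) MarkFetched(h uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.heights[h] = struct{}{}
}

func (t *prefetchTracker) Has(h uint64) bool {
	t.mu.RLock()
	defer t.mu.RUnlock()
	_, ok := t.heights[h]
	return ok
}

func main() {
	t := newPrefetchTracker()
	t.MarkFetched(42)
	fmt.Println(t.Has(42), t.Has(43))
}
```

The tracker must be updated in the same places the cache is written and cleaned, otherwise it drifts from the real cache contents.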

Serialization Overhead (async_block_retriever.go:262-274)
Every cache write serializes to protobuf. For hot paths, consider whether the serialization overhead is justified, or if keeping raw BlockData in memory would be more efficient.

Blocking Prefetch Loop (async_block_retriever.go:198-211)
Prefetch happens sequentially for all blocks in the window. Consider parallelizing with a worker pool:

var wg sync.WaitGroup
for i := uint64(0); i < f.prefetchWindow; i++ {
    wg.Add(1)
    go func(h uint64) {
        defer wg.Done()
        // Check cache and fetch
    }(currentHeight + i)
}
wg.Wait()

5. Testing & Reliability

✅ Strengths

  • Good test coverage for core functionality
  • Tests use mocks appropriately
  • Tests verify both success and error paths

⚠️ Issues Found

Flaky Test: Time-Based Polling (async_block_retriever_test.go:85-98)

for i := 0; i < 40; i++ {
    block, err = fetcher.GetCachedBlock(ctx, 100)
    require.NoError(t, err)
    if block != nil {
        break
    }
    time.Sleep(50 * time.Millisecond)
}

This test could be flaky on slow CI systems. Consider using a channel or synchronization primitive to signal when prefetch completes rather than polling.

Missing Test: Restart Scenario (Addressed in review comments)
As noted by tac0turtle, there should be a test for restart scenarios where the cache is lost. While the parent component handles this, unit tests should verify the component behaves correctly when starting with an empty cache.

Missing Test: Concurrent Access
No tests verify concurrent access to the cache or concurrent calls to UpdateCurrentHeight. Add race detection tests:

func TestAsyncBlockRetriever_ConcurrentUpdates(t *testing.T) {
    // Test concurrent UpdateCurrentHeight and GetCachedBlock calls
}

Missing Test: Cache Cleanup
No tests verify that cleanupOldBlocks actually removes old entries from cache. Add a test that:

  1. Populates cache with old heights
  2. Updates current height forward
  3. Verifies old heights are removed
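The outline above, sketched against a toy map-backed cache (cleanupOld and its retention rule are illustrative, not the PR's implementation):

```go
package main

import "fmt"

// cleanupOld deletes cached heights older than current minus window,
// mirroring the three-step test outline: populate, advance, verify.
func cleanupOld(cache map[uint64][]byte, current, window uint64) {
	for h := range cache {
		if h+window < current {
			delete(cache, h)
		}
	}
}

func main() {
	cache := map[uint64][]byte{90: {1}, 95: {1}, 100: {1}}
	cleanupOld(cache, 100, 5) // heights below 95 should be evicted
	_, has90 := cache[90]
	_, has95 := cache[95]
	fmt.Println(has90, has95)
}
```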

6. Concurrency Patterns & Safety

✅ Strengths

  • Proper use of atomic.Uint64 for currentDAHeight
  • Good goroutine lifecycle management with context and WaitGroup
  • Thread-safe cache with dsync.MutexWrap

⚠️ Issues Found

Compare-and-Swap Loop Could Spin (async_block_retriever.go:114-125)

for {
    current := f.currentDAHeight.Load()
    if height <= current {
        return
    }
    if f.currentDAHeight.CompareAndSwap(current, height) {
        return
    }
}

While unlikely, this infinite loop could spin if CompareAndSwap fails repeatedly. Consider adding a maximum retry count or using a simple mutex here since contention is expected to be low.

Race Condition in Cleanup (async_block_retriever.go:293-331)
The cleanup function queries the cache and then deletes entries. Between query and delete, the cache could be modified by prefetch operations. This is benign (cleanup just skips the entry), but the race should be documented or the logic should use transactions if the datastore supports them.


7. Documentation & Developer Experience

⚠️ Issues Found

Missing Package Documentation (async_block_retriever.go:1)
The package lacks a package-level comment explaining its purpose and architecture.

Unclear Prefetch Window Semantics (async_block_retriever.go:72-74)
The default prefetch window of 10 blocks is not explained. Add documentation:

// prefetchWindow determines how many DA blocks ahead to prefetch.
// A larger window reduces cache misses but increases memory usage.
// For epoch-based systems, this should typically be >= epoch size.
if prefetchWindow == 0 {
    prefetchWindow = 10 // Default: prefetch next 10 blocks
}

Missing Metric/Observability
No metrics are exported for:

  • Cache hit/miss rates
  • Prefetch success/failure rates
  • Cache size
  • Cleanup operations

Consider adding prometheus metrics or at least debug logs with counts.
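In lieu of full Prometheus integration, a dependency-free sketch of hit/miss counters (cacheMetrics and its method names are hypothetical, not from the PR):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cacheMetrics is a stand-in for the suggested Prometheus counters,
// safe to bump from concurrent prefetch and read paths.
type cacheMetrics struct {
	hits, misses atomic.Uint64
}

func (m *cacheMetrics) RecordHit()  { m.hits.Add(1) }
func (m *cacheMetrics) RecordMiss() { m.misses.Add(1) }

// HitRate returns hits / (hits + misses), or 0 with no samples.
func (m *cacheMetrics) HitRate() float64 {
	h, mi := m.hits.Load(), m.misses.Load()
	if h+mi == 0 {
		return 0
	}
	return float64(h) / float64(h+mi)
}

func main() {
	var m cacheMetrics
	m.RecordHit()
	m.RecordHit()
	m.RecordMiss()
	fmt.Printf("%.2f\n", m.HitRate())
}
```

Even periodic debug logs of these counters would make cache effectiveness observable in production.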


8. Integration Issues

Lifecycle Management (sequencer.go:114, based/sequencer.go:82)
Sequencers create ForcedInclusionRetriever which internally starts an async fetcher, but there's no explicit cleanup in sequencer lifecycle. While the retriever has a Stop() method, it's unclear if/when sequencers call it. Verify sequencers properly stop the retriever on shutdown.

Missing Call to Stop (single/sequencer.go)
The single sequencer creates a new fiRetriever at line 114 and potentially recreates it at line 187, but I don't see where fiRetriever.Stop() is called during sequencer shutdown. This could leak goroutines. Add:

func (c *Sequencer) Stop() {
    if c.fiRetriever != nil {
        c.fiRetriever.Stop()
    }
}

Summary of Required Changes

High Priority

  1. Fix fragile key parsing with key.Name() and strconv.ParseUint - block/internal/da/async_block_retriever.go:318
  2. Add lifecycle management: ensure sequencers call fiRetriever.Stop() on shutdown - pkg/sequencers/single/sequencer.go
  3. Add blob size validation to prevent memory exhaustion - block/internal/da/async_block_retriever.go:243-248
  4. Add test for concurrent access and race conditions - block/internal/da/async_block_retriever_test.go

Medium Priority

  1. Optimize cleanup to run periodically instead of every prefetch - block/internal/da/async_block_retriever.go:293-331
  2. Add cache size limits and memory pressure handling - block/internal/da/async_block_retriever.go:83
  3. Add comprehensive godoc for exported interfaces and functions
  4. Consider parallelizing prefetch operations for better performance

Low Priority

  1. Add observability metrics for cache performance
  2. Improve test stability by removing time-based polling
  3. Add test coverage for cache cleanup functionality

Positive Notes

  • The async prefetching architecture is well-designed and should significantly reduce DA fetch latency
  • Concurrency patterns are generally sound with proper atomic operations
  • Error handling is comprehensive with good retry logic
  • Integration with existing sequencer code is clean
  • Test coverage demonstrates good understanding of edge cases

The implementation is production-ready with the high-priority fixes applied. The medium and low priority items are optimizations that would improve robustness and observability.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @julienrbrt, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements an asynchronous epoch fetching mechanism to optimize the retrieval of forced inclusion transactions from the Data Availability (DA) layer. By introducing an AsyncEpochFetcher that prefetches and caches upcoming epoch data in the background, the system can significantly reduce the latency associated with fetching this data, particularly at epoch transitions. This enhancement improves the overall performance and responsiveness of sequencers and the block syncing process by providing immediate access to frequently needed DA epoch information.

Highlights

  • Introduced AsyncEpochFetcher: A new component for background prefetching and caching of Data Availability (DA) epoch data, designed to reduce latency at epoch boundaries.
  • Integrated with ForcedInclusionRetriever: The existing ForcedInclusionRetriever now utilizes the AsyncEpochFetcher to first check for cached epoch data before performing a synchronous fetch.
  • Deployment across Sequencers and Syncer: The AsyncEpochFetcher is instantiated and started in the createSequencer functions for EVM, gRPC, and TestApp, as well as within the Syncer component, ensuring widespread performance benefits.
  • Serialization and Deserialization: The AsyncEpochFetcher includes custom serialization and deserialization logic for ForcedInclusionEvent to efficiently store and retrieve cached epoch data.
  • Comprehensive Testing: New unit tests have been added for the AsyncEpochFetcher to ensure its correct functionality, including caching, background operations, and graceful shutdown.


@gemini-code-assist bot left a comment:

Code Review

This pull request introduces an AsyncEpochFetcher component designed to prefetch Data Availability (DA) epoch data in the background, thereby improving performance for sequencers and the block syncer. The new fetcher includes an in-memory cache for storing prefetched epochs, manages background fetching, tracks the current DA height, and cleans up old cached data.

The ForcedInclusionRetriever has been updated to leverage this AsyncEpochFetcher, prioritizing cached data before making direct DA client calls. The AsyncEpochFetcher is integrated into the evm, grpc, and testapp applications, the block syncer, and the single sequencer, with corresponding unit tests added and existing tests updated.

Review comments indicate that the mu mutex in AsyncEpochFetcher is redundant because dsync.MutexWrap already provides thread-safety, leading to double-locking, and suggest replacing the fragile fmt.Sscanf key parsing with key.Name() and strconv.ParseUint for robustness.


// In-memory cache for prefetched epoch data
cache ds.Batching
mu sync.RWMutex

medium

The cache field is already wrapped with dsync.MutexWrap, which makes it thread-safe. The additional mu field is redundant and leads to double-locking. Please remove this mutex and all its usages (f.mu.Lock/Unlock/RLock/RUnlock) throughout the file.

Comment on lines 374 to 378
var epochEnd uint64
_, err := fmt.Sscanf(key.String(), "/epoch/%d", &epochEnd)
if err != nil {
    continue
}

medium

Using fmt.Sscanf to parse the key string is fragile. A more robust and idiomatic way to handle datastore keys is to use key.Name() to get the last component of the key and then parse it using strconv.ParseUint. You will need to import the strconv package.

Suggested change:

-var epochEnd uint64
-_, err := fmt.Sscanf(key.String(), "/epoch/%d", &epochEnd)
-if err != nil {
-    continue
-}
+epochEndStr := key.Name()
+epochEnd, err := strconv.ParseUint(epochEndStr, 10, 64)
+if err != nil {
+    continue
+}

codecov bot commented Jan 5, 2026

Codecov Report

❌ Patch coverage is 61.58730% with 121 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.76%. Comparing base (453a8a4) to head (a7fdca7).
⚠️ Report is 2 commits behind head on main.

Files with missing lines | Patch % | Lines
...quencers/common/forced_inclusion_retriever_mock.go 0.00% 62 Missing ⚠️
block/internal/da/async_block_retriever.go 74.54% 31 Missing and 11 partials ⚠️
block/internal/da/forced_inclusion_retriever.go 84.61% 8 Missing and 4 partials ⚠️
pkg/sequencers/single/sequencer.go 57.14% 3 Missing ⚠️
block/internal/syncing/syncer.go 0.00% 1 Missing ⚠️
block/public.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2952      +/-   ##
==========================================
- Coverage   58.74%   57.76%   -0.99%     
==========================================
  Files          93       97       +4     
  Lines        8863     9306     +443     
==========================================
+ Hits         5207     5376     +169     
- Misses       3067     3326     +259     
- Partials      589      604      +15     
Flag: combined | Coverage Δ: 57.76% <61.58%> (-0.99%) ⬇️


@julienrbrt julienrbrt marked this pull request as ready for review January 8, 2026 19:25
@julienrbrt julienrbrt requested review from alpe and tac0turtle January 8, 2026 23:39
logger: logger.With().Str("component", "async_block_retriever").Logger(),
namespace: namespace,
daStartHeight: daStartHeight,
cache: dsync.MutexWrap(ds.NewMapDatastore()),
A contributor commented:
are restarts a concern here or is the height reset so we query it again?

@julienrbrt (Member, Author) replied on Jan 9, 2026:

we would lose the cache yeah, but the cache is only 2 epochs, so it will be rebuilt up fast.
Additionally, if there is no cache, we hit the da directly

@tac0turtle (Contributor) left a comment:

utACK, left two comments overall nice job!!

@julienrbrt julienrbrt added this pull request to the merge queue Jan 9, 2026
@julienrbrt julienrbrt removed this pull request from the merge queue due to a manual request Jan 9, 2026
@julienrbrt julienrbrt enabled auto-merge January 9, 2026 15:38
@julienrbrt julienrbrt added this pull request to the merge queue Jan 9, 2026
Merged via the queue into main with commit aaae087 Jan 9, 2026
34 of 35 checks passed
@julienrbrt julienrbrt deleted the julien/aync-epochs branch January 9, 2026 16:02