persist blocks and FullCommitQCs in data layer via WAL #3126

Open: wen-coding wants to merge 4 commits into main from wen/persist_data

wen-coding commented Mar 27, 2026

Summary

  • Add GlobalBlockPersister and GlobalCommitQCPersister backed by indexedWAL, so the data layer can recover both blocks and QCs after a crash
  • Group both into a DataWAL struct with NewDataWAL, Close, and TruncateBefore helpers; integrate into data.State
  • NewState loads blocks and QCs from WAL on startup and reconciles cursors so nextBlock is correct immediately on recovery
  • Add ReadAt to indexedWAL for random-access reads needed by variable-size QC truncation

Test plan

  • globalblocks_test.go: persist & reload, truncate & reload, truncate all, no-op, duplicate ignored, gap error, continue after reload
  • globalcommitqcs_test.go: persist & reload, truncate & reload, truncate all, no-op, duplicate ignored, gap error, mid-range truncation, continue after reload
  • state_test.go: TestStateRecoveryFromWAL — pushes QCs via PushQC, closes WALs, reopens from same dir, verifies NextBlock() and all blocks/QCs recovered
  • All existing tests pass (data, avail, consensus, p2p/giga)
  • golangci-lint and gofmt clean

Add GlobalBlockPersister (using indexedWAL) to persist data.inner.blocks
across restarts. Each WAL entry embeds its GlobalBlockNumber since Block
doesn't carry it.
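
One way to picture an entry that embeds its block number is a fixed-width prefix in front of the serialized block. The sketch below is only an illustration under that assumption: `encodeEntry` and `decodeEntry` are hypothetical helpers, the 8-byte big-endian prefix is an invented layout, and the actual persister may serialize entries differently (for example via protobuf).

```go
package main

import (
	"encoding/binary"
	"errors"
)

// GlobalBlockNumber mirrors the name used in the PR; the underlying type is assumed.
type GlobalBlockNumber uint64

// encodeEntry prefixes already-serialized block bytes with the block number,
// so the number can be recovered from the WAL entry alone.
func encodeEntry(n GlobalBlockNumber, block []byte) []byte {
	buf := make([]byte, 8+len(block))
	binary.BigEndian.PutUint64(buf[:8], uint64(n))
	copy(buf[8:], block)
	return buf
}

// decodeEntry splits a WAL entry back into its block number and block bytes.
func decodeEntry(entry []byte) (GlobalBlockNumber, []byte, error) {
	if len(entry) < 8 {
		return 0, nil, errors.New("wal entry too short for block number prefix")
	}
	return GlobalBlockNumber(binary.BigEndian.Uint64(entry[:8])), entry[8:], nil
}
```

On reload, decoding every entry this way would yield (number, block) pairs without needing any side index.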

data.NewState now accepts a GlobalBlockWAL interface and pulls preloaded
blocks from it via LoadedBlocks(). Fix a bug in PushQC where
updateNextBlock was skipped when blocks were already preloaded but the
QC arrived without new blocks.

Includes integration test for the reload path and unit tests for the
persister.
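
The PushQC fix described above can be illustrated with a toy model. Everything here is an assumption made for the sketch: the `state` struct, its fields, and the rule that `nextBlock` advances over blocks that are both present and covered by a QC are simplified stand-ins, not the real `data.State`.

```go
package main

// state is a toy stand-in for the inner fields of data.State (names assumed).
type state struct {
	blocks    map[uint64]bool // block numbers currently held
	qcEnd     uint64          // exclusive upper bound covered by accepted QCs
	nextBlock uint64          // first block not yet both present and QC-covered
}

// updateNextBlock advances nextBlock over blocks that are present and covered.
func (s *state) updateNextBlock() {
	for s.nextBlock < s.qcEnd && s.blocks[s.nextBlock] {
		s.nextBlock++
	}
}

// pushQC records a QC covering blocks up to (but excluding) end,
// plus any blocks that arrived together with it.
func (s *state) pushQC(end uint64, newBlocks []uint64) {
	if end > s.qcEnd {
		s.qcEnd = end
	}
	for _, n := range newBlocks {
		s.blocks[n] = true
	}
	// Always re-evaluate the cursor, even when newBlocks is empty:
	// blocks preloaded from the WAL may already satisfy the new QC.
	s.updateNextBlock()
}
```

In this model, a state preloaded with blocks 0 through 2 that then receives a QC covering them with no new blocks still advances `nextBlock` to 3, which is the case the bug missed.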

Made-with: Cursor
Same pattern as the commitqcs bug: in no-op mode, truncateBefore
returned early without advancing s.next. Also improve the no-op test
to exercise truncate-then-persist.
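
A minimal sketch of the truncate-then-persist case, assuming a persister whose cursor must advance on truncation in both modes. The `persister` type, the nil-map convention for no-op mode, and the duplicate/gap rules are modelled on the test plan above, not copied from the PR.

```go
package main

import "fmt"

// persister is a toy model; wal == nil stands for no-op mode (no state directory).
type persister struct {
	wal  map[uint64][]byte
	next uint64 // next block number persist expects
}

// truncateBefore drops entries below n and moves the cursor forward if needed.
func (p *persister) truncateBefore(n uint64) {
	if p.wal != nil {
		for k := range p.wal {
			if k < n {
				delete(p.wal, k)
			}
		}
	}
	// Advance the cursor in both modes. Returning early in no-op mode
	// would leave next stale and make the following persist report a gap.
	if n > p.next {
		p.next = n
	}
}

// persist accepts exactly the next block, ignores duplicates, and rejects gaps.
func (p *persister) persist(n uint64, data []byte) error {
	if n < p.next {
		return nil // duplicate, ignored
	}
	if n > p.next {
		return fmt.Errorf("gap: expected block %d, got %d", p.next, n)
	}
	if p.wal != nil {
		p.wal[n] = data
	}
	p.next = n + 1
	return nil
}
```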

Made-with: Cursor
Add GlobalCommitQCPersister alongside GlobalBlockPersister so the data
layer can recover both blocks and QCs after a crash. Without this, a
restart could leave blocks without their verifying QCs (or vice-versa)
because avail may prune QCs on a different schedule.

Key changes:
- Add ReadAt to indexedWAL for random-access reads
- Add GlobalCommitQCPersister (one WAL entry per FullCommitQC)
- Introduce DataWAL struct grouping both persisters with
  TruncateBefore and Close helpers
- NewState loads QCs from WAL and calls updateNextBlock during
  construction so nextBlock is correct immediately on recovery
- NewDataWAL constructor; no-op persisters when stateDir is None
- Replace TestNewStateReloadsPreloadedBlocks with end-to-end
  TestStateRecoveryFromWAL that pushes through PushQC and restarts
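
To show why random-access reads help when each QC covers a variable number of blocks, here is a hedged in-memory sketch. `indexedLog`, `qc`, and the rule that a QC straddling the truncation point is kept are assumptions for illustration; the real `indexedWAL.ReadAt` signature and truncation policy may differ.

```go
package main

import "errors"

// qc is a stand-in for FullCommitQC: it covers blocks [first, next).
type qc struct{ first, next uint64 }

// indexedLog is a toy in-memory stand-in for indexedWAL.
type indexedLog struct {
	firstIdx uint64 // index of entries[0]
	entries  []qc
}

// ReadAt returns the entry stored at idx without scanning from the start.
func (l *indexedLog) ReadAt(idx uint64) (qc, error) {
	if idx < l.firstIdx || idx >= l.firstIdx+uint64(len(l.entries)) {
		return qc{}, errors.New("index out of range")
	}
	return l.entries[idx-l.firstIdx], nil
}

// truncateBefore drops every QC that ends at or before block n.
// A QC whose range straddles n is kept, since it still covers live blocks.
func (l *indexedLog) truncateBefore(n uint64) error {
	end := l.firstIdx + uint64(len(l.entries))
	idx := l.firstIdx
	for ; idx < end; idx++ {
		e, err := l.ReadAt(idx)
		if err != nil {
			return err
		}
		if e.next > n {
			break
		}
	}
	l.entries = l.entries[idx-l.firstIdx:]
	l.firstIdx = idx
	return nil
}
```

Because entry indices do not map one-to-one onto block numbers, the truncation point has to be found by inspecting entries, which is what the read-by-index call enables in this sketch.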

Made-with: Cursor

github-actions bot commented Mar 27, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Mar 27, 2026, 1:28 PM |


codecov bot commented Mar 27, 2026

Codecov Report

❌ Patch coverage is 67.28625% with 88 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.65%. Comparing base (1141939) to head (08a3eb1).
⚠️ Report is 11 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...nternal/autobahn/consensus/persist/globalblocks.go | 69.30% | 19 Missing and 12 partials ⚠️ |
| sei-tendermint/internal/autobahn/data/state.go | 58.82% | 20 Missing and 8 partials ⚠️ |
| ...rnal/autobahn/consensus/persist/globalcommitqcs.go | 72.63% | 16 Missing and 10 partials ⚠️ |
| ...dermint/internal/autobahn/consensus/persist/wal.go | 40.00% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3126      +/-   ##
==========================================
+ Coverage   58.62%   58.65%   +0.03%     
==========================================
  Files        2099     2101       +2     
  Lines      173751   174085     +334     
==========================================
+ Hits       101867   102117     +250     
- Misses      62816    62870      +54     
- Partials     9068     9098      +30     

| Flag | Coverage Δ |
| --- | --- |
| sei-chain-pr | 72.32% <67.28%> (?) |
| sei-db | 70.41% <ø> (ø) |

| Files with missing lines | Coverage Δ |
| --- | --- |
| ...dermint/internal/autobahn/consensus/persist/wal.go | 62.12% <40.00%> (-1.82%) ⬇️ |
| ...rnal/autobahn/consensus/persist/globalcommitqcs.go | 72.63% <72.63%> (ø) |
| sei-tendermint/internal/autobahn/data/state.go | 62.09% <58.82%> (-3.21%) ⬇️ |
| ...nternal/autobahn/consensus/persist/globalblocks.go | 69.30% <69.30%> (ø) |

... and 2 files with indirect coverage changes


- Fix gofmt: indentation in runPruning, extra blank line after NewDataWAL
- Fix staticcheck S1016: use type conversion in loadAllGlobalBlocks
- Use errors.Join in DataWAL.Close() so both WALs are always closed
- Remove redundant LoadedGlobalCommitQC struct; LoadedQCs() now returns
  []*types.FullCommitQC directly since First is derivable from the QC
- Inline nextBlock advancement in NewState recovery to avoid recording
  stale receive-latency metrics on restart
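
The `errors.Join` point can be sketched as follows. The `dataWAL` struct and its field names are assumptions standing in for the PR's `DataWAL`, and `fakeCloser` plus `errBoom` exist only so the behaviour can be exercised. `errors.Join` is in the standard library from Go 1.20 and returns nil when every argument is nil.

```go
package main

import (
	"errors"
	"io"
)

// dataWAL is a simplified stand-in for the PR's DataWAL; field names are assumed.
type dataWAL struct {
	blocks    io.Closer
	commitQCs io.Closer
}

// Close closes both WALs and reports every failure. Both Close calls are
// evaluated before errors.Join runs, so an error from the first WAL
// cannot prevent the second from being closed.
func (w *dataWAL) Close() error {
	return errors.Join(w.blocks.Close(), w.commitQCs.Close())
}

// errBoom is a hypothetical failure used only for illustration.
var errBoom = errors.New("close failed")

// fakeCloser records whether Close was called and returns a preset error.
type fakeCloser struct {
	err    error
	closed bool
}

func (c *fakeCloser) Close() error {
	c.closed = true
	return c.err
}
```

An early-return style (`if err := a.Close(); err != nil { return err }`) would skip the second close on failure, which is the leak this pattern avoids.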

Made-with: Cursor
wen-coding changed the title from "persist FullCommitQCs in data layer via WAL" to "persist blocks and FullCommitQCs in data layer via WAL" on Mar 27, 2026