feat(db): Repair missing spent UTXOs and hashes CF gaps; deepen reorg undo buffer #5

Open
DeviaVir wants to merge 1 commit into master from repair

Conversation

DeviaVir commented on Feb 9, 2026

  • Backfill missing spent UTXO script hashes from the node/esplora during block indexing to avoid hard failures on partial DB corruption.
  • Repair small gaps in hashesv2 at startup by fetching missing block headers and persisting them; fail only when repair is not possible.
  • Expand reorg undo handling to a multi‑block buffer and align memory/db store behavior; add tests for gap detection and multi‑block reorgs.
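
The gap handling described in the second bullet can be illustrated with a small sketch. Here `MAX_HASH_GAP_REPAIR` mirrors the bound the PR uses, but `find_repairable_gap` is a hypothetical helper, not the PR's actual code: it scans stored heights, rejects out-of-order entries and oversized gaps, and returns the first repairable gap as a half-open range.

```rust
/// Hypothetical sketch of gap detection over stored block heights.
/// MAX_HASH_GAP_REPAIR mirrors the bound described above; the helper
/// name and signature are assumptions for illustration.
const MAX_HASH_GAP_REPAIR: u32 = 100;

fn find_repairable_gap(heights: &[u32]) -> Result<Option<(u32, u32)>, String> {
    let mut expected = 0u32;
    for &h in heights {
        if h != expected {
            let len = h.saturating_sub(expected);
            if len == 0 {
                // h < expected: duplicate or out-of-order entry.
                return Err(format!("out-of-order entry at height {h}"));
            }
            if len > MAX_HASH_GAP_REPAIR {
                return Err(format!("gap too large to repair ({len} blocks)"));
            }
            // Half-open range [start, end) of missing heights.
            return Ok(Some((expected, h)));
        }
        expected = h + 1;
    }
    Ok(None)
}

fn main() {
    assert_eq!(find_repairable_gap(&[0, 1, 2, 3]), Ok(None));
    assert_eq!(find_repairable_gap(&[0, 1, 4, 5]), Ok(Some((2, 4))));
    assert!(find_repairable_gap(&[0, 0, 1]).is_err());
}
```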

Copilot AI left a comment

Pull request overview

This PR improves resilience of the indexing/storage pipeline by repairing certain on-disk inconsistencies (missing spent UTXO script hashes and small header/hash CF gaps) and by expanding reorg undo handling to support multi-block reorgs.

Changes:

  • Backfill missing spent UTXO script hashes during block indexing by querying the node/esplora, rather than hard-failing on partial UTXO DB corruption.
  • Add startup detection + optional repair of small gaps in the stored block hash/timestamp column family, persisting repaired entries.
  • Expand reorg undo handling from single-block to a bounded multi-block in-memory buffer, with new tests for multi-block reorg behavior and gap detection.
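
The bounded multi-block undo buffer in the last bullet can be sketched with a `VecDeque` that evicts its oldest entry once the depth limit is reached. `UndoEntry` and `MAX_REORG_DEPTH` are assumed names for illustration, not the PR's actual identifiers:

```rust
use std::collections::VecDeque;

// Hypothetical sketch of a bounded multi-block reorg undo buffer;
// names and the depth value are assumptions, not the PR's code.
const MAX_REORG_DEPTH: usize = 8;

#[derive(Debug, PartialEq)]
struct UndoEntry {
    height: u32,
    // ...spent utxos, touched script hashes, etc.
}

struct UndoBuffer {
    entries: VecDeque<UndoEntry>,
}

impl UndoBuffer {
    fn new() -> Self {
        Self { entries: VecDeque::new() }
    }

    /// Push undo data for a newly indexed block, evicting the oldest
    /// entry once the buffer exceeds its bounded depth.
    fn push(&mut self, entry: UndoEntry) {
        self.entries.push_back(entry);
        if self.entries.len() > MAX_REORG_DEPTH {
            self.entries.pop_front();
        }
    }

    /// Pop the most recent block's undo data during a reorg.
    fn pop(&mut self) -> Option<UndoEntry> {
        self.entries.pop_back()
    }
}

fn main() {
    let mut buf = UndoBuffer::new();
    for h in 0..10 {
        buf.push(UndoEntry { height: h });
    }
    // Only the last MAX_REORG_DEPTH blocks remain undoable.
    assert_eq!(buf.entries.len(), MAX_REORG_DEPTH);
    assert_eq!(buf.pop().unwrap().height, 9);
}
```

The bound keeps memory constant while still covering reorgs deeper than one block; anything deeper than the buffer would still require a reindex.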

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Changed files:

  • src/threads/blocks.rs: Adds spent-UTXO resolution/backfill to produce SpentUtxo entries with script hashes before the store update.
  • src/store/mod.rs: Introduces SpentUtxo, adds put_hash_ts to the Store trait, and defines a maximum reorg undo buffer depth.
  • src/store/memory.rs: Implements multi-block reorg undo buffering and updates update()/reorg() to use SpentUtxo.
  • src/store/db.rs: Implements multi-block reorg undo buffering, updates update()/reorg() to use SpentUtxo, and adds hash/timestamp delete/write helpers plus tests.
  • src/server/preload.rs: Adds hashes CF gap detection and optional repair on startup (with a new test).
  • src/server/mod.rs: Updates server startup to provide a Client to header preloading/repair.


Comment on lines 18 to +84
  let mut blocks_hash_ts = state.blocks_hash_ts.lock().await;
  let mut i = 0usize;
- for meta in state.store.iter_hash_ts() {
-     assert_eq!(i as u32, meta.height());
+ let metas: Vec<BlockMeta> = state.store.iter_hash_ts().collect();
+ for meta in metas {
+     if i as u32 != meta.height() {
+         let gap_start = i as u32;
+         let gap_end = meta.height();
+         let gap_len = gap_end.saturating_sub(gap_start);
+
+         if gap_len == 0 {
+             return Err(Error::DBCorrupted(format!(
+                 "hashes CF out-of-order entry at height {}, reindex required",
+                 meta.height()
+             )));
+         }
+
+         let client = client.ok_or_else(|| {
+             Error::DBCorrupted(format!(
+                 "hashes CF gap detected: expected height {}, found {}. \
+                  DB is inconsistent; reindex required",
+                 i,
+                 meta.height()
+             ))
+         })?;
+
+         if gap_len > MAX_HASH_GAP_REPAIR {
+             return Err(Error::DBCorrupted(format!(
+                 "hashes CF gap too large to repair ({} blocks from {} to {}), reindex required",
+                 gap_len,
+                 gap_start,
+                 gap_end - 1
+             )));
+         }
+
+         log::warn!(
+             "hashes CF gap detected ({} blocks from {} to {}), attempting repair",
+             gap_len,
+             gap_start,
+             gap_end - 1
+         );
+
+         for height in gap_start..gap_end {
+             let hash = client
+                 .block_hash(height)
+                 .await
+                 .map_err(|e| Error::DBCorrupted(format!("failed to fetch block hash: {e}")))?
+                 .ok_or_else(|| {
+                     Error::DBCorrupted(format!(
+                         "missing block hash at height {height} while repairing hashes CF"
+                     ))
+                 })?;
+             let header = client
+                 .block_header(hash, family)
+                 .await
+                 .map_err(|e| {
+                     Error::DBCorrupted(format!(
+                         "failed to fetch block header for {hash}: {e}"
+                     ))
+                 })?;
+             let repaired = BlockMeta::new(height, hash, header.time());
+             state
+                 .store
+                 .put_hash_ts(&repaired)
+                 .map_err(|e| Error::DBCorrupted(format!("failed to write hash meta: {e}")))?;
+             blocks_hash_ts.push((repaired.hash(), repaired.timestamp()));
+         }
+         i = gap_end as usize;
Copilot AI commented on Feb 9, 2026

blocks_hash_ts (a tokio::Mutex) is locked for the entire duration of header gap repair, but the repair loop performs multiple .await network calls (block_hash/block_header). Holding an async mutex guard across awaits can block other tasks and can deadlock if any awaited path tries to read/update blocks_hash_ts indirectly. Refactor to avoid awaiting while the mutex is held (e.g., build a local list of metas to append + perform repairs first, then take the lock briefly to extend/replace the vector).
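
The refactor this comment suggests can be sketched with plain std primitives (`fetch_repaired_entries` is a hypothetical stand-in for the `block_hash`/`block_header` round-trips): do the slow work first with no guard held, then take the lock only for the cheap append.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for the slow network round-trips; the real code
// would await block_hash/block_header here, outside any lock.
fn fetch_repaired_entries() -> Vec<(u32, u64)> {
    vec![(100, 1_700_000_000), (101, 1_700_000_600)]
}

fn main() {
    let blocks_hash_ts: Mutex<Vec<(u32, u64)>> =
        Mutex::new(vec![(99, 1_699_999_400)]);

    // 1. Perform the slow repair work first, with no guard held.
    let repaired = fetch_repaired_entries();

    // 2. Take the lock only for the cheap in-memory append.
    {
        let mut guard = blocks_hash_ts.lock().unwrap();
        guard.extend(repaired);
    } // guard dropped immediately; no await happens while it is held

    assert_eq!(blocks_hash_ts.lock().unwrap().len(), 3);
}
```

With a `tokio::sync::Mutex` the shape is the same: keep the `.await` points outside the scope of the guard so other tasks touching `blocks_hash_ts` are never blocked on network latency.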

Comment on lines +20 to +21
let metas: Vec<BlockMeta> = state.store.iter_hash_ts().collect();
for meta in metas {
Copilot AI commented on Feb 9, 2026

let metas: Vec<BlockMeta> = state.store.iter_hash_ts().collect(); loads the entire hashes CF into memory at startup. On mainnet-sized chains this can be tens of MB and increases startup latency/peak RSS. Consider iterating in a streaming fashion (or repairing gaps in a separate pass) to avoid collecting all metas at once.

Suggested change:
- let metas: Vec<BlockMeta> = state.store.iter_hash_ts().collect();
- for meta in metas {
+ for meta in state.store.iter_hash_ts() {
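
The streaming alternative can also be split into the "separate pass" the comment mentions: one O(1)-memory pass records only the gap ranges, and repair runs afterwards over that (typically tiny) list. A minimal sketch, modeling the store iterator as any `Iterator<Item = u32>` and omitting the out-of-order check from the real code:

```rust
// Hypothetical two-pass sketch: pass 1 streams over heights without
// collecting them, recording only half-open gap ranges [start, end);
// pass 2 (not shown) would repair those ranges.
fn collect_gaps<I: Iterator<Item = u32>>(heights: I) -> Vec<(u32, u32)> {
    let mut gaps = Vec::new();
    let mut expected = 0u32;
    for h in heights {
        if h > expected {
            gaps.push((expected, h));
        }
        expected = h + 1;
    }
    gaps
}

fn main() {
    // Only the gap list is materialized, not the whole column family.
    let stored = [0u32, 1, 4, 5, 9].into_iter();
    assert_eq!(collect_gaps(stored), vec![(2, 4), (6, 9)]);
}
```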

Comment on lines 670 to 674
.unwrap(); // TODO handle unwrap;

self.delete_hash_ts_batch(&mut batch, reorg_data.height);

self.write(batch).unwrap(); // TODO handle unwrap;
Copilot AI commented on Feb 9, 2026

reorg() still uses unwrap() on RocksDB operations (insert_utxos(...).unwrap() and self.write(batch).unwrap()). Since other reorg cleanup paths were updated to include context and panic with a message, these unwraps can obscure the root cause and make failures harder to diagnose. Prefer consistent error handling here as well (propagate a Result, or unwrap_or_else with an error_panic! message that includes the height).

Suggested change:
-         .unwrap(); // TODO handle unwrap;
+         .unwrap_or_else(|e| {
+             error_panic!(
+                 "failed to restore reorg-spent utxos at height {}: {e}",
+                 reorg_data.height
+             )
+         });
          self.delete_hash_ts_batch(&mut batch, reorg_data.height);
-         self.write(batch).unwrap(); // TODO handle unwrap;
+         self.write(batch).unwrap_or_else(|e| {
+             error_panic!(
+                 "failed to write reorg batch at height {}: {e}",
+                 reorg_data.height
+             )
+         });

RCasatta commented on Feb 9, 2026

This PR is big, and it tries to fix different things at the same time (handling long reorgs and repairing a corrupted database), which makes it more complicated.
I don't want to complicate things too much before understanding the root cause; for example, I don't think a multi-block reorg actually happened.

[image: Screenshot From 2026-02-09 14-40-24]
