Conversation
DeviaVir commented on Feb 9, 2026
- Backfill missing spent UTXO script hashes from the node/esplora during block indexing to avoid hard failures on partial DB corruption.
- Repair small gaps in hashesv2 at startup by fetching missing block headers and persisting them; fail only when repair is not possible.
- Expand reorg undo handling to a multi-block buffer and align memory/db store behavior; add tests for gap detection and multi-block reorgs.
Pull request overview
This PR improves resilience of the indexing/storage pipeline by repairing certain on-disk inconsistencies (missing spent UTXO script hashes and small header/hash CF gaps) and by expanding reorg undo handling to support multi-block reorgs.
Changes:
- Backfill missing spent UTXO script hashes during block indexing by querying the node/esplora, rather than hard-failing on partial UTXO DB corruption.
- Add startup detection and optional repair of small gaps in the stored block hash/timestamp column family, persisting repaired entries.
- Expand reorg undo handling from single-block to a bounded multi-block in-memory buffer, with new tests for multi-block reorg behavior and gap detection; a sketch of such a buffer follows this list.
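That bounded multi-block undo buffer is essentially a deque of per-block undo records capped at a maximum depth. A minimal sketch of the shape, where all names (`UndoBuffer`, `UndoEntry`, `MAX_REORG_DEPTH`) are illustrative assumptions rather than the PR's actual identifiers:

```rust
use std::collections::VecDeque;

// Illustrative bound; the PR defines its own max undo depth in src/store/mod.rs.
const MAX_REORG_DEPTH: usize = 8;

/// Per-block undo data: what must be restored if this block is disconnected.
struct UndoEntry {
    height: u32,
    spent_utxos: Vec<(String, u64)>, // placeholder for (script hash, value)
}

/// Bounded buffer of undo data for the most recently connected blocks.
#[derive(Default)]
struct UndoBuffer {
    entries: VecDeque<UndoEntry>,
}

impl UndoBuffer {
    /// Record undo data for a newly connected block, evicting the oldest
    /// entry once the bound is exceeded (deeper reorgs require a reindex).
    fn push(&mut self, entry: UndoEntry) {
        self.entries.push_back(entry);
        while self.entries.len() > MAX_REORG_DEPTH {
            self.entries.pop_front();
        }
    }

    /// Take the undo data for the current tip, if it is still buffered.
    fn pop_tip(&mut self, height: u32) -> Option<UndoEntry> {
        if self.entries.back().map(|e| e.height) == Some(height) {
            self.entries.pop_back()
        } else {
            None
        }
    }
}
```

On each connected block the indexer would push an entry; on reorg it pops entries back to the fork point, falling back to a reindex when the fork is deeper than the buffer.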
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Summary per file:
| File | Description |
|---|---|
| src/threads/blocks.rs | Adds spent-UTXO resolution/backfill to produce SpentUtxo entries with script hashes before store update. |
| src/store/mod.rs | Introduces SpentUtxo, adds put_hash_ts to the Store trait, and defines a max reorg undo buffer depth. |
| src/store/memory.rs | Implements multi-block reorg undo buffering and updates update()/reorg() to use SpentUtxo. |
| src/store/db.rs | Implements multi-block reorg undo buffering, updates update()/reorg() to use SpentUtxo, and adds hash/timestamp delete/write helpers + tests. |
| src/server/preload.rs | Adds hashes CF gap detection and optional repair on startup (with a new test). |
| src/server/mod.rs | Updates server startup to provide a Client to header preloading/repair. |
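For the spent-UTXO backfill in src/threads/blocks.rs, the core idea is a local lookup with a network fallback. A hedged sketch under assumed names (none of the types or signatures below are the PR's real API):

```rust
// All types and names here are illustrative stand-ins.
struct OutPoint { txid: [u8; 32], vout: u32 }
struct SpentUtxo { outpoint: OutPoint, script_hash: [u8; 32] }

async fn resolve_spent_utxo<F>(
    db_hit: Option<[u8; 32]>, // script hash from the UTXO DB, if the entry is intact
    fetch_from_node: F,       // fallback lookup via the node/esplora
    outpoint: OutPoint,
) -> Result<SpentUtxo, String>
where
    F: std::future::Future<Output = Result<[u8; 32], String>>,
{
    let script_hash = match db_hit {
        // Fast path: the UTXO DB entry is present.
        Some(h) => h,
        // Backfill path: the entry is missing (partial DB corruption), so
        // recover the script hash from the network instead of hard-failing.
        None => fetch_from_node.await?,
    };
    Ok(SpentUtxo { outpoint, script_hash })
}
```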
In src/server/preload.rs (the hunk the comments below refer to):

```diff
 let mut blocks_hash_ts = state.blocks_hash_ts.lock().await;
 let mut i = 0usize;
-for meta in state.store.iter_hash_ts() {
-    assert_eq!(i as u32, meta.height());
+let metas: Vec<BlockMeta> = state.store.iter_hash_ts().collect();
+for meta in metas {
+    if i as u32 != meta.height() {
+        let gap_start = i as u32;
+        let gap_end = meta.height();
+        let gap_len = gap_end.saturating_sub(gap_start);
+
+        if gap_len == 0 {
+            return Err(Error::DBCorrupted(format!(
+                "hashes CF out-of-order entry at height {}, reindex required",
+                meta.height()
+            )));
+        }
+
+        let client = client.ok_or_else(|| {
+            Error::DBCorrupted(format!(
+                "hashes CF gap detected: expected height {}, found {}. \
+                 DB is inconsistent; reindex required",
+                i,
+                meta.height()
+            ))
+        })?;
+
+        if gap_len > MAX_HASH_GAP_REPAIR {
+            return Err(Error::DBCorrupted(format!(
+                "hashes CF gap too large to repair ({} blocks from {} to {}), reindex required",
+                gap_len,
+                gap_start,
+                gap_end - 1
+            )));
+        }
+
+        log::warn!(
+            "hashes CF gap detected ({} blocks from {} to {}), attempting repair",
+            gap_len,
+            gap_start,
+            gap_end - 1
+        );
+
+        for height in gap_start..gap_end {
+            let hash = client
+                .block_hash(height)
+                .await
+                .map_err(|e| Error::DBCorrupted(format!("failed to fetch block hash: {e}")))?
+                .ok_or_else(|| {
+                    Error::DBCorrupted(format!(
+                        "missing block hash at height {height} while repairing hashes CF"
+                    ))
+                })?;
+            let header = client
+                .block_header(hash, family)
+                .await
+                .map_err(|e| {
+                    Error::DBCorrupted(format!(
+                        "failed to fetch block header for {hash}: {e}"
+                    ))
+                })?;
+            let repaired = BlockMeta::new(height, hash, header.time());
+            state
+                .store
+                .put_hash_ts(&repaired)
+                .map_err(|e| Error::DBCorrupted(format!("failed to write hash meta: {e}")))?;
+            blocks_hash_ts.push((repaired.hash(), repaired.timestamp()));
+        }
+        i = gap_end as usize;
```
blocks_hash_ts (a tokio::Mutex) is locked for the entire duration of header gap repair, but the repair loop performs multiple .await network calls (block_hash/block_header). Holding an async mutex guard across awaits blocks other tasks and can deadlock if any awaited path tries to read or update blocks_hash_ts indirectly. Refactor so that no await runs while the mutex is held: perform the repairs and build a local list of metas first, then take the lock briefly to extend or replace the vector.
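One way to apply that refactor, reusing the names from the diff above; this is a sketch of the suggested shape, not the PR's actual fix:

```rust
// Phase 1: all network awaits happen before any lock is taken.
let mut repaired: Vec<BlockMeta> = Vec::new();
for height in gap_start..gap_end {
    let hash = client
        .block_hash(height)
        .await
        .map_err(|e| Error::DBCorrupted(format!("failed to fetch block hash: {e}")))?
        .ok_or_else(|| {
            Error::DBCorrupted(format!(
                "missing block hash at height {height} while repairing hashes CF"
            ))
        })?;
    let header = client
        .block_header(hash, family)
        .await
        .map_err(|e| Error::DBCorrupted(format!("failed to fetch block header for {hash}: {e}")))?;
    repaired.push(BlockMeta::new(height, hash, header.time()));
}

// Phase 2: persist and publish under a short-lived lock; the guard is
// never held across a network await.
let mut blocks_hash_ts = state.blocks_hash_ts.lock().await;
for meta in &repaired {
    state
        .store
        .put_hash_ts(meta)
        .map_err(|e| Error::DBCorrupted(format!("failed to write hash meta: {e}")))?;
    blocks_hash_ts.push((meta.hash(), meta.timestamp()));
}
drop(blocks_hash_ts);
```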
```rust
let metas: Vec<BlockMeta> = state.store.iter_hash_ts().collect();
for meta in metas {
```
let metas: Vec<BlockMeta> = state.store.iter_hash_ts().collect(); loads the entire hashes CF into memory at startup. On mainnet-sized chains this can be tens of MB and increases startup latency/peak RSS. Consider iterating in a streaming fashion (or repairing gaps in a separate pass) to avoid collecting all metas at once.
Suggested change:

```diff
-let metas: Vec<BlockMeta> = state.store.iter_hash_ts().collect();
-for meta in metas {
+for meta in state.store.iter_hash_ts() {
```
In src/store/db.rs, reorg():

```rust
    .unwrap(); // TODO handle unwrap;

self.delete_hash_ts_batch(&mut batch, reorg_data.height);

self.write(batch).unwrap(); // TODO handle unwrap;
```
reorg() still uses unwrap() on RocksDB operations (insert_utxos(...).unwrap() and self.write(batch).unwrap()). Since other reorg cleanup paths were updated to panic with a contextual message, these bare unwraps obscure the root cause and make failures harder to diagnose. Prefer consistent error handling here as well: propagate a Result, or use unwrap_or_else with an error_panic! message that includes the height.
Suggested change:

```diff
-    .unwrap(); // TODO handle unwrap;
+    .unwrap_or_else(|e| {
+        error_panic!(
+            "failed to restore reorg-spent utxos at height {}: {e}",
+            reorg_data.height
+        )
+    });
 self.delete_hash_ts_batch(&mut batch, reorg_data.height);
-self.write(batch).unwrap(); // TODO handle unwrap;
+self.write(batch).unwrap_or_else(|e| {
+    error_panic!(
+        "failed to write reorg batch at height {}: {e}",
+        reorg_data.height
+    )
+});
```
