fix(dash-spv): preserve in-progress sync state on peer disconnect #746
xdustinface wants to merge 2 commits into v0.42-dev from
Conversation
`BlocksManager` and `FiltersManager` wiped their pipelines on every disconnect, which raced with the previous cycle's `BlockProcessed` events still in flight and left `pending_blocks` stuck above zero, so `SyncComplete` never fired.

Replace the trait's `clear_in_flight_state` with a single `on_disconnect` hook. Each manager only invalidates state tied to the dead peer. `BlocksManager` and `FiltersManager` requeue their in-flight `getdata` / `getcfilters` slots so the next `send_pending` reissues them on the new peer, and `start_sync` resumes the preserved batches. The wallet-rescan path in `FiltersManager::tick` calls a manager-internal `reset_for_rescan` for the case that genuinely wants a full wipe.
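A minimal sketch of the hook shape described above. The type and field names (`DownloadPipeline`, `pending`, `in_flight`, `Height`) are simplified assumptions for illustration, not the crate's actual API:

```rust
// Illustrative sketch only: simplified, assumed names, not the dash-spv API.
use std::collections::{HashMap, VecDeque};

type Height = u32;

/// Replaces the old `clear_in_flight_state`: keep durable sync progress,
/// invalidate only state tied to the disconnected peer.
trait SyncManager {
    fn on_disconnect(&mut self);
}

struct DownloadPipeline {
    pending: VecDeque<Height>,      // batches not yet requested
    in_flight: HashMap<Height, ()>, // batches requested from the now-dead peer
}

impl DownloadPipeline {
    /// Move every in-flight batch back to the front of the pending queue so
    /// the next `send_pending` reissues it to the new peer.
    fn requeue_in_flight(&mut self) {
        let drained: Vec<Height> = self.in_flight.drain().map(|(h, _)| h).collect();
        for start_height in drained {
            self.pending.push_front(start_height);
        }
    }
}

struct BlocksManager {
    pipeline: DownloadPipeline,
}

impl SyncManager for BlocksManager {
    fn on_disconnect(&mut self) {
        // Preserve the batch plan; only the peer-bound `getdata` slots move
        // back to pending, so `start_sync` can resume where it left off.
        self.pipeline.requeue_in_flight();
    }
}
```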
📝 Walkthrough

The PR renames the SyncManager disconnect hook from `clear_in_flight_state` to `on_disconnect` and refactors the disconnect lifecycle so in-progress sync state is preserved.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 5 passed
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

@@ Coverage Diff @@
## v0.42-dev #746 +/- ##
=============================================
+ Coverage 71.35% 71.98% +0.63%
=============================================
Files 320 320
Lines 69370 69653 +283
=============================================
+ Hits 49497 50138 +641
+ Misses 19873 19515 -358
Actionable comments posted: 1
🧹 Nitpick comments (3)
dash-spv/src/sync/download_coordinator.rs (1)
95-110: 💤 Low value: LGTM, the `requeue_in_flight` implementation is correct.

One cosmetic note: `items.into_iter().rev()` reverses a collection whose order is already arbitrary (from `HashMap::drain`), so the reversal has no meaningful effect on relative ordering among the requeued items. The tests correctly handle this by sorting the requeued slice before asserting content. If the intent was "push items to the front in the order they were originally enqueued," that ordering is not recoverable from the `in_flight` HashMap; the current code is functionally equivalent to iterating forward. This is a no-op cleanup at best.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dash-spv/src/sync/download_coordinator.rs` around lines 95-110: the call to `items.into_iter().rev()` in `requeue_in_flight` is pointless because `items` comes from draining a HashMap (`in_flight`) and has arbitrary order; remove the `.rev()` so you simply iterate `items.into_iter()` and `push_front` each into `pending`, or if you actually need to restore original enqueue order, change `in_flight` from a HashMap to an ordered container (e.g., IndexMap or track a VecDeque of keys) and then requeue using that preserved order.
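A sketch of the suggested cleanup, assuming the coordinator keeps `pending` in a `VecDeque` and `in_flight` in a `HashMap` as the comment describes; the field and type names are placeholders rather than the real ones:

```rust
use std::collections::{HashMap, VecDeque};

struct DownloadCoordinator {
    pending: VecDeque<u32>,
    in_flight: HashMap<u32, ()>,
}

impl DownloadCoordinator {
    /// Requeue every in-flight batch. `HashMap::drain` yields items in an
    /// arbitrary order, so reversing the drained vector changes nothing
    /// meaningful; iterating forward is equivalent and simpler.
    fn requeue_in_flight(&mut self) {
        let items: Vec<u32> = self.in_flight.drain().map(|(k, _)| k).collect();
        for key in items {
            self.pending.push_front(key); // no `.rev()` needed
        }
    }
}
```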
dash-spv/src/sync/blocks/sync_manager.rs (1)

35-53: ⚡ Quick win: Resume path doesn't immediately reissue requeued `getdata`; it relies on the next tick.

`on_disconnect` requeues in-flight slots to `pending`, but `start_sync`'s non-empty-pipeline branch only flips state to `Syncing` without calling `self.send_pending(_requests)`. Requeued `getdata`s sit idle until the 100ms tick fires. Compare with `FiltersManager::start_sync` (filters/sync_manager.rs lines 60-63), which immediately calls `filter_pipeline.send_pending(...)` on resume.

The delay is small, but the asymmetry between the two managers is easy to forget when debugging. Consider mirroring the filter-side pattern:
Proposed change
```diff
- // If in-progress block work survived a disconnect, resume in Syncing so
- // `process_buffered_blocks` can transition to Synced once the pipeline
- // drains. Otherwise wait for BlocksNeeded / FiltersSyncComplete events.
- if !self.pipeline.is_complete() {
-     self.set_state(SyncState::Syncing);
- } else {
-     self.set_state(SyncState::WaitForEvents);
- }
- Ok(vec![])
+ // If in-progress block work survived a disconnect, resume in Syncing so
+ // `process_buffered_blocks` can transition to Synced once the pipeline
+ // drains. Otherwise wait for BlocksNeeded / FiltersSyncComplete events.
+ if !self.pipeline.is_complete() {
+     self.set_state(SyncState::Syncing);
+     // Reissue any requeued in-flight `getdata`s to the new peer immediately
+     // instead of waiting for the next tick.
+     if self.pipeline.has_pending_requests() {
+         self.send_pending(_requests).await?;
+     }
+ } else {
+     self.set_state(SyncState::WaitForEvents);
+ }
+ Ok(vec![])
```

Note: the `_requests` parameter would also need to be renamed to `requests` since it would now be used.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dash-spv/src/sync/blocks/sync_manager.rs` around lines 35-53: `start_sync` currently flips state to Syncing when `self.pipeline` is non-empty but does not reissue any requeued getdata, causing a delay until the next tick; update `sync_manager::start_sync` to accept and use the RequestSender (rename `_requests` to `requests`) and, in the branch where you call `self.set_state(SyncState::Syncing)`, immediately invoke the manager's pending-send routine (analogous to `FiltersManager::start_sync` calling `filter_pipeline.send_pending`): call the blocks pipeline send_pending method (e.g., `self.pipeline.send_pending` or `self.send_pending(requests)`) right after setting the state so requeued in-flight slots from `on_disconnect` are resent without waiting for the tick.
dash-spv/src/sync/sync_manager.rs (1)

116-134: 💤 Low value: Trait contract clarification looks good; lifecycle ordering is correct.

`stop_sync` sets state to `WaitingForConnections` first, then calls `on_disconnect`, so any state-aware preservation logic in implementations sees the post-stop state. The new doc clearly distinguishes the "preserve durable, requeue peer-bound" intent. Note that this is a breaking trait change (no default impl), so any out-of-tree `SyncManager` implementations will need to be updated; worth calling out in release notes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dash-spv/src/sync/sync_manager.rs` around lines 116-134: the new trait method `on_disconnect` currently has no default implementation, which is a breaking change; to avoid breaking out-of-tree `SyncManager` implementors, add a default no-op impl for `on_disconnect` in the SyncManager trait (so existing impls continue to compile) and keep `stop_sync` calling `self.on_disconnect()`; also update the release notes to mention the lifecycle ordering (`stop_sync` sets WaitingForConnections then calls `on_disconnect`) and the intended preservation/requeue semantics.
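A sketch of the non-breaking default suggested above; `StatelessManager` is a hypothetical implementor, not a type from the crate:

```rust
trait SyncManager {
    /// Called after `stop_sync` has already moved the manager to
    /// `WaitingForConnections`. Defaulting to a no-op keeps existing
    /// out-of-tree implementations compiling; managers with peer-bound
    /// work override it to requeue their in-flight requests.
    fn on_disconnect(&mut self) {}
}

// A manager with nothing peer-bound to invalidate can rely on the default.
struct StatelessManager;

impl SyncManager for StatelessManager {}
```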
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@dash-spv/src/sync/filters/pipeline.rs`:
- Around line 255-262: The requeue flow can leave a batch key in
coordinator.pending without a tracker if late filter messages complete the batch
in receive_with_data; add a method on DownloadCoordinator (e.g.,
cancel_pending(&mut self, batch_start: HeightType)) that removes a key from
pending, and update receive_with_data so that after calling
coordinator.receive(&batch_start) if it returns false you invoke
coordinator.cancel_pending(batch_start) and also remove the tracker from
batch_trackers to keep coordinator.pending and batch_trackers consistent (so
send_pending no longer errors with SyncError::InvalidState).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fdedfab9-db3b-4890-8346-6d64c9f9f9b3
📒 Files selected for processing (15)
- dash-spv/src/sync/block_headers/sync_manager.rs
- dash-spv/src/sync/blocks/manager.rs
- dash-spv/src/sync/blocks/pipeline.rs
- dash-spv/src/sync/blocks/sync_manager.rs
- dash-spv/src/sync/chainlock/sync_manager.rs
- dash-spv/src/sync/download_coordinator.rs
- dash-spv/src/sync/filter_headers/manager.rs
- dash-spv/src/sync/filter_headers/sync_manager.rs
- dash-spv/src/sync/filters/manager.rs
- dash-spv/src/sync/filters/pipeline.rs
- dash-spv/src/sync/filters/sync_manager.rs
- dash-spv/src/sync/instantsend/sync_manager.rs
- dash-spv/src/sync/masternodes/sync_manager.rs
- dash-spv/src/sync/mempool/sync_manager.rs
- dash-spv/src/sync/sync_manager.rs
…equeued batch

After `requeue_in_flight()` moves a batch's `start_height` from `coordinator.in_flight` back to `coordinator.pending`, a buffered `CFilter` from the disconnected peer can still reach `receive_with_data` and complete the batch. The tracker is removed but `coordinator.receive` returns `false` (the key is in `pending`, not `in_flight`), leaving the key orphaned in `pending`. The next `send_pending` pops it, finds no tracker, and aborts sync with `SyncError::InvalidState`.

Add `DownloadCoordinator::cancel_pending` and call it from `receive_with_data` when `receive` reports the key was not in flight.

Addresses CodeRabbit review comment on PR #746 (comment)
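A rough sketch of the shape of that fix, using simplified placeholder types; `complete_batch` stands in for the relevant part of `receive_with_data`, and the field names are assumptions:

```rust
use std::collections::{HashMap, VecDeque};

type HeightType = u32;

struct DownloadCoordinator {
    pending: VecDeque<HeightType>,
    in_flight: HashMap<HeightType, ()>,
}

impl DownloadCoordinator {
    /// Returns true only if the batch was actually in flight.
    fn receive(&mut self, batch_start: &HeightType) -> bool {
        self.in_flight.remove(batch_start).is_some()
    }

    /// Drop a requeued batch key so `send_pending` never pops a key that no
    /// longer has a tracker behind it.
    fn cancel_pending(&mut self, batch_start: HeightType) {
        self.pending.retain(|h| *h != batch_start);
    }
}

struct FilterPipeline {
    coordinator: DownloadCoordinator,
    batch_trackers: HashMap<HeightType, ()>, // per-batch progress trackers
}

impl FilterPipeline {
    /// Late filters from the disconnected peer may complete a batch that was
    /// already requeued; keep `pending` and `batch_trackers` consistent.
    fn complete_batch(&mut self, batch_start: HeightType) {
        if !self.coordinator.receive(&batch_start) {
            self.coordinator.cancel_pending(batch_start);
        }
        self.batch_trackers.remove(&batch_start);
    }
}
```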
@CodeRabbit review

✅ Actions performed: Review triggered.