
db/snapshotsync/freezeblocks: run block-snapshot merge off the shared build semaphore #21526

Draft
sudeepdino008 wants to merge 1 commit into main from sudeep/block-merge-off-build-semaphore

Conversation

@sudeepdino008
Member

Problem

Block retirement holds the shared snapshot-build semaphore (snBuildAllowed, default size 1) across both the fast dump and the slow, expensive block-segment merge. While a large block merge runs, state-snapshot collation/prune (which bounds chaindata growth) is blocked for the merge's full duration.

Observed on a minimal node: an 8.2 GB / ~19-minute 025100→025200 transactions merge held the semaphore the whole time, starving state collation — stepsInDB climbed to 3.82 (chaindata bloats while collation waits).

Change

Release the semaphore after the dump phase and run the merge in its own goroutine, off the semaphore — mirroring how the aggregator runs MergeLoop. The fast dump stays serialized against state-snapshot building (preserving the I/O-throttle intent); the slow merge no longer starves state collation.
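The control flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `buildSem`, `retire`, and `collationRunsDuringMerge` are hypothetical names, and a buffered channel stands in for the real `snBuildAllowed` semaphore. The dump runs while the semaphore is held, the semaphore is released, and only then does the merge start in its own goroutine.

```go
package main

import "fmt"

// buildSem is a hypothetical stand-in for the shared snapshot-build
// semaphore (size 1), modeled here as a buffered channel.
type buildSem chan struct{}

func (s buildSem) acquire() { s <- struct{}{} }
func (s buildSem) release() { <-s }
func (s buildSem) tryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

// retire runs dump while holding the semaphore, releases it, and then runs
// merge in its own goroutine. The returned channel is closed once the merge
// has finished.
func retire(sem buildSem, dump, merge func()) <-chan struct{} {
	sem.acquire()
	dump() // fast phase: stays serialized against state-snapshot building
	sem.release()

	done := make(chan struct{})
	go func() {
		defer close(done)
		merge() // slow phase: runs off the semaphore
	}()
	return done
}

// collationRunsDuringMerge reports whether a second user of the semaphore
// (standing in for state collation) can take it while the merge is in flight.
func collationRunsDuringMerge() bool {
	sem := make(buildSem, 1)
	mergeStarted := make(chan struct{})
	collated := make(chan struct{})

	mergeDone := retire(sem,
		func() {}, // dump: nothing to do in this sketch
		func() {
			close(mergeStarted)
			<-collated // the merge stays "in flight" until collation has run
		},
	)

	<-mergeStarted
	ok := sem.tryAcquire() // state collation takes the semaphore mid-merge
	if ok {
		sem.release()
	}
	close(collated)
	<-mergeDone
	return ok
}

func main() {
	fmt.Println("collation ran during merge:", collationRunsDuringMerge())
}
```

Under these assumptions the program reports `true`. If `retire` kept the semaphore for the merge as well, `tryAcquire` would fail and the result would be `false`.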

Effect

With the same 8.2 GB block-tx merge in flight, state collation now proceeds concurrently — stepsInDB stays < 2 (was 3.82). The CLI retire path calls MergeBlocks explicitly so its behavior is unchanged.

Tests

  • TestBlockMergeRunsWithoutSemaphore — merge completes while the semaphore is fully held (regression-guards re-acquisition).
  • TestRetireBlocksInBackgroundReleasesSemaphore — background retire releases the semaphore, no leak.
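The shape of these two checks might look like the sketch below. The actual test bodies are not shown on this page, so everything here is an assumption: `retirer`, `mergeBlocks`, and `retireInBackground` are hypothetical stand-ins for the real types, and a buffered channel again models the semaphore. The first check holds the semaphore and expects the merge to finish anyway; the second expects the semaphore to be free once a background retire has completed.

```go
package main

import (
	"fmt"
	"time"
)

// buildSem is a hypothetical channel-based stand-in for snBuildAllowed.
type buildSem chan struct{}

func (s buildSem) release() { <-s }
func (s buildSem) tryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

// retirer is a hypothetical stand-in for the block retirement component.
type retirer struct{ sem buildSem }

// mergeBlocks must not touch r.sem. If it ever re-acquired the semaphore,
// mergeRunsWithoutSemaphore below would time out.
func (r *retirer) mergeBlocks() {}

// retireInBackground dumps under the semaphore, releases it, then merges.
func (r *retirer) retireInBackground() <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		if !r.sem.tryAcquire() {
			return // another build holds the semaphore; skip this round
		}
		// ... dump phase would run here ...
		r.sem.release()
		r.mergeBlocks()
	}()
	return done
}

// mergeRunsWithoutSemaphore holds the semaphore and checks that the merge
// still completes within a generous timeout.
func mergeRunsWithoutSemaphore() bool {
	r := &retirer{sem: make(buildSem, 1)}
	r.sem.tryAcquire() // semaphore is now fully held by "someone else"

	done := make(chan struct{})
	go func() {
		defer close(done)
		r.mergeBlocks()
	}()
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

// retireReleasesSemaphore checks that nothing is left holding the semaphore
// after a background retire finishes.
func retireReleasesSemaphore() bool {
	r := &retirer{sem: make(buildSem, 1)}
	<-r.retireInBackground()
	return r.sem.tryAcquire()
}

func main() {
	fmt.Println("merge runs without semaphore:", mergeRunsWithoutSemaphore())
	fmt.Println("retire releases semaphore:", retireReleasesSemaphore())
}
```

A real test would more likely use the `testing` package and the project's own semaphore type. The plain functions here only keep the sketch runnable as a standalone program.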

Draft — full make lint + integration verification pending.

… build semaphore

Block retire held the shared snapshot-build semaphore across both the dump and
the slow, expensive merge, blocking state-snapshot collation/prune for the
merge's full duration. Release the semaphore after the dump and run the merge
in its own goroutine (mirroring the aggregator's MergeLoop). The fast dump
stays serialized against state building; the slow merge no longer starves
state collation.