Per-export cooldown compaction gating #65
Merged
Defer dead-ratio compaction of a chunk until it has been idle (unwritten) for `compaction_cooldown` flush cycles.

On overwrite-heavy DB volumes a hot chunk repeatedly crosses the dead-ratio threshold and gets recompacted over and over, rewriting its live blocks to S3 each time: pure PUT-byte write amplification. Trace-replay (compaction_sim) shows the gate cuts S3 PUT bytes ~22% (Postgres) / ~30% (SQLite) at 128 KiB blocks, ~0 elsewhere.

- New per-export knob `compaction_cooldown` (Option<u64>, default 0 = disabled) on ExportConfig + the PUT /api/exports DTO; persists via the existing export.json save/discover and is preserved across resize.
- compact_if_needed gains `cooldown` + a per-chunk idle-age map and gates ONLY the dead-ratio branch (also skips its pack-index fetches for hot chunks). The pack-count cap stays an unconditional greedy backstop, so a hot chunk's packs can never grow without bound.
- The flush scheduler owns an ephemeral per-chunk idle-age map, advanced each cycle from FlushStats.touched_chunks (reset written chunks to 0, prune singletons). Ephemeral by design: a restart just defers reclaim briefly, bounded by the pack cap, and never causes an incorrect compaction.

Also resolves the write-trace hook TODO in create_export (GLIDEFS_WRITE_TRACE_DIR), the production counterpart that makes this analysis reproducible on real traffic.

Tests (integration, in-tree harness + cold-reader data-integrity checks): defer-hot / compact-cold gate, pack-cap-overrides-cooldown backstop, FlushStats.touched_chunks reporting, and cooldown persistence across save/discover.

ExportConfig literal updates across existing fixtures.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
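The per-cycle bookkeeping for the idle-age map can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ChunkId` and `advance_idle_ages` are hypothetical names, `touched` stands in for `FlushStats.touched_chunks`, and the "prune singletons" step is left out because the text does not define it.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical chunk identifier; the real type is not shown in this PR text.
type ChunkId = u64;

/// Advance the per-chunk idle-age map by one flush cycle.
///
/// `touched` plays the role of `FlushStats.touched_chunks`: the set of
/// chunks written during the cycle that just finished.
fn advance_idle_ages(ages: &mut HashMap<ChunkId, u64>, touched: &HashSet<ChunkId>) {
    // Every tracked chunk has been idle for one more cycle.
    for age in ages.values_mut() {
        *age = age.saturating_add(1);
    }
    // Chunks written this cycle are hot again: reset (or start tracking) at 0.
    for &chunk in touched {
        ages.insert(chunk, 0);
    }
}
```

Because the map lives only in memory, a restart begins with an empty map. In this sketch a chunk only starts ageing after its next write, which matches the description of a restart merely deferring reclaim.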
What
Adds a per-export, default-OFF cooldown knob that defers dead-ratio compaction of a chunk until it has been idle (unwritten) for `compaction_cooldown` flush cycles.
On overwrite-heavy DB volumes a hot chunk repeatedly crosses the dead-ratio threshold and gets recompacted over and over, rewriting its live blocks to S3 each time: pure PUT-byte write amplification. Trace-replay (`compaction_sim`) showed that a simple cooldown age-gate cuts S3 PUT bytes ~22% (Postgres) / ~30% (SQLite) at 128 KiB blocks, ~0 elsewhere, and that the fuller F2FS `(1-u)/(2u)·age` victim score converges to the same result, so we ship the simpler gate.
How
- `compaction_cooldown: Option<u64>` (0/unset = disabled) on `ExportConfig` + the `PUT /api/exports/{name}` DTO. Persists via the existing `export.json` save/discover and is preserved across `resize_export`.
- `compact_if_needed` gains `cooldown` + a per-chunk idle-age map; gates only the dead-ratio candidate branch (also skips its pack-index fetches for hot chunks). The pack-count cap (`> 16` packs) stays an unconditional greedy backstop, so a hot chunk's packs can never grow without bound.
- The flush scheduler owns an ephemeral per-chunk idle-age map, advanced each cycle from `FlushStats.touched_chunks` (increment all, reset written chunks to 0, prune singletons). Ephemeral by design: a daemon restart just defers reclaim briefly (bounded by the pack cap) and never causes an incorrect compaction.
- Resolves the write-trace hook TODO in `create_export` (`GLIDEFS_WRITE_TRACE_DIR`), the production counterpart that makes this analysis reproducible on real traffic.
Safety
Default OFF is byte-for-byte the current code path. A buggy age map can only defer compaction, bounded by the pack cap — never a durability or correctness issue.
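The ordering that makes the gate safe can be sketched as a small decision function. This is an illustration under stated assumptions, not the body of `compact_if_needed`: `decide`, `Decision`, `ChunkId`, the `>=` threshold comparison, and the `dead_ratio` closure (standing in for the pack-index fetches) are all hypothetical. Only the cap of 16 packs and the 0/unset = disabled rule come from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical chunk identifier.
type ChunkId = u64;

/// Pack-count cap from the description: more than 16 packs forces compaction.
const MAX_PACKS: usize = 16;

#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Compact,
    Defer,
    Skip,
}

fn decide(
    chunk: ChunkId,
    pack_count: usize,
    cooldown: Option<u64>,
    idle_ages: &HashMap<ChunkId, u64>,
    dead_ratio_threshold: f64,
    dead_ratio: impl FnOnce() -> f64, // assumed to be the expensive part
) -> Decision {
    // Backstop first: the pack cap is never gated by the cooldown.
    if pack_count > MAX_PACKS {
        return Decision::Compact;
    }
    // Cooldown gate: applies only when the knob is set and non-zero.
    if let Some(n) = cooldown.filter(|&n| n > 0) {
        // A chunk with no recorded age is treated as hot (assumption).
        let cold = idle_ages.get(&chunk).map_or(false, |&age| age >= n);
        if !cold {
            // Hot chunk: defer without evaluating the dead ratio at all.
            return Decision::Defer;
        }
    }
    // Dead-ratio branch, reached only by cold chunks or with the gate off.
    if dead_ratio() >= dead_ratio_threshold {
        Decision::Compact
    } else {
        Decision::Skip
    }
}
```

With `cooldown` unset or 0 the middle block is skipped entirely, so the function reduces to the ungated pack-cap and dead-ratio checks. A wrong or missing age can only turn `Compact` into `Defer`, and only while the chunk stays at or under the pack cap.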
Tests
Integration tests (in-tree harness, in-memory object store, cold-reader data-integrity verification):
- `test_cooldown_compaction_defers_hot_chunk`: unknown/age-7 defer, age-8 compacts; latest data survives.
- `test_cooldown_pack_cap_overrides_hot_chunk`: pack cap fires on a hot chunk regardless of cooldown.
- `test_flush_reports_touched_chunks`: `FlushStats.touched_chunks` correctness.
- `test_compaction_cooldown_persists_across_discover`: save→discover round-trip.
`cargo build` + `cargo clippy` clean; existing compaction/GC/flush tests pass.
🤖 Generated with Claude Code