Clarify Paxos repair mechanisms: background vs coordinated #151
Open
…d repair

The documentation previously presented automatic Paxos repair and nodetool repair --paxos-only as interchangeable alternatives, causing confusion about whether scheduled paxos-only repairs are redundant when Cassandra 4.1+ automatic repairs are enabled. In reality these are two distinct mechanisms: the automatic background repair only completes uncommitted transactions, while coordinated repair (nodetool) also advances the low bound in system.paxos_repair_history, enabling system.paxos garbage collection with paxos_state_purging: repaired.

Changes:
- strategies.md: Rewrite Paxos Repairs section with clear two-mechanism distinction, comparison table, expanded paxos_state_purging descriptions, recommended configuration path, and config independence note
- options-reference.md: Fix incorrect guidance that --paxos-only is only needed for pre-4.1 clusters; add critical paxos_state_purging: repaired case
- concepts.md: Add numbered list distinguishing the two mechanisms and config independence note
- paxos.md: Add Paxos State Purging section, Commit Consistency Optimization section (serial vs commit CL, CL=ANY prerequisites), config independence callout, and expanded terminology reference
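As a rough illustration of the case the commit message singles out, a cassandra.yaml fragment might look like the sketch below. The setting names come from the description above; pairing them with Paxos v2 follows the PR summary, and the exact values should be checked against the Cassandra version in use. This is an assumed minimal configuration, not the one recommended in strategies.md.

```yaml
# Sketch only: the two settings are independent, as the PR notes.
# Assumes Cassandra 4.1+; verify names and values against your cassandra.yaml.
paxos_variant: v2

# With 'repaired', system.paxos is garbage collected only up to the low bound
# recorded in system.paxos_repair_history. Per this PR, that low bound is
# advanced by coordinated repair, not by the automatic background repair,
# so a regular coordinated run is still needed, for example:
#   nodetool repair --paxos-only
paxos_state_purging: repaired
```

How often to run the coordinated repair is not stated in this PR, so no schedule is shown here.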
Summary
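- Distinguish automatic background repair (only completes uncommitted transactions) from coordinated repair via `nodetool` (also advances the low bound and enables `system.paxos` garbage collection)
- Fix incorrect guidance that `--paxos-only` repairs are only needed for pre-4.1 clusters; they are required when using `paxos_state_purging: repaired`
- Document the commit consistency `ANY` optimization with Paxos v2 + `repaired`
- Expand `paxos_state_purging` descriptions with actual mechanisms (TTL-based, compaction-time, low-bound), revert paths, and safety notes
- Note that `paxos_variant` and `paxos_state_purging` are independent settings

Context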
This was prompted by confusion in the Apache Cassandra Slack, where an operator couldn't tell whether their scheduled `nodetool repair --paxos-only` was redundant given Cassandra's automatic repairs. Our docs presented the two as interchangeable alternatives, when they actually serve different purposes.

Files Changed
- `docs/data-platforms/cassandra/operations/repair/strategies.md`: Major rewrite of Paxos Repairs section
- `docs/data-platforms/cassandra/operations/repair/options-reference.md`: Fix `--paxos-only` guidance
- `docs/data-platforms/cassandra/operations/repair/concepts.md`: Clarify two-mechanism distinction
- `docs/data-platforms/cassandra/architecture/distributed-data/paxos.md`: Add purging, commit CL, terminology

Test plan