
Clarify Paxos repair mechanisms: background vs coordinated #151

Open
millerjp wants to merge 1 commit into master from feature/paxosrepairs


Contributor

@millerjp millerjp commented Apr 3, 2026

Summary

  • Distinguishes the two Paxos repair mechanisms that Cassandra 4.1+ provides: automatic background repair (completes uncommitted transactions only) vs coordinated repair via nodetool (also advances the low bound and enables system.paxos garbage collection)
  • Fixes incorrect guidance that --paxos-only repairs are only needed for pre-4.1 clusters — they are required when using paxos_state_purging: repaired
  • Adds Commit Consistency Optimization section explaining serial CL vs commit CL and the ANY optimization with Paxos v2 + repaired
  • Expands paxos_state_purging descriptions with actual mechanisms (TTL-based, compaction-time, low-bound), revert paths, and safety notes
  • Adds config independence callout: paxos_variant and paxos_state_purging are independent settings
  • Adds recommended configuration path for LWT clusters adopting Paxos v2

Context

This was prompted by confusion in the Apache Cassandra Slack where an operator couldn't tell whether their scheduled nodetool repair --paxos-only was redundant given Cassandra's automatic repairs. Our docs presented the two as interchangeable alternatives, when they actually serve different purposes.
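For reviewers, the rule this PR documents can be sketched as a small check. This is only an illustration of the guidance in the Summary, not code from Cassandra or from the docs being changed: the `PaxosSettings` class and both helper functions are hypothetical names, and the only facts it encodes are that scheduled `nodetool repair --paxos-only` is required when `paxos_state_purging: repaired` is set, and that `paxos_variant` is an independent setting.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PaxosSettings:
    """Hypothetical holder for the two cassandra.yaml settings named in this PR."""
    paxos_variant: str
    paxos_state_purging: str


def coordinated_repair_required_for_purging(settings: PaxosSettings) -> bool:
    # Per the PR: only coordinated repair advances the low bound that lets
    # system.paxos be garbage collected under paxos_state_purging: repaired.
    # paxos_variant is deliberately ignored because the settings are independent.
    return settings.paxos_state_purging == "repaired"


def paxos_only_repair_command(keyspace: Optional[str] = None) -> List[str]:
    # Builds the argument list for the coordinated repair; it does not run it.
    command = ["nodetool", "repair", "--paxos-only"]
    if keyspace:
        command.append(keyspace)
    return command


if __name__ == "__main__":
    settings = PaxosSettings(paxos_variant="v2", paxos_state_purging="repaired")
    if coordinated_repair_required_for_purging(settings):
        print(" ".join(paxos_only_repair_command()))
```

Under these assumptions the operator's question has a direct answer: with `repaired` purging the scheduled run is not redundant, regardless of whether automatic background repair is enabled.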

Files Changed

  • docs/data-platforms/cassandra/operations/repair/strategies.md — Major rewrite of Paxos Repairs section
  • docs/data-platforms/cassandra/operations/repair/options-reference.md — Fix --paxos-only guidance
  • docs/data-platforms/cassandra/operations/repair/concepts.md — Clarify two-mechanism distinction
  • docs/data-platforms/cassandra/architecture/distributed-data/paxos.md — Add purging, commit CL, terminology

Test plan

  • Verify mkdocs builds without errors
  • Check all internal anchor links resolve (new sections use anchors referenced from other pages)
  • Review Paxos Repairs section in strategies.md for technical accuracy
  • Review Commit Consistency Optimization section in paxos.md for driver examples
  • Confirm the comparison table (background vs coordinated) is accurate per Cassandra source

…d repair

The documentation previously presented automatic Paxos repair and
nodetool repair --paxos-only as interchangeable alternatives, causing
confusion about whether scheduled paxos-only repairs are redundant
when Cassandra 4.1+ automatic repairs are enabled.

In reality these are two distinct mechanisms: the automatic background
repair only completes uncommitted transactions, while coordinated
repair (nodetool) also advances the low bound in
system.paxos_repair_history, enabling system.paxos garbage collection
with paxos_state_purging: repaired.

Changes:
- strategies.md: Rewrite Paxos Repairs section with clear two-mechanism
  distinction, comparison table, expanded paxos_state_purging descriptions,
  recommended configuration path, and config independence note
- options-reference.md: Fix incorrect guidance that --paxos-only is only
  needed for pre-4.1 clusters; add critical paxos_state_purging: repaired case
- concepts.md: Add numbered list distinguishing the two mechanisms and
  config independence note
- paxos.md: Add Paxos State Purging section, Commit Consistency
  Optimization section (serial vs commit CL, CL=ANY prerequisites),
  config independence callout, and expanded terminology reference
@millerjp millerjp requested a review from hshimizu April 3, 2026 09:04
@millerjp millerjp self-assigned this Apr 3, 2026