axonops · millerjp · Apr 3, 2026
diff --git a/docs/data-platforms/cassandra/architecture/distributed-data/paxos.md b/docs/data-platforms/cassandra/architecture/distributed-data/paxos.md
@@ -356,8 +356,53 @@ paxos_variant: v2
 paxos_state_purging: repaired
 ```
 
+!!! note "`paxos_variant` and `paxos_state_purging` Are Independent"
+    These two settings do not depend on each other. You can enable Paxos v2 without changing `paxos_state_purging`, or set `paxos_state_purging: repaired` with Paxos v1. However, the recommended production configuration for LWT-heavy clusters is `v2` + `repaired`, which together enable the [commit consistency optimization](#commit-consistency-optimization) below.
+
 For detailed configuration options, see [Paxos-Related cassandra.yaml Configuration](../../operations/repair/strategies.md#paxos-related-cassandrayaml-configuration).
 
+### Paxos State Purging
+
+The `paxos_state_purging` setting controls how old entries in the `system.paxos` table are cleaned up:
+
+| Value | Mechanism | Safe with Commit CL=ANY | Revert Path |
+|-------|-----------|------------------------|-------------|
+| `legacy` | TTL-based expiration | **No** — committed values may expire before propagation | N/A (default) |
+| `gc_grace` | Compaction-time expiry based on `gc_grace_seconds`, no TTLs | **No** | Safe fallback from `repaired` |
+| `repaired` | Purged only after Paxos repair low bound confirms quorum persistence | **Yes** | **MUST** revert to `gc_grace`, **NOT** `legacy` |
+
+With `repaired`, Cassandra uses the low bound recorded in `system.paxos_repair_history` to determine which `system.paxos` entries can be safely purged during compaction. This low bound is only advanced by **coordinated Paxos repairs** (`nodetool repair --paxos-only` or regular `nodetool repair`), not by the automatic background Paxos repair. See [Understanding the Two Paxos Repair Mechanisms](../../operations/repair/strategies.md#understanding-the-two-paxos-repair-mechanisms) for the full distinction.
+
+### Commit Consistency Optimization
+
+LWT operations in Cassandra use two consistency levels:
+
+- **Serial consistency level** (`SERIAL` or `LOCAL_SERIAL`): Controls the Paxos consensus phase — how many replicas must participate in the prepare/propose/accept rounds.
+- **Commit (non-serial) consistency level**: Controls the final commit phase — how many replicas must acknowledge that the committed value has been written to the base table.
+
+These are configured separately in application code. For example, a query might use `LOCAL_SERIAL` for consensus and `LOCAL_QUORUM` for the commit.
+
+With Paxos v2 and `paxos_state_purging: repaired`, the commit consistency level can be safely set to `ANY`. The coordinator then no longer waits for a quorum to acknowledge the commit, which removes one round trip from each LWT (a WAN round trip when the commit quorum spans datacenters). This is safe because the Paxos repair mechanism guarantees that committed values are eventually propagated.
+
+**Prerequisites for commit CL=ANY:**
+
+1. `paxos_variant: v2` set consistently across **all nodes**
+2. `paxos_state_purging: repaired` set consistently across **all nodes**
+3. Regular coordinated Paxos repairs running (`nodetool repair --paxos-only` or regular `nodetool repair`)
+
+**Example driver configuration (Java):**
+
+```java
+// Assumes a mutable driver 3.x Statement; in driver 4.x, assign the returned statement instead.
+// Serial consistency controls the Paxos consensus phase.
+statement.setSerialConsistencyLevel(ConsistencyLevel.LOCAL_SERIAL);
+// Commit consistency controls the final write; it can be ANY with v2 + repaired.
+statement.setConsistencyLevel(ConsistencyLevel.ANY);
+```
+
+!!! warning "Reverting Commit CL"
+    If `paxos_state_purging` must be changed from `repaired` to `gc_grace` (for example, because coordinated Paxos repairs must be disabled for an extended period), applications **MUST** change their commit consistency level back from `ANY` to `QUORUM` or `LOCAL_QUORUM` to maintain correctness.
+
 ### Upgrade Considerations
 
 - Clusters with heavy LWT usage **SHOULD** upgrade to Paxos v2
@@ -376,9 +421,12 @@ For detailed configuration options, see [Paxos-Related cassandra.yaml Configurat
 | **Quorum** | Majority of replicas; with RF=3, quorum is 2 |
 | **Ballot** | Unique proposal number combining timestamp and node ID |
 | **Paxos state** | Entries in `system.paxos` table tracking proposals and accepted values |
-| **Paxos repair** | Process of reconciling Paxos state across replicas |
+| **Background Paxos repair** | Automatic process (every 5 minutes by default in 4.1+) that completes uncommitted Paxos transactions. Does not advance the repair low bound. |
+| **Coordinated Paxos repair** | `nodetool repair --paxos-only` or the Paxos step in regular `nodetool repair`. Completes uncommitted transactions AND advances the low bound in `system.paxos_repair_history`, enabling `system.paxos` garbage collection. |
+| **Paxos repair low bound** | Ballot recorded in `system.paxos_repair_history` indicating the point up to which Paxos state has been safely reconciled. Used by `paxos_state_purging: repaired` to determine what can be garbage collected. |
+| **Serial consistency** | Consistency level (`SERIAL` or `LOCAL_SERIAL`) controlling the Paxos consensus phase |
+| **Commit consistency** | Non-serial consistency level controlling the final commit write. Can be set to `ANY` with Paxos v2 + `repaired` purging. |
 | **LWT** | Lightweight Transaction—Cassandra's conditional atomic operations using Paxos |
-| **SERIAL** | Consistency level that uses Paxos for linearizable operations |
 
 ---
 
@@ -401,9 +449,9 @@ For detailed configuration options, see [Paxos-Related cassandra.yaml Configurat
 
 ### Operational Requirements
 
-- **Paxos state accumulates**: Without regular Paxos repairs, `system.paxos` grows unboundedly
+- **Paxos state accumulates**: Without regular **coordinated** Paxos repairs (`nodetool repair --paxos-only` or regular `nodetool repair`), `system.paxos` grows unboundedly when using `paxos_state_purging: repaired`. The automatic background Paxos repair does not advance the low bound needed for garbage collection.
 - **Topology changes**: Paxos repairs **MUST** complete before topology changes (bootstrap, decommission)
-- **Repair requirements**: Clusters using LWTs **MUST** run regular Paxos repairs
+- **Repair requirements**: Clusters using LWTs with `paxos_state_purging: repaired` **MUST** run regular coordinated Paxos repairs
 
 For operational guidance, see [Paxos Repairs](../../operations/repair/strategies.md#paxos-repairs).
 

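The distinction this file's changes draw between the two repair mechanisms can be pictured with a toy model. This is a conceptual sketch only, not Cassandra's implementation: `PaxosLowBoundModel` and its methods are hypothetical names, ballots are reduced to plain `long` values, and the strict "below the low bound" comparison is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PaxosLowBoundModel {

    // Toy stand-in for the low bound kept in system.paxos_repair_history.
    private long lowBound = 0L;

    /** Background repair completes in-flight transactions but leaves the low bound untouched. */
    void backgroundRepair() {
        // Intentionally does not modify lowBound.
    }

    /** Coordinated repair advances the low bound and never moves it backwards. */
    void coordinatedRepair(long repairedUpTo) {
        lowBound = Math.max(lowBound, repairedUpTo);
    }

    /** Assumption: an entry is purgeable once its ballot is below the low bound. */
    List<Long> purgeable(List<Long> ballots) {
        List<Long> result = new ArrayList<>();
        for (long ballot : ballots) {
            if (ballot < lowBound) {
                result.add(ballot);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        PaxosLowBoundModel model = new PaxosLowBoundModel();
        List<Long> ballots = Arrays.asList(10L, 20L, 30L);

        model.backgroundRepair();
        System.out.println(model.purgeable(ballots)); // nothing is purgeable yet

        model.coordinatedRepair(25L);
        System.out.println(model.purgeable(ballots)); // ballots below 25 are now purgeable
    }
}
```

In this model, no amount of background repair makes anything purgeable, which mirrors why `system.paxos` keeps growing under `paxos_state_purging: repaired` when only the background mechanism runs.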
diff --git a/docs/data-platforms/cassandra/operations/repair/concepts.md b/docs/data-platforms/cassandra/operations/repair/concepts.md
@@ -702,7 +702,12 @@ Paxos repairs maintain LWT **linearizability** and correctness, especially acros
 
 Paxos repairs are only relevant for **keyspaces that use LWTs**. For keyspaces that never use LWTs, Paxos state does not affect correctness, and operators **MAY** safely skip Paxos repairs for those keyspaces.
 
-In Cassandra 4.1+, Paxos repairs run automatically every 5 minutes by default. Operators **SHOULD** ensure Paxos repairs run regularly on clusters where LWTs are in use. See [Paxos Repairs](strategies.md#paxos-repairs) in the Repair Strategies guide for operational details.
+Cassandra 4.1+ provides two distinct Paxos repair mechanisms:
+
+1. **Background Paxos repair** — runs automatically every 5 minutes (configurable). Completes uncommitted Paxos transactions but does **NOT** advance the Paxos repair low bound or enable garbage collection of `system.paxos` data.
+2. **Coordinated Paxos repair** — runs via `nodetool repair --paxos-only` or as part of regular `nodetool repair`. Completes uncommitted transactions **AND** advances the low bound in `system.paxos_repair_history`, enabling garbage collection when using `paxos_state_purging: repaired`.
+
+For clusters using `paxos_state_purging: repaired`, operators **MUST** run regular coordinated Paxos repairs. The automatic background repair alone is not sufficient. See [Understanding the Two Paxos Repair Mechanisms](strategies.md#understanding-the-two-paxos-repair-mechanisms) in the Repair Strategies guide for the full distinction.
 
 ### Paxos Repairs and Topology Changes
 
@@ -760,9 +765,11 @@ Cassandra 4.1+ introduces **Paxos v2**, an updated Paxos implementation for ligh
 
 Paxos v2 is selected via the `paxos_variant` setting in `cassandra.yaml` (values: `v1` or `v2`).
 
-To safely take full advantage of Paxos v2, operators **MUST** ensure:
+`paxos_variant` and `paxos_state_purging` are **independent settings** — neither requires the other. However, the recommended production configuration for LWT-heavy clusters is `paxos_variant: v2` combined with `paxos_state_purging: repaired`, which together enable the [commit consistency optimization](../../architecture/distributed-data/paxos.md#commit-consistency-optimization).
+
+To safely take full advantage of Paxos v2 with `repaired` purging, operators **MUST** ensure:
 
-1. **Regular Paxos repairs** are running on all nodes
+1. **Regular coordinated Paxos repairs** are running (via a scheduled `nodetool repair --paxos-only` or regular `nodetool repair`)
 2. **Paxos state purging** is configured appropriately (see [Paxos-related cassandra.yaml configuration](strategies.md#paxos-related-cassandrayaml-configuration) in the Repair Strategies guide)
 
 Detailed configuration options and upgrade guidance are covered in the [Repair Strategies](strategies.md) documentation.
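
The requirements in this file's changes (Paxos v2, `repaired` purging on every node, and regular coordinated Paxos repairs) could be folded into a guard that is consulted before an application picks its commit consistency level. The following is a minimal sketch under stated assumptions: `CommitClAdvisor`, `NodePaxosConfig`, and `chooseCommitCl` are hypothetical names that exist in neither Cassandra nor any driver, and how the per-node settings are collected is deliberately left out.

```java
import java.util.Arrays;
import java.util.List;

public class CommitClAdvisor {

    /** Hypothetical snapshot of one node's Paxos settings from cassandra.yaml. */
    static final class NodePaxosConfig {
        final String paxosVariant;       // "v1" or "v2"
        final String paxosStatePurging;  // "legacy", "gc_grace", or "repaired"

        NodePaxosConfig(String paxosVariant, String paxosStatePurging) {
            this.paxosVariant = paxosVariant;
            this.paxosStatePurging = paxosStatePurging;
        }
    }

    /**
     * Returns "ANY" only if every node runs v2 + repaired and coordinated
     * Paxos repairs are scheduled; otherwise returns the supplied fallback.
     */
    static String chooseCommitCl(List<NodePaxosConfig> nodes,
                                 boolean coordinatedRepairsScheduled,
                                 String fallbackCl) {
        boolean allNodesReady = !nodes.isEmpty();
        for (NodePaxosConfig node : nodes) {
            if (!"v2".equals(node.paxosVariant)
                    || !"repaired".equals(node.paxosStatePurging)) {
                allNodesReady = false;
            }
        }
        return (allNodesReady && coordinatedRepairsScheduled) ? "ANY" : fallbackCl;
    }

    public static void main(String[] args) {
        List<NodePaxosConfig> mixed = Arrays.asList(
                new NodePaxosConfig("v2", "repaired"),
                new NodePaxosConfig("v2", "gc_grace")); // one node has reverted
        System.out.println(chooseCommitCl(mixed, true, "LOCAL_QUORUM"));
    }
}
```

With one node on `gc_grace`, the sketch falls back to `LOCAL_QUORUM`, which matches the guidance that commit CL `ANY` needs the settings applied consistently across all nodes.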

diff --git a/docs/data-platforms/cassandra/operations/repair/options-reference.md b/docs/data-platforms/cassandra/operations/repair/options-reference.md
@@ -642,27 +642,24 @@ nodetool repair --paxos-only my_keyspace
 
 **How it works:**
 
-Paxos repairs synchronize the Paxos commit log entries stored in `system.paxos` across replicas. This ensures that all nodes agree on the outcome of previous LWT operations, which is essential for maintaining linearizability guarantees.
+This command runs a **coordinated Paxos repair** that synchronizes Paxos state stored in `system.paxos` across replicas. Unlike the [automatic background Paxos repair](strategies.md#background-paxos-repair-automatic) (which only completes uncommitted transactions), `--paxos-only` also advances the **Paxos repair low bound** by writing to `system.paxos_repair_history`. This low bound is what enables garbage collection of old `system.paxos` data when using `paxos_state_purging: repaired`.
 
 **When to use:**
 
-- **Pre-4.1 clusters**: Operators **MUST** schedule `--paxos-only` repairs manually (typically hourly) since automatic Paxos repairs are not available
-- **Before topology changes**: Run on all nodes before bootstrap, decommission, replace, or move operations to reduce the risk of Paxos cleanup timeouts
-- **After disabling automatic Paxos repairs**: If `paxos_repair_enabled` is set to `false`, manual Paxos repairs **MUST** be scheduled regularly for clusters using LWTs
-- **Troubleshooting LWT issues**: When LWTs are timing out or behaving unexpectedly
+- **Clusters using `paxos_state_purging: repaired`**: Operators **MUST** run `--paxos-only` repairs regularly (typically hourly) or ensure regular full repairs include the Paxos step. The automatic background repair does **NOT** advance the low bound, so without coordinated repairs, `system.paxos` grows unboundedly.
+- **Pre-4.1 clusters**: Operators **MUST** schedule `--paxos-only` repairs manually since the automatic background repair is not available.
+- **Before topology changes**: Run on all nodes before bootstrap, decommission, replace, or move operations to reduce the risk of Paxos cleanup timeouts.
+- **After disabling automatic Paxos repairs**: If `paxos_repair_enabled` is set to `false`, coordinated Paxos repairs **SHOULD** be scheduled regularly for clusters using LWTs.
+- **Troubleshooting LWT issues**: When LWTs are timing out or behaving unexpectedly.
 
-**Automatic Paxos repairs (Cassandra 4.1+):**
+**Relationship to automatic background Paxos repair (Cassandra 4.1+):**
 
-In Cassandra 4.1 and later, Paxos repairs run automatically every 5 minutes by default when `paxos_repair_enabled` is `true`. Manual `--paxos-only` repairs are typically only needed for:
-
-- Pre-4.1 clusters
-- Clusters where automatic Paxos repairs have been disabled
-- Proactive cleanup before topology changes
+Cassandra 4.1+ includes an automatic background Paxos repair that runs every 5 minutes (controlled by `paxos_repair_enabled`). This background repair completes uncommitted transactions but does **NOT** replace the need for coordinated `--paxos-only` repairs. See [Understanding the Two Paxos Repair Mechanisms](strategies.md#understanding-the-two-paxos-repair-mechanisms) for the full distinction.
 
 **Operational guidance:**
 
 - Running without a keyspace argument repairs Paxos state for **all keyspaces**. This is often **RECOMMENDED** because operators frequently do not know which keyspaces developers are using for LWTs.
-- Paxos repairs are lightweight compared to full data repairs and complete quickly
+- Paxos repairs are lightweight compared to full data repairs and complete quickly.
 
 For more details on Paxos repair strategy and configuration, see [Paxos Repairs](strategies.md#paxos-repairs) in the Repair Strategies guide.
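
For clusters without a dedicated repair scheduler, the hourly cadence suggested in the hunk above could be approximated with a small wrapper around `nodetool`. This is a minimal sketch, assuming `nodetool` is on the `PATH` of the user running it. `PaxosRepairScheduler`, `buildCommand`, and `runOnce` are hypothetical names, and a production setup would more likely rely on cron or an existing repair orchestration tool.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PaxosRepairScheduler {

    /** Builds the nodetool invocation; a null or empty keyspace targets all keyspaces. */
    static List<String> buildCommand(String keyspace) {
        List<String> cmd = new ArrayList<>();
        cmd.add("nodetool");
        cmd.add("repair");
        cmd.add("--paxos-only");
        if (keyspace != null && !keyspace.isEmpty()) {
            cmd.add(keyspace);
        }
        return cmd;
    }

    /** Runs one coordinated Paxos repair and returns the process exit code. */
    static int runOnce(String keyspace) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(keyspace))
                .inheritIO() // forward nodetool output to this process's console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // One-hour delay between runs follows the "typically hourly" guidance; tune per cluster.
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                int exitCode = runOnce(null);
                if (exitCode != 0) {
                    System.err.println("paxos-only repair exited with code " + exitCode);
                }
            } catch (IOException | InterruptedException e) {
                // Handle checked failures here so later runs are still scheduled.
                System.err.println("paxos-only repair failed: " + e);
                if (e instanceof InterruptedException) {
                    Thread.currentThread().interrupt();
                }
            }
        }, 0, 1, TimeUnit.HOURS);
    }
}
```

Passing `null` as the keyspace matches the recommendation to repair Paxos state for all keyspaces; passing a name such as `my_keyspace` restricts the run to that keyspace.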