Merged
Stage 1a (merged in vortex-data/vortex#7269):
- MSE-only TurboQuant with 8-bit default (near-lossless, ~4e-5 MSE)
- Dimension >= 128 scheme selection, 3-round SORF
- Original QJL PR (#7167) closed

Stage 1b (next: array representation cleanup):
- Power-of-2 dimension requirement (remove internal padding)
- FixedSizeListArray rotation signs for variable SRHT rounds
- Dtype-matching norms, structured metadata (format TBD pending vtable refactor)
- Goal: wire format ready for backward-compat guarantees

Stage 2 reframed as general-purpose structural encoding:
- Block decomposition is a vertical split of FSL by dimension, analogous to ChunkedArray's horizontal split by rows
- Encoding-agnostic: each block is independently encoded (all TQ initially, but supports heterogeneous child encodings)
- Straggler blocks noted as future work for no-qualifying-B dims
- PDX (Stage 3) similarly structural, not TQ-specific

Other changes:
- Codes/centroids remain separate slots; DictArray for canonicalize
- Updated compression ratio examples for 8-bit default
- Updated array layouts, migration table, references throughout

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Will Manning <will@willmanning.io>
Fixes from critical review against cited sources:
- Fix SORF/SRHT terminology conflation: SORF (multi-round HD product from [5]) was incorrectly called "SRHT" (Tropp's single-round R·H·D from [3]) in ~15 places. Now consistent throughout.
- PDX speedup claims: cite precise Table 4 figures (2x avg, 1.5x at D>32) instead of ambiguous "about 40%". Clarify int8 layout and ADSampling are from the open-source impl, not the paper.
- Strengthen SORF disclaimer: [5] does not prove distributional closeness to Haar measure; butterfly-stage counting has no theoretical backing in [5].
- Fix d=2 "singularity" language: the arcsine distribution exists at d=2; the real issue is it's U-shaped and unsuitable for Max-Lloyd.
- Note GPU distance table at b=8 is 256KB (exceeds shared memory).
- Note Eviox [7] URL may require account access.
- Clarify Stage 1b gap: scheme still pads non-power-of-2 externally between Stage 1b and Stage 2.
- Clarify Stage 2 tension: block decomposition is TQ-internal in initial implementation; extraction to general-purpose type is future.
- Fix stale "k×3×B" in QJL strategy table (now k×R×B).

Structural reorganization:
- Move reference implementation bugs + Theorem 1 constant to Appendix A
- Move community QJL findings to Appendix B
- Move "Why not DCT?" + shared rotation speculation to Appendix C
- Replace with brief summaries + appendix references in main text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Will Manning <will@willmanning.io>
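To make the SORF terminology in the commit message above concrete, here is a minimal pure-Python sketch of a multi-round HD product: each round multiplies by a random ±1 diagonal D and then applies an orthonormal Walsh-Hadamard transform H. The function names (`fwht`, `random_signs`, `sorf_rotate`, `sorf_inverse`) are hypothetical and are not taken from the Vortex codebase; the three-round default only mirrors the "3-round SORF" mentioned earlier in this conversation. SRHT in the single-round R·H·D sense would add a row-sampling step R, which is not shown.

```python
import math
import random


def fwht(vec):
    """Orthonormal Walsh-Hadamard transform; len(vec) must be a power of 2."""
    n = len(vec)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    out = list(vec)
    h = 1
    while h < n:
        # Butterfly stage: combine pairs of entries that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # makes H orthonormal, so H is its own inverse
    return [x * scale for x in out]


def random_signs(n, rng):
    """One random +/-1 diagonal, stored as a flat list (illustrative helper)."""
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def sorf_rotate(vec, sign_rounds):
    """Apply one H*D round per entry of sign_rounds, in order."""
    out = list(vec)
    for signs in sign_rounds:
        out = fwht([s * x for s, x in zip(signs, out)])
    return out


def sorf_inverse(vec, sign_rounds):
    """Undo sorf_rotate: each round inverts as D*H, applied in reverse order."""
    out = list(vec)
    for signs in reversed(sign_rounds):
        out = [s * x for s, x in zip(signs, fwht(out))]
    return out
```

Because every round is orthogonal, the rotation preserves Euclidean norms and is exactly invertible up to floating-point error. As the commit message notes, that says nothing about how close the resulting distribution is to a Haar-random rotation.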
Reframe Stage 1 as a forward-looking description of the target end state rather than a point-in-time snapshot of PR 7269. This helps the RFC age well: readers approaching it in months will care about what Stage 1 delivers, not which pieces landed in which PR.

- Merge Stage 1a + 1b into single "Stage 1: MSE-only TurboQuant (in progress)" section focused on target properties
- PR 7269 mentioned as "initial implementation is merged" context
- "Remaining work" list captures what's left to complete Stage 1
- Single array layout diagram for Stage 1 (target state)
- Merged Phase 1a/1b into single Phase 1 in Phasing section
- Simplified migration section and shipping table
- Removed all 1a/1b references throughout

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Will Manning <will@willmanning.io>
…dimension)

Stage 2 needs dimension=768 (non-power-of-2) inside a single TQ array, which contradicts the previous "dimension is always power-of-2" invariant. The constraint actually applies to block_size: in Stage 1 block_size = dimension (both power-of-2), but in Stage 2 dimension = num_blocks × block_size can be non-power-of-2.

Fixed throughout: decoder invariant, Stage 1 target properties, minimum dimension, current limitations. Also fix "Stage Stage" typo on line 254.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Will Manning <will@willmanning.io>
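The relation dimension = num_blocks × block_size from the commit message above can be sketched as follows. The selection rule is an assumption made purely for illustration: it takes the largest power of 2 that divides the dimension and treats it as "qualifying" only if it reaches a hypothetical `min_block` threshold (64 here). The RFC's actual rule for choosing B is not stated in this conversation, and the function names are invented.

```python
def choose_block_size(dimension, min_block=64):
    """Return the largest power-of-2 divisor of dimension, or None if it is
    smaller than min_block. Both the rule and min_block are assumptions."""
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    # The lowest set bit of an integer is its largest power-of-2 divisor.
    block_size = dimension & -dimension
    return block_size if block_size >= min_block else None


def block_layout(dimension, min_block=64):
    """Return (num_blocks, block_size), or None when no block size qualifies."""
    block_size = choose_block_size(dimension, min_block)
    if block_size is None:
        return None
    return dimension // block_size, block_size


if __name__ == "__main__":
    for d in (768, 1024, 800):
        print(d, block_layout(d))
```

Under these assumptions, 768 splits into 3 blocks of 256 with no padding, a power-of-2 dimension such as 1024 is a single block, and 800 (whose largest power-of-2 divisor is 32) has no qualifying block size.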
Revert the "move padding to scheme level" decision: the TQ array keeps its existing internal zero-padding for non-power-of-2 dimensions. The power-of-2 constraint applies to block_size (the SORF dimension), not the input dimension.

- Stage 1: accepts any d >= 4, pads non-power-of-2 internally (block_size = padded_dim). codes.list_size may exceed dimension.
- Stage 2: block decomposition eliminates padding for dims with a qualifying B (each block is natively power-of-2). No-qualifying-B dims fall back to internal zero-padding (single padded block).
- Decoder invariant: block_size is always power-of-2; codes.list_size = num_blocks × block_size (may differ from dimension when internal padding applies in Stage 1).
- Remove "require power-of-2 dimensions" from Stage 1 remaining work.
- Replace all "scheme-level padding" references with "internal zero-padding".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Will Manning <will@willmanning.io>
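The Stage 1 padding rule and decoder invariant described above can be checked with a small sketch. It assumes that padded_dim is the next power of 2 at or above the input dimension, which the commit message implies but does not spell out. The function names and the returned dictionary keys are hypothetical and do not reflect the actual array metadata.

```python
def next_power_of_two(n):
    """Smallest power of 2 that is >= n, for n >= 1."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1 << (n - 1).bit_length()


def stage1_layout(dimension):
    """Illustrative Stage 1 layout: a single block, zero-padded internally."""
    if dimension < 4:
        raise ValueError("Stage 1 accepts d >= 4")
    block_size = next_power_of_two(dimension)  # padded_dim (assumed rule)
    num_blocks = 1
    return {
        "dimension": dimension,
        "num_blocks": num_blocks,
        "block_size": block_size,
        # Decoder invariant: codes.list_size = num_blocks * block_size.
        "codes_list_size": num_blocks * block_size,
        "padding": block_size - dimension,
    }


if __name__ == "__main__":
    for d in (96, 128, 768):
        print(stage1_layout(d))
```

Under this assumption, 768 pads to a single 1024-wide block (256 zero-padded positions), which is the case block decomposition in Stage 2 is meant to avoid.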
…r intuition

- Fix Stage 2 comparison table: Stage 1 column now correctly uses padded_dim (not dim) for rotation signs, centroids, codes, and dot product, consistent with the Stage 1 array layout diagram.
- Remove stale "power-of-2 dimension requirement" from Phase 1 in Phasing section (was removed from Stage 1 remaining work earlier).
- Rewrite minimum dimension discussion: TQ is unlikely to be effective below d=64; exact threshold to be determined empirically. Modest padding (96→128) probably fine; large-fraction padding (32→64) not.
- Expand straggler blocks: for small stragglers (e.g., d=800 → 3×256 + 32 remainder), SORF is ineffective; prefer uncompressed straggler or whole-vector padding. Note that full padding may beat block decomp with straggler for some dimensions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Will Manning <will@willmanning.io>
2f0d910 to 516f9c9
follow up to #33 #34 #35 #36 #37 #38