Motivation
zvec has Python-level OPQ (python/zvec/backends/opq.py) using QR decomposition, but no C++ PQ/OPQ primitives. A header-only C++ implementation would:
- Enable PQ in the native index pipeline (no Python overhead)
- Improve OPQ rotation quality via SVD Procrustes (vs current QR approach)
- Have zero new dependencies (self-contained Jacobi SVD solver)
Proposed approach
Two new headers under src/ailego/:
product_quantizer.h
- k-means training for codebook learning
- Encode: float32 vector → uint8 codes
- Decode: uint8 codes → float32 approximation
- Asymmetric distance computation
- Distortion measurement
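To make the encode, decode, and asymmetric distance steps above concrete, here is a minimal sketch that assumes the codebooks have already been trained. The struct name `PqSketch`, its fields, and the `[m][k][dsub]` centroid layout are hypothetical and are not taken from the draft implementation. The only hard constraint it relies on is that each subspace has at most 256 centroids, so a code fits in one uint8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical PQ container for illustration; codebooks are assumed trained.
struct PqSketch {
  std::size_t m;                 // number of subspaces
  std::size_t k;                 // centroids per subspace (<= 256)
  std::size_t dsub;              // dimensions per subspace
  std::vector<float> centroids;  // layout [m][k][dsub], row-major

  const float* centroid(std::size_t sub, std::size_t c) const {
    return centroids.data() + (sub * k + c) * dsub;
  }

  static float l2sq(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
      const float diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }

  // Encode: pick the nearest centroid independently in each subspace.
  std::vector<std::uint8_t> encode(const std::vector<float>& x) const {
    assert(x.size() == m * dsub && k >= 1 && k <= 256);
    std::vector<std::uint8_t> codes(m);
    for (std::size_t sub = 0; sub < m; ++sub) {
      float best = std::numeric_limits<float>::max();
      for (std::size_t c = 0; c < k; ++c) {
        const float d = l2sq(x.data() + sub * dsub, centroid(sub, c), dsub);
        if (d < best) {
          best = d;
          codes[sub] = static_cast<std::uint8_t>(c);
        }
      }
    }
    return codes;
  }

  // Decode: concatenate the selected centroids.
  std::vector<float> decode(const std::vector<std::uint8_t>& codes) const {
    assert(codes.size() == m);
    std::vector<float> out(m * dsub);
    for (std::size_t sub = 0; sub < m; ++sub) {
      const float* c = centroid(sub, codes[sub]);
      for (std::size_t i = 0; i < dsub; ++i) out[sub * dsub + i] = c[i];
    }
    return out;
  }

  // ADC step 1: per-query table of squared distances to every centroid.
  std::vector<float> adc_table(const std::vector<float>& query) const {
    assert(query.size() == m * dsub);
    std::vector<float> table(m * k);
    for (std::size_t sub = 0; sub < m; ++sub)
      for (std::size_t c = 0; c < k; ++c)
        table[sub * k + c] =
            l2sq(query.data() + sub * dsub, centroid(sub, c), dsub);
    return table;
  }

  // ADC step 2: distance to an encoded vector is m table lookups.
  float adc_distance(const std::vector<float>& table,
                     const std::vector<std::uint8_t>& codes) const {
    assert(table.size() == m * k && codes.size() == m);
    float sum = 0.0f;
    for (std::size_t sub = 0; sub < m; ++sub) sum += table[sub * k + codes[sub]];
    return sum;
  }
};
```

The distance is asymmetric because the query stays in float32 while the database vector is represented only by its codes. Distortion for a vector `x` can be measured in the same terms as the squared L2 distance between `x` and `decode(encode(x))`.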
opq.h
- SVD-based Orthogonal Procrustes rotation
- Self-contained Jacobi SVD (no LAPACK dependency)
- learn_rotation(data, pq) → rotation matrix
- Compatible with product_quantizer.h
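For reference, the Orthogonal Procrustes step has a closed form: given data X and its current PQ reconstruction Y, the orthogonal R minimizing ||XR - Y||_F is R = U V^T, where U S V^T is the SVD of M = X^T Y. The sketch below shows one way to obtain that R without LAPACK, using a one-sided (Hestenes) Jacobi iteration. The names `SquareMat` and `procrustes_rotation_sketch` are hypothetical, this is not the signature of `learn_rotation`, and the code assumes M is square and full rank.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical row-major n x n matrix used only for this sketch.
struct SquareMat {
  std::size_t n;
  std::vector<double> a;
  explicit SquareMat(std::size_t n_) : n(n_), a(n_ * n_, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return a[i * n + j]; }
  double operator()(std::size_t i, std::size_t j) const { return a[i * n + j]; }
};

// Returns R = U * V^T for M = U * S * V^T.
// One-sided Jacobi: rotate column pairs of M until all columns are mutually
// orthogonal, accumulating the rotations in V, so that M * V = U * S.
SquareMat procrustes_rotation_sketch(SquareMat m, int max_sweeps = 60,
                                     double tol = 1e-12) {
  const std::size_t n = m.n;
  SquareMat v(n);
  for (std::size_t i = 0; i < n; ++i) v(i, i) = 1.0;

  for (int sweep = 0; sweep < max_sweeps; ++sweep) {
    bool rotated = false;
    for (std::size_t p = 0; p + 1 < n; ++p) {
      for (std::size_t q = p + 1; q < n; ++q) {
        double alpha = 0.0, beta = 0.0, gamma = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
          alpha += m(i, p) * m(i, p);
          beta += m(i, q) * m(i, q);
          gamma += m(i, p) * m(i, q);
        }
        // Columns p and q are already orthogonal to within tolerance.
        if (std::fabs(gamma) <= tol * std::sqrt(alpha * beta)) continue;
        rotated = true;
        const double zeta = (beta - alpha) / (2.0 * gamma);
        const double t = (zeta >= 0.0 ? 1.0 : -1.0) /
                         (std::fabs(zeta) + std::sqrt(1.0 + zeta * zeta));
        const double c = 1.0 / std::sqrt(1.0 + t * t);
        const double s = c * t;
        for (std::size_t i = 0; i < n; ++i) {
          const double mp = m(i, p), mq = m(i, q);
          m(i, p) = c * mp - s * mq;
          m(i, q) = s * mp + c * mq;
          const double vp = v(i, p), vq = v(i, q);
          v(i, p) = c * vp - s * vq;
          v(i, q) = s * vp + c * vq;
        }
      }
    }
    if (!rotated) break;  // converged: a full sweep needed no rotation
  }

  // Column k of m now holds s_k * u_k; normalize it and accumulate U * V^T.
  SquareMat r(n);
  for (std::size_t k = 0; k < n; ++k) {
    double norm = 0.0;
    for (std::size_t i = 0; i < n; ++i) norm += m(i, k) * m(i, k);
    norm = std::sqrt(norm);
    assert(norm > 0.0);  // full-rank assumption of this sketch
    for (std::size_t i = 0; i < n; ++i)
      for (std::size_t j = 0; j < n; ++j)
        r(i, j) += (m(i, k) / norm) * v(j, k);
  }
  return r;
}
```

A quick sanity check is to feed in M = Q P with Q a known rotation and P symmetric positive definite: the result should be Q, since U V^T is the orthogonal factor of the polar decomposition of M. A production version would also need to handle rank-deficient M, which this sketch only asserts against.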
Also update the Python OPQ's _learn_rotation to use SVD Procrustes instead of QR decomposition for better rotation quality.
Questions for maintainers
- Is src/ailego/ the right home for these headers, or would you prefer a subdirectory like src/ailego/quantization/?
- Any existing C++ PQ work in progress that this might overlap with?
- Should the Python OPQ upgrade (QR → SVD) be a separate PR?
Draft implementation: #173