## Motivation
Issues #100 and #147 request GPU-accelerated indexing. zvec already has standalone GPU backends (FAISS GPU, cuVS CAGRA/IVF-PQ, Apple MPS), but they're not integrated with the Collection API — users can't run `collection.query()` with GPU acceleration.
## Proposed approach
### 1. `UnifiedGpuIndex` ABC (`python/zvec/backends/unified.py`)

Abstract base with `train()` + `add()` + `search()` + `size()` + `backend_name`, plus 6 adapters:
| Adapter | Wraps | Priority |
|---|---|---|
| `CppCuvsAdapter` | Native `_zvec` pybind11 (zero-copy) | 1 (highest) |
| `CuvsCAGRAAdapter` | `cuvs.neighbors.cagra` | 2 |
| `CuvsIvfPqAdapter` | `cuvs.neighbors.ivf_pq` | 3 |
| `FaissGpuAdapter` | `backends/gpu.py::GPUIndex` | 4 |
| `AppleMpsAdapter` | `backends/apple_silicon.py` | 5 |
| `FaissCpuAdapter` | FAISS CPU fallback | 6 |
The C++ native path is preferred because it avoids Python→GPU data copies.
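To make the contract concrete, here is a minimal sketch of what the `UnifiedGpuIndex` ABC and a lowest-priority CPU fallback could look like. The method names follow the proposal; the signatures and the NumPy brute-force fallback are illustrative assumptions, not the draft PR's actual code.

```python
from abc import ABC, abstractmethod

import numpy as np


class UnifiedGpuIndex(ABC):
    """Common contract every adapter implements (sketch)."""

    @property
    @abstractmethod
    def backend_name(self) -> str: ...

    @abstractmethod
    def train(self, vectors: np.ndarray) -> None: ...

    @abstractmethod
    def add(self, vectors: np.ndarray, ids: np.ndarray) -> None: ...

    @abstractmethod
    def search(self, queries: np.ndarray, topk: int):
        """Return (distances, indices), each shaped (n_queries, topk)."""

    @abstractmethod
    def size(self) -> int: ...


class BruteForceAdapter(UnifiedGpuIndex):
    """Illustrative stand-in for the lowest-priority CPU fallback."""

    def __init__(self) -> None:
        self._vecs = None
        self._ids = None

    @property
    def backend_name(self) -> str:
        return "brute-force"

    def train(self, vectors: np.ndarray) -> None:
        pass  # exact search needs no training step

    def add(self, vectors: np.ndarray, ids: np.ndarray) -> None:
        self._vecs = np.asarray(vectors, dtype=np.float32)
        self._ids = np.asarray(ids, dtype=np.int64)

    def search(self, queries: np.ndarray, topk: int):
        # Squared L2 distance between every query and every stored vector.
        q = np.asarray(queries, dtype=np.float32)
        d = ((q[:, None, :] - self._vecs[None, :, :]) ** 2).sum(axis=-1)
        idx = np.argsort(d, axis=1)[:, :topk]
        return np.take_along_axis(d, idx, axis=1), idx

    def size(self) -> int:
        return 0 if self._vecs is None else len(self._vecs)
```

Each real adapter would implement the same five members, so the detection layer can hand back any of the six interchangeably.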
### 2. `GpuIndex` bridge (`python/zvec/gpu_index.py`)

Connects GPU search to Collection:
```python
gpu = collection.gpu_index("embedding")
gpu.build(vectors, ids)
docs = gpu.query(query_vector, topk=10, output_fields=["title"])
# Returns list[Doc] — same format as collection.query()
```
Flow: GPU search → map indices to doc IDs → `collection.fetch()` → attach scores → return `list[Doc]`.
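The four-step flow could be sketched roughly as follows. The `search_fn`/`fetch_fn` callables and dict-shaped docs stand in for the real adapter and `collection.fetch()`; these names are assumptions for illustration, not zvec's API.

```python
import numpy as np


class GpuQueryBridge:
    """Sketch: GPU search -> doc IDs -> fetch -> attach scores."""

    def __init__(self, search_fn, fetch_fn, ids):
        self._search = search_fn  # (queries, topk) -> (distances, indices)
        self._fetch = fetch_fn    # (doc_ids, output_fields) -> list of docs
        self._ids = ids           # GPU row position -> document ID

    def query(self, query_vector, topk=10, output_fields=None):
        # 1. GPU search over a single query vector.
        distances, indices = self._search(np.asarray(query_vector)[None, :], topk)
        # 2. Map raw row indices back to document IDs (skip -1 = no hit).
        doc_ids = [self._ids[i] for i in indices[0] if i >= 0]
        # 3. Fetch the full documents from the collection.
        docs = self._fetch(doc_ids, output_fields)
        # 4. Attach scores and return in collection.query() shape.
        for doc, score in zip(docs, distances[0]):
            doc["score"] = float(score)
        return docs
```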
### 3. Backend detection (`python/zvec/backends/detect.py`)

Adds C++ cuVS and Python cuVS detection with a proper priority chain.
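One way such a priority chain can be structured is a first-available walk over ordered probes, mirroring the 1–6 adapter ordering. The probe names and fallback label below are illustrative assumptions, not zvec's actual identifiers.

```python
def pick_backend(probes, fallback="faiss-cpu"):
    """Return the first available backend name, walking probes in
    priority order. `probes` is a list of (name, is_available) pairs."""
    for name, is_available in probes:
        try:
            if is_available():
                return name
        except Exception:
            # A crashing probe (missing library, no GPU) must not
            # abort detection; fall through to the next candidate.
            continue
    return fallback
```

Swallowing probe exceptions matters here: importing `cuvs` on a machine without CUDA can raise, and that should demote the backend rather than break `import zvec`.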
## Tested on RTX 4090
| Backend | QPS (50K vectors, dim=128) |
|---|---|
| FAISS GPU (flat) | 529,316 |
| cuVS IVF-PQ | 45,771 |
| cuVS CAGRA | 43,711 |
## Questions for maintainers
- Is `collection.gpu_index(field_name)` the right API surface, or would you prefer a different entry point?
- The current design requires an explicit `gpu.build(vectors, ids)` since `_Collection` has no scan/iterate API. Would you consider exposing a bulk vector extraction API from C++?
- Any preference on how the C++ cuVS priority should interact with the existing backend detection?
Draft implementation: #176 (addresses #100 and #147)