
feat: add Voyage AI embedding provider — closes #1 (#222)

Open
TerminalGravity wants to merge 1 commit into main from feat/voyage-embeddings

Conversation

@TerminalGravity (Collaborator)

Adds Voyage AI as a third embedding provider option alongside local (Xenova) and OpenAI.

Changes

  • src/lib/embeddings.ts — new VoyageEmbeddingProvider class (voyage-3 model, 1024 dims, 128 texts/batch)
  • src/lib/config.ts — extended EmbeddingProvider type and config interface with voyage_api_key / voyage_model
  • src/lib/timeline-db.ts — TimelineConfig and getEmbedder() updated to support voyage
  • src/tools/onboard-project.ts — accepts voyage_api_key and voyage_model params
  • tests/lib/embeddings.test.ts — 3 new tests for voyage provider creation
  • README.md — documented VOYAGE_API_KEY, VOYAGE_MODEL env vars and config.yml options
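
For reference, the batching and ordering behavior this provider implements can be sketched roughly as below. This is a hypothetical illustration, not the PR's actual code — the helper names (`chunk`, `collectOrdered`, `VoyageItem`) are made up for this sketch:

```typescript
// Hypothetical sketch of the provider's batching/ordering logic.
// Voyage's embeddings endpoint accepts at most 128 inputs per request,
// and response items carry an `index` field since ordering is not
// guaranteed, so results are sorted by index before collection.

interface VoyageItem {
  index: number;
  embedding: number[];
}

// Split texts into chunks of at most `size` elements (128 for Voyage).
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Restore input order by sorting on the returned `index` field.
function collectOrdered(items: VoyageItem[]): number[][] {
  return [...items]
    .sort((a, b) => a.index - b.index)
    .map((it) => it.embedding);
}
```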

Usage

export EMBEDDING_PROVIDER=voyage
export VOYAGE_API_KEY=pa-...
# optional: VOYAGE_MODEL=voyage-3-lite

Or in .preflight/config.yml:

embeddings:
  provider: voyage
  voyage_api_key: pa-...
  voyage_model: voyage-3

Closes #1


@TerminalGravity (Collaborator, Author) left a comment


Clean implementation. The batch-by-128 approach matches Voyage's API limits, and sorting by index before collecting is the right call since the API doesn't guarantee ordering.

A few things:

  1. Dimensions are model-dependent: voyage-3 is 1024-dim, but voyage-code-3 is also 1024 while voyage-3-lite is 512. Hardcoding dimensions = 1024 means swapping models via config could silently produce wrong-sized vectors in LanceDB. Consider fetching dimensions from a model→dim map or making it configurable.

  2. input_type: "document" — this is correct for indexing, but queries should use input_type: "query". The embed() method (used for search queries) goes through embedBatch which always sends "document". Voyage's docs say this asymmetry matters for retrieval quality.

  3. Rate limiting — no retry/backoff on 429s. The OpenAI provider doesn't have this either so it's consistent, but worth noting for heavy onboarding runs.
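
One possible shape for the model→dim map suggested in point 1 — a sketch only, not a prescribed fix; the dimensions are the ones discussed in this PR (voyage-3 and voyage-code-3 at 1024, voyage-3-lite at 512):

```typescript
// Hypothetical model→dimension map; failing loudly on an unknown model
// avoids silently writing wrong-sized vectors into LanceDB.
const VOYAGE_DIMENSIONS: Record<string, number> = {
  "voyage-3": 1024,
  "voyage-code-3": 1024,
  "voyage-3-lite": 512,
};

function dimensionsFor(model: string): number {
  const dims = VOYAGE_DIMENSIONS[model];
  if (dims === undefined) {
    throw new Error(`Unknown Voyage model "${model}"; configure dimensions explicitly`);
  }
  return dims;
}
```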

None of these are blockers for a first pass, but the input_type one is worth fixing before merge — it'll measurably impact search quality.
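
To make the input_type point concrete, here is a minimal illustration assuming a hypothetical request-body builder (the PR's real method signatures may differ): index-time batches send "document", while the search path sends "query":

```typescript
// Hypothetical request-body builder for Voyage's embeddings endpoint,
// illustrating the document/query asymmetry from point 2.
type VoyageInputType = "document" | "query";

interface VoyageRequest {
  model: string;
  input: string[];
  input_type: VoyageInputType;
}

function buildRequest(
  model: string,
  texts: string[],
  inputType: VoyageInputType,
): VoyageRequest {
  return { model, input: texts, input_type: inputType };
}

// Index time (embedBatch): embed documents as documents.
const indexBody = buildRequest("voyage-3", ["doc one", "doc two"], "document");

// Search time (embed): embed the user's query as a query.
const queryBody = buildRequest("voyage-3", ["find the config loader"], "query");
```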

- Add VoyageEmbeddingProvider class supporting voyage-3 (1024d) and voyage-3-lite (512d)
- Update EmbeddingConfig type to include 'voyage' provider option
- Update onboard-project, timeline-db, and CLI init to accept voyage
- Add 3 tests for voyage provider creation and dimension checks
- All 46 tests passing
@TerminalGravity force-pushed the feat/voyage-embeddings branch from 1ef342b to 59ccda1 on March 20, 2026 at 01:15


Development

Successfully merging this pull request may close these issues.

Add Voyage AI embedding provider