feat: add Voyage AI embedding provider — closes #1 #222
TerminalGravity wants to merge 1 commit into main
TerminalGravity
left a comment
Clean implementation. The batch-by-128 approach matches Voyage's API limits, and sorting by index before collecting is the right call since the API doesn't guarantee ordering.
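For readers following along, here is a rough sketch of the batching and reordering described above. It is not the PR's actual code: `toBatches`, `orderedVectors`, and the `EmbeddingItem` shape are hypothetical names, and the per-item `index` field is assumed to match what the API returns.

```typescript
// Hypothetical shape of one item in an embeddings response (field names assumed).
interface EmbeddingItem {
  index: number;
  embedding: number[];
}

// Batch size cited in this PR (128 texts per request).
const MAX_BATCH = 128;

// Split items into consecutive batches of at most `size` elements.
function toBatches<T>(items: T[], size: number = MAX_BATCH): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Restore input order: sort a copy by `index`, then keep only the vectors.
function orderedVectors(items: EmbeddingItem[]): number[][] {
  return [...items]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}
```

Sorting a copy rather than the response array keeps the helper free of side effects, which makes it easy to unit-test without a network call.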
A few things:
- Dimensions are model-dependent: `voyage-3` is 1024-dim and `voyage-code-3` is also 1024, while `voyage-3-lite` is 512. Hardcoding `dimensions = 1024` means swapping models via config could silently produce wrong-sized vectors in LanceDB. Consider fetching dimensions from a model→dim map or making it configurable.
- `input_type: "document"`: this is correct for indexing, but queries should use `input_type: "query"`. The `embed()` method (used for search queries) goes through `embedBatch`, which always sends `"document"`. Voyage's docs say this asymmetry matters for retrieval quality.
- Rate limiting: no retry/backoff on 429s. The OpenAI provider doesn't have this either, so it's consistent, but worth noting for heavy onboarding runs.
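If retry is added later, one possible shape is sketched below. `backoffDelayMs` and `withRetryOn429` are hypothetical helpers, not part of this PR, and the base delay, cap, and retry count are placeholder values rather than anything Voyage documents.

```typescript
// Exponential backoff delay for a zero-based attempt number, capped at maxMs.
// The defaults are illustrative placeholders.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Minimal view of a response: only the HTTP status matters for this sketch.
interface StatusLike {
  status: number;
}

// Call `send` until it returns something other than 429 or retries run out.
// `sleep` is injectable so tests can skip real waiting.
async function withRetryOn429<T extends StatusLike>(
  send: () => Promise<T>,
  maxRetries = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const res = await send();
    if (res.status !== 429 || attempt >= maxRetries) return res;
    await sleep(backoffDelayMs(attempt));
  }
}
```

The wrapper returns the last 429 response once retries are exhausted, so the caller's existing error handling still sees the failure.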
None of these are blockers for a first pass, but the input_type one is worth fixing before merge — it'll measurably impact search quality.
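A minimal sketch of the first two suggestions, assuming the request body uses `input`, `model`, and `input_type` fields. `VOYAGE_DIMENSIONS`, `dimensionsFor`, and `buildRequestBody` are hypothetical names, and the dimension values are the ones quoted in this review, so they should be checked against Voyage's docs.

```typescript
type VoyageInputType = "document" | "query";

// Model → dimension map, using the values quoted in this review (unverified here).
const VOYAGE_DIMENSIONS: Record<string, number> = {
  "voyage-3": 1024,
  "voyage-code-3": 1024,
  "voyage-3-lite": 512,
};

// Fail loudly on an unknown model instead of silently assuming 1024.
function dimensionsFor(model: string): number {
  const dims = VOYAGE_DIMENSIONS[model];
  if (dims === undefined) {
    throw new Error(`Unknown Voyage model "${model}"; add it to VOYAGE_DIMENSIONS`);
  }
  return dims;
}

// Build the request body with the input type chosen by the caller:
// "document" when indexing, "query" when embedding a search query.
function buildRequestBody(
  texts: string[],
  model: string,
  inputType: VoyageInputType,
): { input: string[]; model: string; input_type: VoyageInputType } {
  return { input: texts, model, input_type: inputType };
}
```

With something like this, `embed()` could pass `"query"` while `embedBatch` keeps `"document"` as its default, and the provider's `dimensions` could come from `dimensionsFor(model)` instead of a constant.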
- Add VoyageEmbeddingProvider class supporting voyage-3 (1024d) and voyage-3-lite (512d)
- Update EmbeddingConfig type to include 'voyage' provider option
- Update onboard-project, timeline-db, and CLI init to accept voyage
- Add 3 tests for voyage provider creation and dimension checks
- All 46 tests passing
1ef342b to 59ccda1
Adds Voyage AI as a third embedding provider option alongside local (Xenova) and OpenAI.
Changes
- `src/lib/embeddings.ts`: new `VoyageEmbeddingProvider` class (voyage-3 model, 1024 dims, 128 texts/batch)
- `src/lib/config.ts`: extended `EmbeddingProvider` type and config interface with `voyage_api_key` / `voyage_model`
- `src/lib/timeline-db.ts`: `TimelineConfig` and `getEmbedder()` updated to support voyage
- `src/tools/onboard-project.ts`: accepts `voyage_api_key` and `voyage_model` params
- `tests/lib/embeddings.test.ts`: 3 new tests for voyage provider creation
- `README.md`: documented `VOYAGE_API_KEY`, `VOYAGE_MODEL` env vars and config.yml options

Usage
Or in `.preflight/config.yml`:

Closes #1