
refactor(embedding): simplify to OpenAI + Ollama with O(1) character-based truncation #2

Merged
jgpruitt merged 3 commits into main from me1/embeddings
Mar 19, 2026

Conversation

@jgpruitt
Collaborator

Summary

  • Replace CPU-intensive tiktoken tokenization with O(1) character-based truncation
  • Simplify embedding providers to just OpenAI and Ollama
  • Add retry logic for OpenAI context length errors

Changes

Truncation overhaul

  • Removed js-tiktoken and mistral-tokenizer-js dependencies
  • New approach: character-based truncation using a 3.8 chars/token ratio (~10% buffer)
  • text.length and text.slice() run in effectively constant time, so the event loop is never blocked
  • OpenAI retries with progressively tighter ratios (3.8 → 3.0 → 2.5) if estimate is insufficient
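A minimal sketch of the truncate-and-retry scheme described above (function and variable names here are illustrative, not the PR's actual identifiers; the real code calls the embedding provider rather than a passed-in `embed` function):

```typescript
// Estimate a character budget from a token limit using a chars-per-token ratio.
// Both text.length and text.slice() are cheap, so this never blocks the event loop.
function truncateToTokenBudget(
  text: string,
  maxTokens: number,
  charsPerToken: number,
): string {
  const maxChars = Math.floor(maxTokens * charsPerToken);
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

// Progressively tighter ratios: 3.8 is the optimistic first guess (~10% buffer);
// 3.0 and 2.5 are fallbacks if the provider still rejects the input.
const CHARS_PER_TOKEN_RATIOS = [3.8, 3.0, 2.5];

async function embedWithRetry(
  text: string,
  maxTokens: number,
  embed: (input: string) => Promise<number[]>,
): Promise<number[]> {
  let lastError: unknown;
  for (const ratio of CHARS_PER_TOKEN_RATIOS) {
    try {
      return await embed(truncateToTokenBudget(text, maxTokens, ratio));
    } catch (err) {
      // Only retry on context-length errors; rethrow anything else.
      if (!(err instanceof Error && err.message.includes("context length"))) {
        throw err;
      }
      lastError = err;
    }
  }
  throw lastError;
}
```

The trade-off is occasional extra API calls in exchange for never paying CPU time to tokenize: most texts succeed on the first (3.8) attempt, and only inputs that are unusually token-dense fall through to the tighter ratios.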

Provider simplification

  • Removed Cohere, Mistral, and Google providers (and their SDK dependencies)
  • Kept OpenAI (hosted platform) and Ollama (self-hosted option)
  • Removes ~70 lines of provider-specific code

Options cleanup

  • Wired up maxParallelCalls to actually be passed to embedMany
  • Removed unused batchSize option (SDK handles chunking automatically)
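The wiring change can be sketched as follows. The `embedMany` here is a local stub standing in for the Vercel AI SDK's `embedMany` from the `ai` package, and the option and model names are assumptions for illustration, not this repository's actual code:

```typescript
// Stub with the same shape as the SDK call, standing in for `embedMany` from "ai".
type EmbedManyArgs = {
  model: string; // stand-in for an EmbeddingModel instance
  values: string[];
  maxParallelCalls?: number; // caps concurrent chunk requests
};

async function embedMany(
  args: EmbedManyArgs,
): Promise<{ embeddings: number[][]; parallelism: number }> {
  // Records the parallelism it was given so the pass-through is observable.
  return {
    embeddings: args.values.map((v) => [v.length]),
    parallelism: args.maxParallelCalls ?? Number.POSITIVE_INFINITY,
  };
}

// batchSize is gone: the SDK chunks values automatically, so the only
// concurrency knob the caller controls is maxParallelCalls.
interface EmbeddingOptions {
  maxParallelCalls?: number;
}

async function embedAll(texts: string[], options: EmbeddingOptions) {
  return embedMany({
    model: "text-embedding-3-small", // hypothetical default model
    values: texts,
    maxParallelCalls: options.maxParallelCalls, // now actually passed through
  });
}
```

Before this change, `maxParallelCalls` was accepted in the options but silently dropped; the sketch shows the one-line fix of forwarding it into the `embedMany` call.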

Why

  1. Performance: tiktoken tokenization is CPU-bound and blocks the event loop
  2. Simplicity: We only need OpenAI (hosted) and Ollama (self-hosted)
  3. Cost: OpenAI is cheaper than self-hosting until ~80K memories/day

Breaking Changes

  • EmbeddingProvider type now only accepts "openai" | "ollama"
  • Removed batchSize from EmbeddingOptions
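For callers migrating across the breaking change, the narrowed union looks like this (the type guard is an illustrative addition, not necessarily part of the PR):

```typescript
// Cohere, Mistral, and Google are no longer valid members of this union.
type EmbeddingProvider = "openai" | "ollama";

// Runtime guard for validating config values (e.g. from env vars) before
// asserting them as EmbeddingProvider.
function isEmbeddingProvider(value: string): value is EmbeddingProvider {
  return value === "openai" || value === "ollama";
}
```

Code that previously passed "cohere", "mistral", or "google" will now fail to compile, which surfaces the migration at build time rather than at request time.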

Test Results

  • 201 tests pass
  • TypeScript compiles cleanly
  • Biome lint passes

Simplify to just OpenAI and Ollama:
- OpenAI for hosted platform
- Ollama for self-hosted (runs any model locally)

Removes 3 dependencies: @ai-sdk/cohere, @ai-sdk/mistral, @ai-sdk/google
Removes Cohere-specific providerOptions handling

- maxParallelCalls is passed through to embedMany (controls concurrent chunk requests)
- Removed batchSize option (not a Vercel AI SDK option, chunking is automatic)
jgpruitt merged commit 6a00cc7 into main Mar 19, 2026
2 checks passed
jgpruitt deleted the me1/embeddings branch March 19, 2026 01:39