
refactor(embedding): simplify to OpenAI + Ollama with O(1) character-based truncation #2

Merged
jgpruitt merged 3 commits into main from me1/embeddings
Mar 19, 2026

Conversation

@jgpruitt
Collaborator

Summary

  • Replace CPU-intensive tiktoken tokenization with O(1) character-based truncation
  • Simplify embedding providers to just OpenAI and Ollama
  • Add retry logic for OpenAI context length errors

Changes

Truncation overhaul

  • Removed js-tiktoken and mistral-tokenizer-js dependencies
  • New approach: character-based truncation using a 3.8 chars/token ratio (~10% buffer)
  • text.length and text.slice() run in effectively constant time, so the event loop is never blocked
  • OpenAI retries with progressively tighter ratios (3.8 → 3.0 → 2.5) if estimate is insufficient
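A minimal sketch of the truncate-and-retry scheme described above (function and variable names here are illustrative, not the PR's actual identifiers; the real code calls the embedding provider rather than a passed-in `embed` function):

```typescript
// Estimate a character budget from a token limit using a chars-per-token ratio.
// Both text.length and text.slice() are cheap, so this never blocks the event loop.
function truncateToTokenBudget(
  text: string,
  maxTokens: number,
  charsPerToken: number,
): string {
  const maxChars = Math.floor(maxTokens * charsPerToken);
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

// Progressively tighter ratios: 3.8 is the optimistic first guess (~10% buffer);
// 3.0 and 2.5 are fallbacks if the provider still rejects the input.
const CHARS_PER_TOKEN_RATIOS = [3.8, 3.0, 2.5];

async function embedWithRetry(
  text: string,
  maxTokens: number,
  embed: (input: string) => Promise<number[]>,
): Promise<number[]> {
  let lastError: unknown;
  for (const ratio of CHARS_PER_TOKEN_RATIOS) {
    try {
      return await embed(truncateToTokenBudget(text, maxTokens, ratio));
    } catch (err) {
      // Only retry on context-length errors; rethrow anything else.
      if (!(err instanceof Error && err.message.includes("context length"))) {
        throw err;
      }
      lastError = err;
    }
  }
  throw lastError;
}
```

The trade-off is occasional extra API calls in exchange for never paying CPU time to tokenize: most texts succeed on the first (3.8) attempt, and only inputs that are unusually token-dense fall through to the tighter ratios.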

Provider simplification

  • Removed Cohere, Mistral, and Google providers (and their SDK dependencies)
  • Kept OpenAI (hosted platform) and Ollama (self-hosted option)
  • Removes ~70 lines of provider-specific code

Options cleanup

  • Wired up maxParallelCalls to actually be passed to embedMany
  • Removed unused batchSize option (SDK handles chunking automatically)
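The wiring change can be sketched as follows. The `embedMany` here is a local stub standing in for the Vercel AI SDK's `embedMany` from the `ai` package, and the option and model names are assumptions for illustration, not this repository's actual code:

```typescript
// Stub with the same shape as the SDK call, standing in for `embedMany` from "ai".
type EmbedManyArgs = {
  model: string; // stand-in for an EmbeddingModel instance
  values: string[];
  maxParallelCalls?: number; // caps concurrent chunk requests
};

async function embedMany(
  args: EmbedManyArgs,
): Promise<{ embeddings: number[][]; parallelism: number }> {
  // Records the parallelism it was given so the pass-through is observable.
  return {
    embeddings: args.values.map((v) => [v.length]),
    parallelism: args.maxParallelCalls ?? Number.POSITIVE_INFINITY,
  };
}

// batchSize is gone: the SDK chunks values automatically, so the only
// concurrency knob the caller controls is maxParallelCalls.
interface EmbeddingOptions {
  maxParallelCalls?: number;
}

async function embedAll(texts: string[], options: EmbeddingOptions) {
  return embedMany({
    model: "text-embedding-3-small", // hypothetical default model
    values: texts,
    maxParallelCalls: options.maxParallelCalls, // now actually passed through
  });
}
```

Before this change, `maxParallelCalls` was accepted in the options but silently dropped; the sketch shows the one-line fix of forwarding it into the `embedMany` call.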

Why

  1. Performance: tiktoken tokenization is CPU-bound and blocks the event loop
  2. Simplicity: We only need OpenAI (hosted) and Ollama (self-hosted)
  3. Cost: OpenAI is cheaper than self-hosting until ~80K memories/day

Breaking Changes

  • EmbeddingProvider type now only accepts "openai" | "ollama"
  • Removed batchSize from EmbeddingOptions
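For callers migrating across the breaking change, the narrowed union looks like this (the type guard is an illustrative addition, not necessarily part of the PR):

```typescript
// Cohere, Mistral, and Google are no longer valid members of this union.
type EmbeddingProvider = "openai" | "ollama";

// Runtime guard for validating config values (e.g. from env vars) before
// asserting them as EmbeddingProvider.
function isEmbeddingProvider(value: string): value is EmbeddingProvider {
  return value === "openai" || value === "ollama";
}
```

Code that previously passed "cohere", "mistral", or "google" will now fail to compile, which surfaces the migration at build time rather than at request time.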

Test Results

  • 201 tests pass
  • TypeScript compiles cleanly
  • Biome lint passes

Simplify to just OpenAI and Ollama:
- OpenAI for hosted platform
- Ollama for self-hosted (runs any model locally)

Removes 3 dependencies: @ai-sdk/cohere, @ai-sdk/mistral, @ai-sdk/google
Removes Cohere-specific providerOptions handling

- maxParallelCalls is passed through to embedMany (controls concurrent chunk requests)
- Removed batchSize option (not a Vercel AI SDK option, chunking is automatic)
jgpruitt merged commit 6a00cc7 into main Mar 19, 2026
2 checks passed
jgpruitt deleted the me1/embeddings branch March 19, 2026 01:39