feat: findEmbeddingsByIds read primitive #62

Open
techiejd wants to merge 7 commits into feat/cf-workers-types from feat/find-embeddings-by-ids

techiejd (Owner) commented Jun 1, 2026

Adds `findEmbeddingsByIds` (public) / `findByIds` (adapter contract), which fetches stored embedding records by primary key, including the raw `embedding` vector that the normal search/query API never returns. This is a building block for "more like this."

The `id` of each record is whatever `search()` returns as `result.id`, so a search result round-trips directly.
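As a rough illustration of the "more like this" use case, the sketch below ranks candidate ids by cosine similarity to a seed id. It assumes a record shape with only `id` and `embedding` (other fields left untyped) and a hypothetical `EmbeddingReader` interface that mirrors the `findEmbeddingsByIds({ knowledgePool, ids })` signature; `cosineSimilarity` and `moreLikeThis` are illustrative helpers, not part of this PR.

```typescript
// Assumed record shape: only `id` and `embedding` come from the PR description.
interface EmbeddingRecord {
  id: string;
  embedding: number[];
  [field: string]: unknown; // reserved + extension fields, left untyped in this sketch
}

// Hypothetical slice of the public object, mirroring the signature named in the PR.
interface EmbeddingReader {
  findEmbeddingsByIds(args: { knowledgePool: string; ids: string[] }): Promise<EmbeddingRecord[]>;
}

// Cosine similarity; components missing from `b` count as 0, zero vectors score 0.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  const denom = normA * normB;
  return denom === 0 ? 0 : dot / denom;
}

// Illustrative helper: rank candidates by similarity to the seed record.
// Ids that the backend does not return (misses) are silently skipped.
async function moreLikeThis(
  reader: EmbeddingReader,
  knowledgePool: string,
  seedId: string,
  candidateIds: string[],
): Promise<{ id: string; score: number }[]> {
  const records = await reader.findEmbeddingsByIds({
    knowledgePool,
    ids: [seedId, ...candidateIds],
  });
  const byId = new Map<string, EmbeddingRecord>();
  for (const record of records) byId.set(record.id, record);

  const seed = byId.get(seedId);
  if (!seed) return []; // the seed itself was a miss

  const ranked: { id: string; score: number }[] = [];
  for (const id of candidateIds) {
    const candidate = byId.get(id);
    if (candidate) {
      ranked.push({ id, score: cosineSimilarity(seed.embedding, candidate.embedding) });
    }
  }
  return ranked.sort((x, y) => y.score - x.score);
}
```

Because the helper looks records up by `id` rather than by position, it does not depend on the order in which the backend returns them.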

What's here

  • `EmbeddingRecord` type + `VectorizedPayload.findEmbeddingsByIds({ knowledgePool, ids })` + `DbAdapter.findByIds(payload, poolName, ids)` signatures, re-exported for adapters.
  • Public wiring in `createVectorizedPayloadObject` (empty `ids` short-circuits to `[]`), covered via the in-memory mock adapter.
  • PostgreSQL: Drizzle `inArray` direct read; surfaces the raw pgvector column as `number[]`; non-numeric ids dropped via `/^\d+$/`.
  • MongoDB: `find({ _id: { $in } })`; non-24-hex ids dropped via the existing `HEX24` guard (never throws).
  • Cloudflare: official `binding.getByIds(ids)` (typed by #61, "refactor(cf): adopt @cloudflare/workers-types"); metadata + values mapped to the record, no mapping-collection lookup needed.
  • Docs: `adapters/README.md` contract (interface, method table, `EmbeddingRecord` types reference, custom-adapter stub) + root `README.md` Local API section.
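The id filtering described for the PostgreSQL and MongoDB adapters can be pictured with the sketch below. The `/^\d+$/` pattern is quoted from the list above; the `HEX24` definition is an assumption (the PR names the guard but does not show it), and the helper names `keepNumericIds` and `keepObjectIdStrings` are hypothetical. Converting the surviving PostgreSQL ids with `Number` is also an assumption about how the adapter handles them.

```typescript
// Pattern quoted in the PR for the PostgreSQL adapter.
const DIGITS = /^\d+$/;

// Assumed definition: 24 hexadecimal characters, the string form of a MongoDB ObjectId.
const HEX24 = /^[0-9a-fA-F]{24}$/;

// Hypothetical helper: keep ids made only of digits and convert them to numbers.
// Anything else (signs, decimals, empty strings, words) is dropped, not rejected.
// Note: digit strings beyond Number.MAX_SAFE_INTEGER would lose precision here.
function keepNumericIds(ids: string[]): number[] {
  return ids.filter((id) => DIGITS.test(id)).map(Number);
}

// Hypothetical helper: keep ids that look like ObjectId hex strings.
// Malformed ids are dropped so that no ObjectId constructor ever throws on them.
function keepObjectIdStrings(ids: string[]): string[] {
  return ids.filter((id) => HEX24.test(id));
}
```

Filtering before the query means a malformed id behaves exactly like an id that does not exist: it contributes nothing to the result.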

Contract

Misses are dropped (result length may be `< ids.length`); order is not guaranteed; empty `ids` returns `[]` without a backend call; unknown or malformed ids are treated as misses, not errors.
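Since results may be shorter than `ids` and arrive in any order, a caller that needs positional alignment has to rebuild it. A minimal sketch, assuming only that each record carries a string `id`; `alignToRequestedIds` is a hypothetical caller-side helper, not something this PR adds:

```typescript
interface HasId {
  id: string;
}

// Hypothetical helper: return one slot per requested id, in request order.
// A slot is `undefined` when the backend returned no record for that id.
function alignToRequestedIds<T extends HasId>(
  ids: string[],
  records: T[],
): (T | undefined)[] {
  const byId = new Map<string, T>();
  for (const record of records) byId.set(record.id, record);
  return ids.map((id) => byId.get(id));
}
```

Given `ids = ["a", "missing", "b"]` and records returned as `b` then `a`, the helper yields the `a` record, `undefined`, then the `b` record.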

Tests

All green: main `test:int` 86/86 · pg 65/65 · cf 93/93 · mongodb 100/100. Each adapter has a dedicated `findByIds` spec (full record incl. numeric embedding + reserved + extension fields, drop-misses, empty-ids).

Stacking

Based on #61 (`feat/cf-workers-types`), which the CF `findByIds` depends on (unlocks the typed `getByIds`). Merge #61 first, then this retargets to `main`.
