feat: add search config surface and LLM query expansion #49

Merged

BYK merged 1 commit into main from feat/search-config-expansion on Mar 22, 2026

Conversation

@BYK (Owner) commented Mar 22, 2026

Phase 4 of search improvements (depends on #48)

Adds a configurable search section and optional LLM-based query expansion for the recall tool.

New config: search section

```json
{
  "search": {
    "ftsWeights": { "title": 6.0, "content": 2.0, "category": 3.0 },
    "recallLimit": 10,
    "queryExpansion": false
  }
}
```
  • ftsWeights — BM25 column weights for knowledge FTS5 search. Tune how much title/content/category matches matter relative to each other.
  • recallLimit — Max results per source in the recall tool before RRF fusion (default 10, range 1-50).
  • queryExpansion — When enabled, the recall tool uses the configured LLM to generate 2-3 alternative query phrasings before search, improving recall for ambiguous queries.
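
A minimal sketch of how this section could be normalized, applying the defaults and the 1-50 range check described above (the name `normalizeSearchConfig` and its shape are illustrative, not the actual config code):

```typescript
// Hypothetical normalizer for the "search" config section.
interface SearchConfig {
  ftsWeights: { title: number; content: number; category: number };
  recallLimit: number;
  queryExpansion: boolean;
}

const SEARCH_DEFAULTS: SearchConfig = {
  ftsWeights: { title: 6.0, content: 2.0, category: 3.0 },
  recallLimit: 10,
  queryExpansion: false,
};

function normalizeSearchConfig(raw: Partial<SearchConfig> = {}): SearchConfig {
  const recallLimit = raw.recallLimit ?? SEARCH_DEFAULTS.recallLimit;
  if (recallLimit < 1 || recallLimit > 50) {
    throw new RangeError(`search.recallLimit must be 1-50, got ${recallLimit}`);
  }
  return {
    // Partial overrides merge over the defaults, field by field.
    ftsWeights: { ...SEARCH_DEFAULTS.ftsWeights, ...raw.ftsWeights },
    recallLimit,
    queryExpansion: raw.queryExpansion ?? SEARCH_DEFAULTS.queryExpansion,
  };
}
```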

Query expansion (search.queryExpansion: true)

When enabled:

  1. The configured LLM generates 2-3 alternative phrasings of the user's query
  2. FTS5 searches run for each variant (original + expansions)
  3. All results are fused via RRF across all query variants
  4. Original query naturally gets higher weight (appears first in fusion)
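
The fusion step above can be sketched as plain reciprocal-rank fusion over one ranked list per query variant (`k = 60` is the conventional RRF default; this is an illustration, not the actual recall-tool code):

```typescript
// Reciprocal-rank fusion: each list contributes 1 / (k + rank + 1) per id.
function rrfFuse(rankedLists: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return scores;
}

// One list per variant, the original query's list first; ids surfaced by
// several variants accumulate score across lists.
const fused = rrfFuse([
  ["k:1", "k:2"], // original query
  ["k:2", "k:3"], // expansion 1
  ["k:1", "k:4"], // expansion 2
]);
```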

Implementation:

  • Uses the same worker session pattern as distillation/curation
  • 3-second timeout — if the LLM is slow, falls back to original query only
  • Errors caught silently (logged) — never blocks search
  • Registered as lore-query-expand hidden agent
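
The timeout-with-fallback behavior can be sketched as a race against a timer that resolves rather than rejects, so a slow model degrades to the original query instead of an error (the real `expandQuery` lives in search.ts; `askLLM` and the prompt string here are hypothetical stand-ins):

```typescript
// Sketch: LLM query expansion with a hard timeout and silent fallback.
async function expandQuery(
  query: string,
  askLLM: (prompt: string) => Promise<string>,
  timeoutMs = 3000,
): Promise<string[]> {
  // Timeout resolves with null (never rejects), so losing the race is benign.
  const timeout = new Promise<null>((resolve) =>
    setTimeout(() => resolve(null), timeoutMs));
  try {
    const raw = await Promise.race([
      askLLM(`Rephrase this search query 2-3 ways as a JSON array: ${query}`),
      timeout,
    ]);
    if (raw === null) return [query]; // LLM too slow: original query only
    const expansions = (JSON.parse(raw) as string[]).slice(0, 3); // cap at 3
    return [query, ...expansions]; // original query always included first
  } catch {
    return [query]; // parse/LLM errors swallowed: never blocks search
  }
}
```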

Config wiring

  • ftsWeights flows through to all BM25 searches in ltm.ts via ftsWeights() function (reads from config().search.ftsWeights)
  • recallLimit used as the per-source limit in the recall tool
  • client + searchConfig passed through createRecallTool()
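
A sketch of how the configured weights might reach the query: SQLite FTS5's `bm25()` auxiliary function accepts one weight per indexed column, so the weights can be spliced into the ranking expression (the table name and the title/content/category column order here are assumptions about the schema):

```typescript
// Hypothetical SQL builder: bm25() returns more-negative scores for better
// matches, so ascending ORDER BY puts the best hits first.
function knowledgeSearchSql(
  w: { title: number; content: number; category: number },
): string {
  return `
    SELECT id, bm25(knowledge_fts, ${w.title}, ${w.content}, ${w.category}) AS score
    FROM knowledge_fts
    WHERE knowledge_fts MATCH ?
    ORDER BY score
    LIMIT ?`;
}
```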

Test coverage

  • 7 new config tests: defaults, partial overrides, range validation, optional section

@BYK BYK enabled auto-merge (squash) March 22, 2026 15:51
@BYK BYK disabled auto-merge March 22, 2026 21:46
@BYK BYK force-pushed the feat/search-config-expansion branch from f1f24ca to 50ce231 on March 22, 2026 21:46
@BYK BYK enabled auto-merge (squash) March 22, 2026 21:48
- Add search config section to LoreConfig:
  - ftsWeights: configurable BM25 column weights (title=6, content=2, category=3)
  - recallLimit: max results per source (default 10, range 1-50)
  - queryExpansion: optional LLM-based query expansion (default false)
- Wire ftsWeights through ltm.ts (ftsWeights() reads from config)
- Add expandQuery() to search.ts:
  - Uses worker session pattern (same as curator/distillation)
  - 3-second timeout — falls back to original query if LLM is slow
  - Parses JSON array response, caps at 3 expansions
  - Original query always included first
- Register 'lore-query-expand' hidden agent in index.ts
- Wire client + searchConfig through createRecallTool():
  - When queryExpansion enabled: generates alternative queries,
    runs searches for each variant, fuses via RRF across all
  - Original query gets a natural 2x weight (it appears both in its own
    RRF list and as the first entry)
- Add QUERY_EXPANSION_SYSTEM prompt to prompt.ts
- Add 7 new config tests for search schema
@BYK BYK force-pushed the feat/search-config-expansion branch from 50ce231 to 47079eb on March 22, 2026 21:49
@BYK BYK merged commit dd7a435 into main Mar 22, 2026
1 check passed
@BYK BYK deleted the feat/search-config-expansion branch March 22, 2026 21:49
BYK added a commit that referenced this pull request Mar 22, 2026
## Phase 5: Vector embedding search (depends on #49)

Adds semantic vector search using Voyage AI's `voyage-code-3` model,
layered on top of the existing BM25 + RRF fusion pipeline.

### How it works

1. **On knowledge create/update**: fire-and-forget embedding via Voyage
AI API → stored as Float32Array BLOB in SQLite
2. **On recall search**: embed the query → brute-force cosine similarity
over all knowledge BLOBs → feed vector results into existing RRF fusion
as another ranked list
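
The BLOB storage in step 1 is just the raw bytes of a `Float32Array`; a round-trip sketch (the byte layout is an assumption, since SQLite BLOBs are untyped bytes):

```typescript
// Serialize an embedding to bytes for a SQLite BLOB column.
function toBlob(vec: Float32Array): Uint8Array {
  return new Uint8Array(vec.buffer, vec.byteOffset, vec.byteLength);
}

// Deserialize a BLOB back into a Float32Array (4 bytes per element).
function fromBlob(blob: Uint8Array): Float32Array {
  const copy = blob.slice(); // fresh buffer guarantees 4-byte alignment
  return new Float32Array(copy.buffer, copy.byteOffset, copy.byteLength / 4);
}
```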

### Why Voyage AI?
- `voyage-code-3` is **code-optimized** — best-in-class for
technical/code text retrieval
- 200M free tokens — effectively free forever for <100 knowledge entries
- Single API key (`VOYAGE_API_KEY`), OpenAI-compatible response format
- Supports `input_type: "document"` vs `"query"` for retrieval-optimized
embeddings

### Why pure-JS cosine, not libSQL/sqlite-vec?
Tested both. `bun:sqlite` is standard SQLite — no
`vector_distance_cos()` or DiskANN. The `@libsql/client-wasm` WASM
package also lacks vector functions. Native `libsql` works but adds a
~15MB native dependency — overkill for <100 entries where brute-force
cosine takes microseconds.
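
The brute-force scoring this paragraph argues for fits in a few lines of plain JS (a sketch, not the actual implementation; the zero-vector convention mirrors the test cases listed below):

```typescript
// Cosine similarity over two equal-length embeddings; zero vectors score 0.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}
```

For <100 entries of dimension 1024 this is ~100k multiply-adds per query, which is why a native vector extension buys nothing here.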

### Graceful degradation
- Gated behind `search.embeddings.enabled` (default: false) +
`VOYAGE_API_KEY` env var
- If API key missing or API fails: FTS-only search continues working
normally
- `embedKnowledgeEntry()` is fire-and-forget — embedding failures are
logged, never thrown
- Entries without embeddings are simply excluded from vector search
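
The fire-and-forget pattern described above amounts to launching the embedding call without awaiting it and attaching a catch so failures are logged, never thrown (function names and callback shapes here are hypothetical):

```typescript
// Kick off embedding in the background; the caller returns immediately.
function embedKnowledgeEntryInBackground(
  id: string,
  text: string,
  embed: (t: string) => Promise<Float32Array>,
  store: (id: string, vec: Float32Array) => void,
): void {
  void embed(text)
    .then((vec) => store(id, vec))
    .catch((err) => console.warn(`embedding ${id} failed:`, err)); // logged, never thrown
}
```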

### RRF integration
Vector results use the same `k:` key prefix as BM25 knowledge results →
RRF **merges** rather than duplicates. An entry found by both BM25 and
vector search gets a higher combined RRF score.

### Config

```json
{
  "search": {
    "embeddings": {
      "enabled": true,
      "model": "voyage-code-3",
      "dimensions": 1024
    }
  }
}
```

### Test coverage
- 15 new tests: cosine similarity (identical/opposite/orthogonal/zero
vectors), BLOB round-trip (single/empty/1024-dim), vectorSearch
(sorting/limit/null embeddings/confidence filter), isAvailable, config
schema
- 281 total tests, 0 failures