feat: add search config surface and LLM query expansion #49

Merged

BYK merged 1 commit into main from feat/search-config-expansion on Mar 22, 2026

Conversation

@BYK (Owner) commented Mar 22, 2026

Phase 4 of search improvements (depends on #48)

Adds a configurable search section and optional LLM-based query expansion for the recall tool.

New config: search section

```json
{
  "search": {
    "ftsWeights": { "title": 6.0, "content": 2.0, "category": 3.0 },
    "recallLimit": 10,
    "queryExpansion": false
  }
}
```
  • ftsWeights — BM25 column weights for knowledge FTS5 search. Tune how much title/content/category matches matter relative to each other.
  • recallLimit — Max results per source in the recall tool before RRF fusion (default 10, range 1-50).
  • queryExpansion — When enabled, the recall tool uses the configured LLM to generate 2-3 alternative query phrasings before search, improving recall for ambiguous queries.
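
A minimal sketch of how this section could be normalized, applying the defaults and the 1-50 range check described above (the name `normalizeSearchConfig` and its shape are illustrative, not the actual config code):

```typescript
// Hypothetical normalizer for the "search" config section.
interface SearchConfig {
  ftsWeights: { title: number; content: number; category: number };
  recallLimit: number;
  queryExpansion: boolean;
}

const SEARCH_DEFAULTS: SearchConfig = {
  ftsWeights: { title: 6.0, content: 2.0, category: 3.0 },
  recallLimit: 10,
  queryExpansion: false,
};

function normalizeSearchConfig(raw: Partial<SearchConfig> = {}): SearchConfig {
  const recallLimit = raw.recallLimit ?? SEARCH_DEFAULTS.recallLimit;
  if (recallLimit < 1 || recallLimit > 50) {
    throw new RangeError(`search.recallLimit must be 1-50, got ${recallLimit}`);
  }
  return {
    // Partial overrides merge over the defaults, field by field.
    ftsWeights: { ...SEARCH_DEFAULTS.ftsWeights, ...raw.ftsWeights },
    recallLimit,
    queryExpansion: raw.queryExpansion ?? SEARCH_DEFAULTS.queryExpansion,
  };
}
```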

Query expansion (search.queryExpansion: true)

When enabled:

  1. The configured LLM generates 2-3 alternative phrasings of the user's query
  2. FTS5 searches run for each variant (original + expansions)
  3. All results are fused via RRF across all query variants
  4. Original query naturally gets higher weight (appears first in fusion)
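
The fusion step above can be sketched as plain reciprocal-rank fusion over one ranked list per query variant (`k = 60` is the conventional RRF default; this is an illustration, not the actual recall-tool code):

```typescript
// Reciprocal-rank fusion: each list contributes 1 / (k + rank + 1) per id.
function rrfFuse(rankedLists: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return scores;
}

// One list per variant, the original query's list first; ids surfaced by
// several variants accumulate score across lists.
const fused = rrfFuse([
  ["k:1", "k:2"], // original query
  ["k:2", "k:3"], // expansion 1
  ["k:1", "k:4"], // expansion 2
]);
```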

Implementation:

  • Uses the same worker session pattern as distillation/curation
  • 3-second timeout — if the LLM is slow, falls back to original query only
  • Errors caught silently (logged) — never blocks search
  • Registered as lore-query-expand hidden agent
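
The timeout-with-fallback behavior can be sketched as a race against a timer that resolves rather than rejects, so a slow model degrades to the original query instead of an error (the real `expandQuery` lives in search.ts; `askLLM` and the prompt string here are hypothetical stand-ins):

```typescript
// Sketch: LLM query expansion with a hard timeout and silent fallback.
async function expandQuery(
  query: string,
  askLLM: (prompt: string) => Promise<string>,
  timeoutMs = 3000,
): Promise<string[]> {
  // Timeout resolves with null (never rejects), so losing the race is benign.
  const timeout = new Promise<null>((resolve) =>
    setTimeout(() => resolve(null), timeoutMs));
  try {
    const raw = await Promise.race([
      askLLM(`Rephrase this search query 2-3 ways as a JSON array: ${query}`),
      timeout,
    ]);
    if (raw === null) return [query]; // LLM too slow: original query only
    const expansions = (JSON.parse(raw) as string[]).slice(0, 3); // cap at 3
    return [query, ...expansions]; // original query always included first
  } catch {
    return [query]; // parse/LLM errors swallowed: never blocks search
  }
}
```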

Config wiring

  • ftsWeights flows through to all BM25 searches in ltm.ts via ftsWeights() function (reads from config().search.ftsWeights)
  • recallLimit used as the per-source limit in the recall tool
  • client + searchConfig passed through createRecallTool()
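
A sketch of how the configured weights might reach the query: SQLite FTS5's `bm25()` auxiliary function accepts one weight per indexed column, so the weights can be spliced into the ranking expression (the table name and the title/content/category column order here are assumptions about the schema):

```typescript
// Hypothetical SQL builder: bm25() returns more-negative scores for better
// matches, so ascending ORDER BY puts the best hits first.
function knowledgeSearchSql(
  w: { title: number; content: number; category: number },
): string {
  return `
    SELECT id, bm25(knowledge_fts, ${w.title}, ${w.content}, ${w.category}) AS score
    FROM knowledge_fts
    WHERE knowledge_fts MATCH ?
    ORDER BY score
    LIMIT ?`;
}
```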

Test coverage

  • 7 new config tests: defaults, partial overrides, range validation, optional section

@BYK BYK enabled auto-merge (squash) March 22, 2026 15:51
@BYK BYK disabled auto-merge March 22, 2026 21:46
@BYK BYK force-pushed the feat/search-config-expansion branch from f1f24ca to 50ce231 on March 22, 2026 21:46
@BYK BYK enabled auto-merge (squash) March 22, 2026 21:48
- Add search config section to LoreConfig:
  - ftsWeights: configurable BM25 column weights (title=6, content=2, category=3)
  - recallLimit: max results per source (default 10, range 1-50)
  - queryExpansion: optional LLM-based query expansion (default false)
- Wire ftsWeights through ltm.ts (ftsWeights() reads from config)
- Add expandQuery() to search.ts:
  - Uses worker session pattern (same as curator/distillation)
  - 3-second timeout — falls back to original query if LLM is slow
  - Parses JSON array response, caps at 3 expansions
  - Original query always included first
- Register 'lore-query-expand' hidden agent in index.ts
- Wire client + searchConfig through createRecallTool():
  - When queryExpansion enabled: generates alternative queries,
    runs searches for each variant, fuses via RRF across all
  - Original query gets a natural 2x weight (it appears both in its own
    RRF list and as the first entry)
- Add QUERY_EXPANSION_SYSTEM prompt to prompt.ts
- Add 7 new config tests for search schema
@BYK BYK force-pushed the feat/search-config-expansion branch from 50ce231 to 47079eb on March 22, 2026 21:49
@BYK BYK merged commit dd7a435 into main Mar 22, 2026
1 check passed
@BYK BYK deleted the feat/search-config-expansion branch March 22, 2026 21:49
BYK added a commit that referenced this pull request Mar 22, 2026
## Phase 5: Vector embedding search (depends on #49)

Adds semantic vector search using Voyage AI's `voyage-code-3` model,
layered on top of the existing BM25 + RRF fusion pipeline.

### How it works

1. **On knowledge create/update**: fire-and-forget embedding via Voyage
AI API → stored as Float32Array BLOB in SQLite
2. **On recall search**: embed the query → brute-force cosine similarity
over all knowledge BLOBs → feed vector results into existing RRF fusion
as another ranked list
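
The BLOB storage in step 1 is just the raw bytes of a `Float32Array`; a round-trip sketch (the byte layout is an assumption, since SQLite BLOBs are untyped bytes):

```typescript
// Serialize an embedding to bytes for a SQLite BLOB column.
function toBlob(vec: Float32Array): Uint8Array {
  return new Uint8Array(vec.buffer, vec.byteOffset, vec.byteLength);
}

// Deserialize a BLOB back into a Float32Array (4 bytes per element).
function fromBlob(blob: Uint8Array): Float32Array {
  const copy = blob.slice(); // fresh buffer guarantees 4-byte alignment
  return new Float32Array(copy.buffer, copy.byteOffset, copy.byteLength / 4);
}
```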

### Why Voyage AI?
- `voyage-code-3` is **code-optimized** — best-in-class for
technical/code text retrieval
- 200M free tokens — effectively free forever for <100 knowledge entries
- Single API key (`VOYAGE_API_KEY`), OpenAI-compatible response format
- Supports `input_type: "document"` vs `"query"` for retrieval-optimized
embeddings

### Why pure-JS cosine, not libSQL/sqlite-vec?
Tested both. `bun:sqlite` is standard SQLite — no
`vector_distance_cos()` or DiskANN. The `@libsql/client-wasm` WASM
package also lacks vector functions. Native `libsql` works but adds a
~15MB native dependency — overkill for <100 entries where brute-force
cosine takes microseconds.
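
The brute-force scoring this paragraph argues for fits in a few lines of plain JS (a sketch, not the actual implementation; the zero-vector convention mirrors the test cases listed below):

```typescript
// Cosine similarity over two equal-length embeddings; zero vectors score 0.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}
```

For <100 entries of dimension 1024 this is ~100k multiply-adds per query, which is why a native vector extension buys nothing here.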

### Graceful degradation
- Gated behind `search.embeddings.enabled` (default: false) +
`VOYAGE_API_KEY` env var
- If API key missing or API fails: FTS-only search continues working
normally
- `embedKnowledgeEntry()` is fire-and-forget — embedding failures are
logged, never thrown
- Entries without embeddings are simply excluded from vector search
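
The fire-and-forget pattern described above amounts to launching the embedding call without awaiting it and attaching a catch so failures are logged, never thrown (function names and callback shapes here are hypothetical):

```typescript
// Kick off embedding in the background; the caller returns immediately.
function embedKnowledgeEntryInBackground(
  id: string,
  text: string,
  embed: (t: string) => Promise<Float32Array>,
  store: (id: string, vec: Float32Array) => void,
): void {
  void embed(text)
    .then((vec) => store(id, vec))
    .catch((err) => console.warn(`embedding ${id} failed:`, err)); // logged, never thrown
}
```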

### RRF integration
Vector results use the same `k:` key prefix as BM25 knowledge results →
RRF **merges** rather than duplicates. An entry found by both BM25 and
vector search gets a higher combined RRF score.

### Config

```json
{
  "search": {
    "embeddings": {
      "enabled": true,
      "model": "voyage-code-3",
      "dimensions": 1024
    }
  }
}
```

### Test coverage
- 15 new tests: cosine similarity (identical/opposite/orthogonal/zero
vectors), BLOB round-trip (single/empty/1024-dim), vectorSearch
(sorting/limit/null embeddings/confidence filter), isAvailable, config
schema
- 281 total tests, 0 failures