feat: improve FTS5 search foundations by BYK · Pull Request #46 · BYK/opencode-lore

BYK · 2026-03-22T14:24:59Z

Phase 1 of search improvements

Fixes the FTS5 search foundations as the first step toward a comprehensive search overhaul.

Changes

New: src/search.ts — centralized search module

ftsQuery() — AND-based FTS5 query builder with stopword + single-char filtering
ftsQueryOr() — OR-based variant for fallback when AND returns nothing
STOPWORDS — conservative set (only genuinely content-free words, preserves domain terms like handle, state, type)
EMPTY_QUERY sentinel for all-stopword queries
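
As a rough illustration of the query builders listed above, here is a minimal sketch. It is not the code in src/search.ts: the stopword list, the ASCII-only tokenizer, the `tokenize` and `build` helpers, and the empty-string value of `EMPTY_QUERY` are all assumptions made for the example.

```typescript
// Illustrative stopword list only; the PR's actual STOPWORDS set is not shown here.
const STOPWORDS = new Set(["the", "a", "an", "is", "of", "to", "and", "or", "in", "it", "what", "how"]);

// Assumed sentinel value for queries with no usable terms.
const EMPTY_QUERY = "";

// Lowercase, split on non-alphanumerics (ASCII-only simplification),
// then drop stopwords and single-character tokens such as the "t" left by "don't".
function tokenize(raw: string): string[] {
  return raw
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 1 && !STOPWORDS.has(t));
}

// Quote each term so it is treated as a literal FTS5 string, then join with the operator.
function build(raw: string, operator: "AND" | "OR"): string {
  const terms = tokenize(raw);
  if (terms.length === 0) return EMPTY_QUERY;
  return terms.map((t) => `"${t}"`).join(` ${operator} `);
}

function ftsQuery(raw: string): string {
  return build(raw, "AND");
}

function ftsQueryOr(raw: string): string {
  return build(raw, "OR");
}
```

Under these assumptions, `ftsQuery("what is the state of it")` yields `"state"`, and a query made only of stopwords yields `EMPTY_QUERY`.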

Fixed: Knowledge search ranking

Was: ORDER BY updated_at DESC (most recently edited wins regardless of relevance)
Now: ORDER BY bm25(knowledge_fts, 6.0, 2.0, 3.0) (column weights: title 6.0, content 2.0, category 3.0)
Uses JOIN pattern instead of subquery for proper rank access
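
The ranking change can be pictured with a query along these lines. This is a sketch, not the query from the PR: the base table name `knowledge`, the rowid join key, and the `LIMIT` parameter are assumptions. In FTS5, `bm25()` returns lower values for better matches, so an ascending sort puts the most relevant rows first.

```typescript
// Assumed schema: a `knowledge` base table whose rowid matches knowledge_fts,
// with FTS columns in the order (title, content, category).
const KNOWLEDGE_SEARCH_SQL = `
  SELECT k.*, bm25(knowledge_fts, 6.0, 2.0, 3.0) AS rank
  FROM knowledge_fts
  JOIN knowledge AS k ON k.rowid = knowledge_fts.rowid
  WHERE knowledge_fts MATCH ?
  ORDER BY bm25(knowledge_fts, 6.0, 2.0, 3.0)
  LIMIT ?
`;
```

Joining directly against the FTS table keeps it in the outer query, which is what makes `bm25()` usable in both the select list and the `ORDER BY`.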

New: distillation_fts table (schema migration v7)

FTS5 on observations column with porter unicode61 tokenizer
Replaces LIKE-based distillation search with BM25-ranked FTS5 search
Backfills existing data, sync triggers for INSERT/UPDATE/DELETE
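
A migration of this kind might look roughly like the following. It is a sketch using the standard FTS5 external-content pattern; whether the PR uses external content, the base table name `distillations`, and the trigger names are all assumptions.

```typescript
// Hypothetical v7 migration. Assumes a `distillations` base table with an
// `observations` column; the real schema may differ.
const MIGRATION_V7_SQL = `
  CREATE VIRTUAL TABLE IF NOT EXISTS distillation_fts USING fts5(
    observations,
    content='distillations',
    content_rowid='rowid',
    tokenize='porter unicode61'
  );

  -- Backfill rows that existed before the FTS table was created.
  INSERT INTO distillation_fts(rowid, observations)
    SELECT rowid, observations FROM distillations;

  -- Keep the index in sync with the base table.
  CREATE TRIGGER IF NOT EXISTS distillations_ai AFTER INSERT ON distillations BEGIN
    INSERT INTO distillation_fts(rowid, observations)
      VALUES (new.rowid, new.observations);
  END;

  CREATE TRIGGER IF NOT EXISTS distillations_ad AFTER DELETE ON distillations BEGIN
    INSERT INTO distillation_fts(distillation_fts, rowid, observations)
      VALUES ('delete', old.rowid, old.observations);
  END;

  CREATE TRIGGER IF NOT EXISTS distillations_au AFTER UPDATE ON distillations BEGIN
    INSERT INTO distillation_fts(distillation_fts, rowid, observations)
      VALUES ('delete', old.rowid, old.observations);
    INSERT INTO distillation_fts(rowid, observations)
      VALUES (new.rowid, new.observations);
  END;
`;
```

With an external-content table, deletes and updates go through the special `'delete'` command so that the old tokens are removed from the index before the new ones are added.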

Improved: AND→OR fallback pattern

All search functions try AND first (precision), fall back to OR when nothing matches (recall)
Blanket OR was tested empirically and rejected — adds noise even with stopwords
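
The fallback described above can be sketched as a small wrapper. The `searchWithFallback` name and the `runMatch` callback are hypothetical stand-ins for whatever executes the FTS5 MATCH query in each search function.

```typescript
// Try the strict AND query first; only when it returns nothing, retry with OR.
// `runMatch` is a hypothetical callback that runs one FTS5 MATCH expression.
function searchWithFallback<T>(
  andQuery: string,
  orQuery: string,
  runMatch: (match: string) => T[],
): T[] {
  const strict = runMatch(andQuery);
  if (strict.length > 0) return strict;
  return runMatch(orQuery);
}
```

Because the OR query runs only on an empty AND result, a query that already matches precisely never picks up the extra noise that blanket OR would add.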

New: "Too vague" handling

When query is all stopwords/single-chars, recall tool returns guidance message instead of empty results
Prompts the LLM to reformulate with specific keywords
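
A minimal sketch of that guard, assuming the sentinel is an empty string. The message wording, the "No results found." fallback text, and the `recallOrGuidance` helper are invented for illustration and are not taken from the PR.

```typescript
// Assumed sentinel value; the PR only names it EMPTY_QUERY.
const EMPTY_QUERY = "";

// Hypothetical guidance text returned to the LLM.
const TOO_VAGUE_MESSAGE =
  "Query is too vague: it contains only common words. Retry with specific keywords.";

// `search` is a hypothetical callback that runs the actual FTS lookup.
function recallOrGuidance(matchQuery: string, search: (match: string) => string[]): string {
  // Skip the search entirely when the query builder produced no usable terms.
  if (matchQuery === EMPTY_QUERY) return TOO_VAGUE_MESSAGE;
  const hits = search(matchQuery);
  return hits.length > 0 ? hits.join("\n") : "No results found.";
}
```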

Test coverage

23 new tests in test/search.test.ts (query building, stopwords, edge cases)
New BM25 ranking test + AND→OR fallback test in test/ltm.test.ts
Schema v7 + distillation_fts verification in test/db.test.ts
Updated temporal.test.ts for new import path + behavior

- Create src/search.ts with centralized ftsQuery/ftsQueryOr functions
- Add stopword filtering (conservative list, preserves domain terms)
- Drop single-char tokens (contraction artifacts) but keep 2-char+ terms
- Implement AND-then-OR fallback: AND first for precision, OR when AND returns nothing
- Fix knowledge search to use BM25 rank instead of updated_at DESC
  - Uses bm25() with column weights: title=6.0, content=2.0, category=3.0
  - JOIN pattern instead of subquery for proper rank access
- Add distillation_fts table (schema migration v7)
  - FTS5 on observations column with porter unicode61 tokenizer
  - Backfill existing data, sync triggers for INSERT/UPDATE/DELETE
- Replace LIKE-based distillation search with FTS5 ranked search
- Add 'too vague' handling in recall tool for all-stopword queries
- Remove ftsQuery from temporal.ts (now in search.ts, no re-export)

## Phase 2 of search improvements (depends on #46)

Adds cross-source score fusion using Reciprocal Rank Fusion and rewrites the recall tool to produce a single ranked result list.

### Changes

**New in `src/search.ts`**

- `reciprocalRankFusion<T>()`: merges multiple ranked lists using RRF (k=60, Cormack et al. 2009). Rank-based, not score-based, so magnitude differences across FTS tables don't matter.
- `normalizeRank()`: min-max normalization of FTS5 BM25 ranks to 0–1 (for display only)

**New scored search variants**

- `ltm.searchScored()`: returns `KnowledgeEntry & { rank }` with BM25 scores via `bm25(knowledge_fts, 6, 2, 3)`
- `temporal.searchScored()`: returns `TemporalMessage & { rank }`
- `searchDistillationsScored()`: returns `Distillation & { rank }`

All scored variants include AND→OR fallback (same as Phase 1 search functions).

**Rewritten recall tool**

- Runs all 3 scored searches, tags results with source type
- Fuses via RRF into a single ranked list
- Output format: source-annotated list (`[knowledge/category]`, `[distilled]`, `[temporal/role]`)
- Most relevant results appear first regardless of which source they came from

### Test coverage

- 11 new tests for `normalizeRank()` and `reciprocalRankFusion()`
- Tests cover: multi-list merge, dedup, empty lists, single list, custom k, score correctness
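
The fusion step can be sketched as follows. This is an illustration of standard RRF, where each item scores the sum of `1 / (k + rank)` over the lists it appears in, with 1-based ranks. The signatures are assumptions: the `keyOf` parameter, the return shape, and the `normalizeRanks` helper (including its choice to map the best, most negative, BM25 rank to 1) are not taken from the PR.

```typescript
// Merge several best-first lists into one list ordered by RRF score.
// `keyOf` (hypothetical) identifies the same item across lists for dedup.
function reciprocalRankFusion<T>(
  lists: T[][],
  keyOf: (item: T) => string,
  k = 60,
): { item: T; score: number }[] {
  const fused = new Map<string, { item: T; score: number }>();
  for (const list of lists) {
    list.forEach((item, index) => {
      const key = keyOf(item);
      const entry = fused.get(key) ?? { item, score: 0 };
      entry.score += 1 / (k + index + 1); // index is 0-based, rank is 1-based
      fused.set(key, entry);
    });
  }
  return Array.from(fused.values()).sort((a, b) => b.score - a.score);
}

// Min-max normalize FTS5 BM25 ranks to 0..1 for display.
// Assumption: lower (more negative) rank is better and maps to 1.
function normalizeRanks(ranks: number[]): number[] {
  if (ranks.length === 0) return [];
  const min = Math.min(...ranks);
  const max = Math.max(...ranks);
  if (max === min) return ranks.map(() => 1);
  return ranks.map((r) => (max - r) / (max - min));
}
```

For example, fusing `["a", "b", "c"]` with `["b", "d"]` puts `b` first, since it is the only item that collects a contribution from both lists.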

BYK enabled auto-merge (squash) March 22, 2026 14:25

BYK merged commit 60cfc76 into main Mar 22, 2026
1 check passed

BYK deleted the feat/fts5-foundations branch March 22, 2026 14:25

BYK mentioned this pull request Mar 22, 2026

feat: add RRF score fusion and rewrite recall tool #47

Merged

BYK merged 1 commit into main from feat/fts5-foundations
