feat: replace forSession() scoring with FTS5 BM25#48

Merged
BYK merged 1 commit into main from feat/fts-session-scoring on Mar 22, 2026

BYK commented Mar 22, 2026

Phase 3 of search improvements (depends on #47)

Replaces the coarse bag-of-words term-overlap scoring in forSession() with FTS5 BM25-based scoring.

Problem

forSession() used manual term-overlap counting: it extracted the top 30 words longer than 3 characters and counted how many of them appeared in each entry via string.includes(). This ignored:

  • Porter stemming ("configure" wouldn't match "configuration")
  • TF-IDF weighting (all matching terms counted equally)
  • Stopwords (common words inflated match counts)
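
To make these limitations concrete, here is a minimal sketch of substring-based overlap counting. It is a hypothetical reconstruction for illustration, not the removed code; the name `overlapScore` and the sample strings are assumptions.

```typescript
// Hypothetical reconstruction of bag-of-words overlap scoring (illustration only).
function overlapScore(terms: string[], entryText: string): number {
  const haystack = entryText.toLowerCase();
  // Every term found as a substring counts as exactly one point.
  return terms.filter((t) => haystack.includes(t.toLowerCase())).length;
}

const sessionTerms = ["configure", "with", "database"];

// "configure" is not a substring of "configuration", so only "database" counts.
const relevant = overlapScore(sessionTerms, "Database configuration lives in config.ts");

// The common word "with" earns the same single point as a meaningful term.
const unrelated = overlapScore(sessionTerms, "Deal with flaky tests");
```

Both entries end up with the same score, although only the first is about the session topic.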

Solution

New scoreEntriesFTS() in ltm.ts:

  • Runs session context terms against knowledge_fts using BM25
  • Uses OR semantics (not AND-then-OR) because we're scoring all candidates for ranking, not searching for exact matches — an entry matching 1 of 40 terms should get a low score, not be excluded
  • BM25 naturally weights entries matching more terms higher
  • Scores normalized to 0–1 and multiplied by entry confidence
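
The scoring steps above can be sketched roughly as follows. This is not the code from ltm.ts: `FtsRow`, `buildOrQuery`, and `scoreRows` are hypothetical names, and dividing by the best score is only one plausible way to normalize to 0–1. The sketch relies on the documented FTS5 behavior that bm25() returns smaller (more negative) values for better matches.

```typescript
// Hypothetical row shape: one result from querying knowledge_fts.
interface FtsRow {
  id: string;
  rank: number;       // raw bm25() value; more negative means a better match
  confidence: number; // entry confidence, assumed to lie in [0, 1]
}

// Join terms with OR so an entry matching any single term is still returned.
// Each term is double-quoted (embedded quotes doubled) so FTS5 reads it as a string.
function buildOrQuery(terms: string[]): string {
  return terms.map((t) => `"${t.replace(/"/g, '""')}"`).join(" OR ");
}

// Normalize relative to the best row (assumption), then weight by confidence.
function scoreRows(rows: FtsRow[]): Map<string, number> {
  const scores = new Map<string, number>();
  if (rows.length === 0) return scores;
  const best = Math.max(...rows.map((r) => -r.rank)); // largest positive relevance
  for (const r of rows) {
    const normalized = best > 0 ? Math.max(0, -r.rank) / best : 0;
    scores.set(r.id, normalized * r.confidence);
  }
  return scores;
}
```

With this shape, a weak match still receives a small positive score instead of being dropped, which is the point of using OR semantics for ranking.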

Improved extractTopTerms() moved to search.ts:

  • Now uses the same STOPWORDS set as Phase 1
  • Drops only single-character tokens (replacing the old >3 character threshold), which preserves short terms such as "DB", "CI", and "IO"
  • Increased limit from 30 to 40 terms
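
A minimal sketch of term extraction along these lines is shown below. The stopword list is a small illustrative subset rather than the Phase 1 set, and the ASCII-only tokenizer and frequency-based ranking are assumptions; the real extractTopTerms() in search.ts may differ.

```typescript
// Illustrative subset only; the actual STOPWORDS set from Phase 1 is not shown in this PR.
const STOPWORDS = new Set(["the", "a", "an", "and", "or", "to", "of", "in", "is", "with", "for"]);

function extractTopTerms(text: string, limit = 40): string[] {
  const counts = new Map<string, number>();
  // Split on anything that is not an ASCII letter or digit, which also strips punctuation.
  for (const token of text.toLowerCase().split(/[^a-z0-9]+/)) {
    if (token.length < 2) continue;     // drops empty strings and single characters
    if (STOPWORDS.has(token)) continue; // drops common words
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }
  // Most frequent first; the sort is stable, so ties keep first-seen order.
  return Array.from(counts.entries())
    .sort((x, y) => y[1] - x[1])
    .slice(0, limit)
    .map(([term]) => term);
}

const topTerms = extractTopTerms("Fix the CI, then the DB: CI is slow.");
```

Here `topTerms` starts with "ci" (seen twice) and still contains "db", while "the" and "is" are filtered out.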

Safety net preserved

Top 5 project entries by confidence are always included regardless of FTS match, preventing the scoring change from accidentally excluding critical project knowledge.
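
One way to picture the safety net is as a merge step after ranking. The sketch below is an assumption about the shape of that step, not the code in forSession(); `Entry` and `withSafetyNet` are hypothetical names.

```typescript
interface Entry {
  id: string;
  confidence: number;
}

// Append the highest-confidence project entries that the FTS ranking missed.
function withSafetyNet(ranked: Entry[], projectEntries: Entry[], keep = 5): Entry[] {
  const top = [...projectEntries]
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, keep);
  const seen = new Set(ranked.map((e) => e.id));
  const result = [...ranked];
  for (const entry of top) {
    if (!seen.has(entry.id)) {
      result.push(entry);
      seen.add(entry.id);
    }
  }
  return result;
}
```

Entries already selected by FTS keep their position; the safety net only adds what is missing.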

Test coverage

  • 8 new tests for extractTopTerms() (stopwords, 2-char tokens, limits, punctuation)
  • All 12 existing forSession() tests continue to pass

BYK enabled auto-merge (squash) March 22, 2026 14:32
BYK disabled auto-merge March 22, 2026 21:46
BYK force-pushed the feat/fts-session-scoring branch from 3f8cdeb to 5c1792c March 22, 2026 21:46
BYK enabled auto-merge (squash) March 22, 2026 21:48
- Move extractTopTerms() to search.ts (shared, uses STOPWORDS + single-char filter)
- Replace scoreEntries() bag-of-words with scoreEntriesFTS():
  - Uses FTS5 bm25() with column weights (title=6, content=2, category=3)
  - OR semantics: ranks all candidates (not AND-then-OR) since we're
    scoring for relevance, not searching for exact matches
  - BM25 naturally weights entries matching more terms higher
  - Normalized scores (0-1) multiplied by confidence for final ranking
- Safety net preserved: top-5 project entries by confidence always included
- Move FTS_WEIGHTS constant to single definition before forSession()
- Add 8 new tests for extractTopTerms (stopwords, 2-char tokens, limits)
BYK force-pushed the feat/fts-session-scoring branch from 5c1792c to 5c522fe March 22, 2026 21:48
BYK merged commit 708f298 into main Mar 22, 2026
1 check passed
BYK deleted the feat/fts-session-scoring branch March 22, 2026 21:49
BYK added a commit that referenced this pull request Mar 22, 2026
## Phase 4 of search improvements (depends on #48)

Adds a configurable search section and optional LLM-based query
expansion for the recall tool.

### New config: `search` section

```json
{
  "search": {
    "ftsWeights": { "title": 6.0, "content": 2.0, "category": 3.0 },
    "recallLimit": 10,
    "queryExpansion": false
  }
}
```

- **`ftsWeights`** — BM25 column weights for knowledge FTS5 search. Tune
how much title/content/category matches matter relative to each other.
- **`recallLimit`** — Max results per source in the recall tool before
RRF fusion (default 10, range 1-50).
- **`queryExpansion`** — When enabled, the recall tool uses the
configured LLM to generate 2-3 alternative query phrasings before
search, improving recall for ambiguous queries.
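
As a rough illustration of how these options could be resolved, the sketch below merges a partial `search` section with the defaults listed above and checks the `recallLimit` range. The names `resolveSearchConfig` and `SearchConfigInput` are hypothetical, and throwing a `RangeError` on an out-of-range value is an assumption; the actual validation behavior is not described here.

```typescript
interface FtsWeights {
  title: number;
  content: number;
  category: number;
}

interface SearchConfig {
  ftsWeights: FtsWeights;
  recallLimit: number;
  queryExpansion: boolean;
}

// The whole section and every field inside it are optional in user config.
interface SearchConfigInput {
  ftsWeights?: Partial<FtsWeights>;
  recallLimit?: number;
  queryExpansion?: boolean;
}

const SEARCH_DEFAULTS: SearchConfig = {
  ftsWeights: { title: 6.0, content: 2.0, category: 3.0 },
  recallLimit: 10,
  queryExpansion: false,
};

function resolveSearchConfig(input: SearchConfigInput = {}): SearchConfig {
  const recallLimit = input.recallLimit ?? SEARCH_DEFAULTS.recallLimit;
  // Assumption: reject out-of-range values instead of clamping them.
  if (!Number.isInteger(recallLimit) || recallLimit < 1 || recallLimit > 50) {
    throw new RangeError(`recallLimit must be an integer in 1-50, got ${recallLimit}`);
  }
  return {
    ftsWeights: { ...SEARCH_DEFAULTS.ftsWeights, ...input.ftsWeights },
    recallLimit,
    queryExpansion: input.queryExpansion ?? SEARCH_DEFAULTS.queryExpansion,
  };
}
```

For example, `resolveSearchConfig({ ftsWeights: { title: 10 } })` would raise only the title weight and leave the other defaults in place.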

### Query expansion (`search.queryExpansion: true`)

When enabled:
1. The configured LLM generates 2-3 alternative phrasings of the user's
query
2. FTS5 searches run for each variant (original + expansions)
3. All results are fused via RRF across all query variants
4. Original query naturally gets higher weight (appears first in fusion)
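
For readers unfamiliar with RRF (reciprocal rank fusion), the sketch below shows its plain form applied to one ranked list per query variant. The function name `fuseRRF` and the constant `k = 60` (a commonly used default) are assumptions. This plain form treats every list equally; how the recall tool gives the original query more weight is not shown here.

```typescript
// Each inner array is a ranked list of entry ids for one query variant, best first.
function fuseRRF(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // An entry at 1-based rank r contributes 1 / (k + r) for this list.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "b" appears in both lists, so it outranks "a" and "c".
const fused = fuseRRF([
  ["a", "b"], // results for the original query
  ["b", "c"], // results for one expansion
]);
```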

Implementation:
- Uses the same worker session pattern as distillation/curation
- 3-second timeout — if the LLM is slow, falls back to original query
only
- Errors are caught and logged rather than rethrown, so expansion never blocks search
- Registered as `lore-query-expand` hidden agent
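
The timeout and error fallback described above can be sketched with `Promise.race`. This is a minimal illustration under stated assumptions: `expandWithTimeout` and the `expand` callback are hypothetical stand-ins for the worker-session call, which is not shown in this PR.

```typescript
// Returns the original query plus any expansions that arrive in time.
async function expandWithTimeout(
  query: string,
  expand: (q: string) => Promise<string[]>, // hypothetical LLM call
  timeoutMs = 3000,
): Promise<string[]> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  // Resolves to "no expansions" once the deadline passes.
  const timeout = new Promise<string[]>((resolve) => {
    timer = setTimeout(() => resolve([]), timeoutMs);
  });
  try {
    const variants = await Promise.race([expand(query), timeout]);
    return [query, ...variants];
  } catch (err) {
    // Log and fall back; a failed expansion must not fail the search.
    console.error("query expansion failed:", err);
    return [query];
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Whatever happens to the expansion call, the caller always gets back a list that begins with the original query.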

### Config wiring
- `ftsWeights` flows through to all BM25 searches in ltm.ts via
`ftsWeights()` function (reads from `config().search.ftsWeights`)
- `recallLimit` used as the per-source limit in the recall tool
- `client` + `searchConfig` passed through `createRecallTool()`
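
As an illustration of the `ftsWeights` wiring, the sketch below turns configured weights into a bm25() expression. FTS5 applies bm25() weights in the order the columns were declared in the virtual table, so the title, content, category order used here is an assumption about the `knowledge_fts` schema. The names `AppConfig`, `ftsWeightsFrom`, and `bm25Expr` are hypothetical.

```typescript
interface FtsWeights {
  title: number;
  content: number;
  category: number;
}

// Hypothetical slice of the app config; only the part used here is modeled.
interface AppConfig {
  search?: { ftsWeights?: Partial<FtsWeights> };
}

const DEFAULT_WEIGHTS: FtsWeights = { title: 6, content: 2, category: 3 };

// Read weights from config, falling back to defaults for missing fields.
function ftsWeightsFrom(cfg: AppConfig): FtsWeights {
  return { ...DEFAULT_WEIGHTS, ...cfg.search?.ftsWeights };
}

// Build the ranking expression; assumes columns are declared as title, content, category.
function bm25Expr(table: string, w: FtsWeights): string {
  return `bm25(${table}, ${w.title}, ${w.content}, ${w.category})`;
}

const rankExpr = bm25Expr("knowledge_fts", ftsWeightsFrom({ search: { ftsWeights: { title: 8 } } }));
```

Keeping the expression builder in one place means every BM25 query picks up a config change without edits at each call site.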

### Test coverage
- 7 new config tests: defaults, partial overrides, range validation,
optional section