Merged
7 changes: 5 additions & 2 deletions AGENTS.md
@@ -13,14 +13,17 @@
* **Lore DB uses incremental auto\_vacuum to prevent free-page bloat**: Lore's SQLite DB uses incremental auto\_vacuum (schema version 3 migration) to prevent free-page bloat from deletions. The migration sets PRAGMA auto\_vacuum = INCREMENTAL then VACUUM outside a transaction. temporal\_messages is the primary storage consumer (~51MB); knowledge table is tiny.

<!-- lore:019d15de-e2d6-7ff2-ab86-b78ca39688a7 -->
* **Lore search pipeline: FTS5 with AND-then-OR fallback and RRF fusion**: Lore's search overhaul (planned/in-progress) replaces three independent search systems with a unified pipeline in \`src/search.ts\`. Key design: \`ftsQuery()\` builds AND queries (primary), \`ftsQueryOr()\` builds OR queries (fallback only when AND returns zero results). Blanket OR was rejected empirically — it adds noise even with stopword filtering. Conservative stopword list excludes domain terms like 'handle', 'state', 'type'. FTS5 rank is negative (more negative = better); \`ORDER BY rank\` sorts best first. \`bm25()\` with column weights (title=6, content=2, category=3) verified working in Bun's SQLite. Recall tool uses Reciprocal Rank Fusion (k=60) across knowledge, temporal, and distillation sources. \`forSession()\` scoring uses OR (not AND-then-OR) because it's ranking all candidates, not searching for exact matches — BM25 naturally weights multi-term matches higher.
* **Lore search pipeline: FTS5 with AND-then-OR fallback and RRF fusion**: Lore's search pipeline (\`src/search.ts\`) uses FTS5 with AND-then-OR fallback and RRF fusion. \`ftsQuery()\` builds AND queries (primary), \`ftsQueryOr()\` builds OR fallback (only when AND returns zero results). Conservative stopword list excludes domain terms like 'handle', 'state', 'type'. FTS5 rank is negative (more negative = better). \`bm25()\` column weights: title=6, content=2, category=3. \`extractTopTerms()\` extracts top-40 frequency-ranked terms with stopword filtering. Recall tool uses \`reciprocalRankFusion\<T>(lists, k=60)\` across knowledge, temporal, and distillation sources into a single ranked list with source-type annotations. \`forSession()\` uses OR-based FTS5 BM25 scoring (not AND-then-OR) because it ranks all candidates — BM25 naturally weights multi-term matches higher. Safety net: top-5 project entries by confidence always included.
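The fusion step described above can be sketched as follows; the shape (ranked lists plus a key function) mirrors the description, but the actual `reciprocalRankFusion` signature in `src/search.ts` may differ:

```typescript
// Minimal Reciprocal Rank Fusion sketch with the k = 60 constant noted above.
type RankedList<T> = { items: T[]; key: (item: T) => string };

function reciprocalRankFusion<T>(lists: RankedList<T>[], k = 60): T[] {
  const scores = new Map<string, { item: T; score: number }>();
  for (const { items, key } of lists) {
    items.forEach((item, rank) => {
      const id = key(item);
      const prev = scores.get(id) ?? { item, score: 0 };
      // Each list contributes 1 / (k + rank); items found by several
      // sources accumulate score across lists.
      prev.score += 1 / (k + rank + 1);
      scores.set(id, prev);
    });
  }
  return [...scores.values()]
    .sort((a, b) => b.score - a.score)
    .map((s) => s.item);
}
```

With k = 60 and ten-item source lists, appearing in two lists outranks topping a single list, which is what makes fusion useful across knowledge, temporal, and distillation sources.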

<!-- lore:019c8f8c-47c3-71a2-b5fd-248a2cfeba78 -->
* **Lore temporal pruning runs after distillation and curation on session.idle**: In src/index.ts, session.idle awaits backgroundDistill and backgroundCurate sequentially before running temporal.prune(). Ordering is critical: pruning must not delete unprocessed messages. Pruning defaults: 120-day retention, 1GB max storage (in .lore.json under pruning.retention and pruning.maxStorage). These generous defaults were chosen because the system was new — earlier proposals of 7d/200MB were based on insufficient data.
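The retention and storage thresholds reduce to simple arithmetic; a hypothetical sketch (names assumed, the real logic lives in `temporal.prune()`):

```typescript
// Pruning policy sketch: delete messages older than `retention` days, and
// trigger size-based pruning when storage exceeds `maxStorage` MB. Field
// names follow the .lore.json keys under `pruning`.
type PruningConfig = { retention: number; maxStorage: number }; // days, MB

function pruneCutoff(cfg: PruningConfig, nowMs: number): number {
  // Messages with a timestamp before this value are eligible for deletion.
  return nowMs - cfg.retention * 24 * 60 * 60 * 1000;
}

function needsSizePrune(cfg: PruningConfig, dbBytes: number): boolean {
  return dbBytes > cfg.maxStorage * 1024 * 1024;
}
```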

<!-- lore:019c94bd-042b-7215-b0a0-05719fcd39b2 -->
* **LTM injection pipeline: system transform → forSession → formatKnowledge → gradient deduction**: LTM injected via experimental.chat.system.transform hook. getLtmBudget() computes ceiling as (contextLimit - outputReserved - overhead) \* ltmFraction (default 10%, configurable 2-30%). forSession() loads project-specific entries unconditionally + cross-project entries scored by term overlap, greedy-packs into budget. formatKnowledge() renders as markdown. setLtmTokens() records consumption so gradient deducts it. Key: LTM goes into output.system (system prompt) — invisible to tryFit(), counts against overhead budget.
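The budget formula can be sketched as follows; the function name and the 2-30% clamp follow the description above, while the parameter names are assumptions:

```typescript
// Sketch of the LTM budget ceiling: a fraction of what remains after
// reserving output tokens and overhead. ltmFraction defaults to 10% and is
// clamped to the documented 2-30% configurable range.
function getLtmBudget(
  contextLimit: number,
  outputReserved: number,
  overhead: number,
  ltmFraction = 0.1,
): number {
  const fraction = Math.min(0.3, Math.max(0.02, ltmFraction));
  return Math.floor((contextLimit - outputReserved - overhead) * fraction);
}
```

For a 200k-token context with 32k reserved for output and 8k overhead, the default 10% fraction yields a 16,000-token LTM ceiling.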

<!-- lore:019d15f7-4d00-781e-9512-a4f3e3109f18 -->
* **OpenCode plugin SDK has no embedding API — vector search blocked**: The OpenCode plugin SDK (\`@opencode-ai/plugin\`, \`@opencode-ai/sdk\`) exposes only session/chat/tool operations. There is no \`client.embed()\`, embeddings endpoint, or raw model inference API. The only LLM access is \`client.session.prompt()\` which creates full chat roundtrips through the agentic loop. This means Lore cannot do vector/embedding search without either: (1) OpenCode adding an embedding API, or (2) direct \`fetch()\` to provider APIs bypassing the SDK (fragile — requires key extraction from \`client.config.providers()\`). The FTS5 + RRF search infrastructure is designed to be additive — vector search would layer on top as another RRF input list, not replace BM25.

### Decision

<!-- lore:019c904b-7924-7187-8471-8ad2423b8946 -->
@@ -38,7 +41,7 @@
* **Lore auto-recovery can infinite-loop without re-entrancy guard**: Three v0.5.2 bugs causing excessive background LLM requests: (1) Auto-recovery loop — session.error handler injected recovery prompt → could overflow again → loop. Fix: recoveringSessions Set as re-entrancy guard. (2) Curator ran every idle — \`onIdle || afterTurns\` short-circuited (onIdle=true). Fix: \`||\` → \`&&\`. Lesson: boolean flag gating numeric threshold needs AND not OR. (3) shouldSkip() fell back to session.list() on unknown sessions. Fix: remove list fallback, cache in activeSessions.

<!-- lore:019d15de-e2e1-7ea0-a0bb-ab59227422e8 -->
* **Lore knowledge FTS search was sorted by updated\_at, not BM25 relevance**: In \`ltm.search()\`, knowledge FTS results were ordered by \`k.updated\_at DESC\` instead of FTS5 BM25 rank — most recently edited won over most relevant. Fix: replace the \`WHERE k.rowid IN (SELECT rowid FROM knowledge\_fts ...)\` subquery pattern with a JOIN that exposes \`rank\`, then \`ORDER BY bm25(knowledge\_fts, 6.0, 2.0, 3.0)\`. Also: distillations had no FTS table at all (LIKE-only search), fixed by adding \`distillation\_fts\` in schema migration v7 with backfill and sync triggers.
* **Lore knowledge FTS search was sorted by updated\_at, not BM25 relevance**: Three FTS search bugs fixed in the search overhaul: (1) Knowledge FTS sorted by \`updated\_at DESC\` not BM25 — fix: JOIN knowledge\_fts, \`ORDER BY bm25(knowledge\_fts, 6, 2, 3)\`. (2) Distillations had no FTS table (LIKE-only search) — fix: \`distillation\_fts\` virtual table in schema migration v7 with backfill + sync triggers. (3) \`forSession()\` used coarse bag-of-words term-overlap (top 30 terms >3 chars, no stemming) — fix: replaced \`scoreEntries()\` with \`scoreEntriesFTS()\` using FTS5 BM25 with OR semantics. All search functions now use AND-then-OR fallback pattern. \`ftsQuery()\`/\`ftsQueryOr()\` centralized in \`src/search.ts\` with stopword filtering and single-char removal.
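The centralized query builders can be sketched as below; the stopword list here is illustrative, not Lore's actual conservative list:

```typescript
// Sketch of ftsQuery / ftsQueryOr: tokenize, drop single characters and
// stopwords, then join with AND (primary) or OR (fallback).
const STOPWORDS = new Set(["the", "a", "an", "is", "to", "of", "and", "or", "in"]);
const EMPTY_QUERY = "";

function terms(query: string): string[] {
  return query
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((t) => t.length > 1 && !STOPWORDS.has(t));
}

function ftsQuery(query: string): string {
  const ts = terms(query);
  return ts.length ? ts.map((t) => `"${t}"`).join(" AND ") : EMPTY_QUERY;
}

function ftsQueryOr(query: string): string {
  const ts = terms(query);
  return ts.length ? ts.map((t) => `"${t}"`).join(" OR ") : EMPTY_QUERY;
}
```

Callers try `ftsQuery()` first and fall back to `ftsQueryOr()` only on zero results, which is the AND-then-OR pattern described above.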

<!-- lore:019c8f4f-67ca-7212-a8c4-8a75b230ceea -->
* **Test DB isolation via LORE\_DB\_PATH and Bun test preload**: Lore test suite uses isolated temp DB via test/setup.ts preload (bunfig.toml). Preload sets LORE\_DB\_PATH to mkdtempSync path before any imports of src/db.ts; afterAll cleans up. src/db.ts checks LORE\_DB\_PATH first. agents-file.test.ts needs beforeEach cleanup for intra-file isolation and TEST\_UUIDS cleanup in afterAll (shared with ltm.test.ts). Individual test files don't need close() calls — preload handles DB lifecycle.
22 changes: 22 additions & 0 deletions src/config.ts
@@ -50,6 +50,28 @@ export const LoreConfig = z.object({
maxStorage: z.number().min(50).default(1024),
})
.default({ retention: 120, maxStorage: 1024 }),
search: z
.object({
/** BM25 column weights for knowledge FTS5 [title, content, category]. */
ftsWeights: z
.object({
title: z.number().min(0).default(6.0),
content: z.number().min(0).default(2.0),
category: z.number().min(0).default(3.0),
})
.default({ title: 6.0, content: 2.0, category: 3.0 }),
/** Max results per source in recall tool before fusion. Default: 10. */
recallLimit: z.number().min(1).max(50).default(10),
/** Enable LLM-based query expansion for the recall tool. Default: false.
* When enabled, the configured model generates 2–3 alternative query phrasings
* before search, improving recall for ambiguous queries. */
queryExpansion: z.boolean().default(false),
})
.default({
ftsWeights: { title: 6.0, content: 2.0, category: 3.0 },
recallLimit: 10,
queryExpansion: false,
}),
crossProject: z.boolean().default(false),
agentsFile: z
.object({
11 changes: 10 additions & 1 deletion src/index.ts
@@ -236,6 +236,10 @@ export const LorePlugin: Plugin = async (ctx) => {
hidden: true,
description: "Lore knowledge curator worker",
},
"lore-query-expand": {
hidden: true,
description: "Lore query expansion worker",
},
};
},

@@ -660,7 +664,12 @@ End with "I'm ready to continue." so the agent knows to pick up where it left of

// Register the recall tool
tool: {
recall: createRecallTool(projectPath, config().knowledge.enabled),
recall: createRecallTool(
projectPath,
config().knowledge.enabled,
ctx.client,
config().search,
),
},
};

14 changes: 9 additions & 5 deletions src/ltm.ts
@@ -1,5 +1,6 @@
import { uuidv7 } from "uuidv7";
import { db, ensureProject } from "./db";
import { config } from "./config";
import { ftsQuery, ftsQueryOr, EMPTY_QUERY, extractTopTerms } from "./search";

// ~3 chars per token — validated as best heuristic against real API data.
@@ -153,8 +154,11 @@ export function forProject(

type Scored = { entry: KnowledgeEntry; score: number };

/** BM25 column weights for knowledge_fts: title, content, category. */
const FTS_WEIGHTS = { title: 6.0, content: 2.0, category: 3.0 };
/** BM25 column weights for knowledge_fts: title, content, category.
* Read from config().search.ftsWeights; defaults come from the config schema. */
function ftsWeights() {
return config().search.ftsWeights;
}

/** Max entries per pool to include on first turn when no session context exists. */
const NO_CONTEXT_FALLBACK_CAP = 10;
@@ -180,7 +184,7 @@ function scoreEntriesFTS(sessionContext: string): Map<string, number> {
if (!terms.length) return new Map();

const q = terms.map((t) => `${t}*`).join(" OR ");
const { title, content, category } = FTS_WEIGHTS;
const { title, content, category } = ftsWeights();

try {
const results = db()
@@ -410,7 +414,7 @@ export function search(input: {
AND k.confidence > 0.2
ORDER BY bm25(knowledge_fts, ?, ?, ?) LIMIT ?`;

const { title, content, category } = FTS_WEIGHTS;
const { title, content, category } = ftsWeights();
const ftsParams = pid
? [q, pid, title, content, category, limit]
: [q, title, content, category, limit];
@@ -452,7 +456,7 @@ export function searchScored(input: {
if (q === EMPTY_QUERY) return [];

const pid = input.projectPath ? ensureProject(input.projectPath) : null;
const { title, content, category } = FTS_WEIGHTS;
const { title, content, category } = ftsWeights();

const ftsSQL = pid
? `SELECT k.*, bm25(knowledge_fts, ?, ?, ?) as rank FROM knowledge k
15 changes: 15 additions & 0 deletions src/prompt.ts
@@ -431,3 +431,18 @@ export function formatKnowledge(

return serialize(root(...children));
}

// ---------------------------------------------------------------------------
// Query expansion (Phase 4)
// ---------------------------------------------------------------------------

export const QUERY_EXPANSION_SYSTEM = `You are a search query expander for a code knowledge base. Given a search query, generate 2–3 alternative queries that would help find relevant results. Focus on:
- Synonyms and related technical terms
- Different phrasings of the same concept
- Broader or narrower scopes

Return ONLY a JSON array of strings. No explanation, no markdown.

Example:
Input: "SQLite FTS5 ranking"
Output: ["full text search scoring SQLite", "BM25 relevance ranking database", "FTS5 match order by rank"]`;
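The prompt requests a bare JSON array, but model output can drift; a defensive parse on the caller's side might look like this (hypothetical sketch, the actual `expandQuery` implementation in `src/search.ts` may differ):

```typescript
// Parse the expander's reply: strip optional markdown fences, validate that
// the payload is an array of strings, and fall back to the original query on
// any malformed output. The original query is kept first in the result.
function parseExpansion(raw: string, original: string): string[] {
  const cleaned = raw.trim().replace(/^```(?:json)?\s*|\s*```$/g, "");
  try {
    const parsed = JSON.parse(cleaned);
    if (Array.isArray(parsed) && parsed.every((q) => typeof q === "string")) {
      return [original, ...parsed.filter((q) => q !== original)];
    }
  } catch {
    // malformed JSON: fall through to the original-only fallback
  }
  return [original];
}
```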
156 changes: 92 additions & 64 deletions src/reflect.ts
@@ -1,10 +1,14 @@
import { tool } from "@opencode-ai/plugin/tool";
import type { createOpencodeClient } from "@opencode-ai/sdk";
import * as temporal from "./temporal";
import * as ltm from "./ltm";
import * as log from "./log";
import { db, ensureProject } from "./db";
import { ftsQuery, ftsQueryOr, EMPTY_QUERY, reciprocalRankFusion } from "./search";
import { ftsQuery, ftsQueryOr, EMPTY_QUERY, reciprocalRankFusion, expandQuery } from "./search";
import { serialize, inline, h, p, ul, lip, liph, t, root } from "./markdown";
import type { LoreConfig } from "./config";

type Client = ReturnType<typeof createOpencodeClient>;

type Distillation = {
id: string;
@@ -186,7 +190,12 @@ function formatFusedResults(
return serialize(root(h(2, "Recall Results"), ul(items)));
}

export function createRecallTool(projectPath: string, knowledgeEnabled = true): ReturnType<typeof tool> {
export function createRecallTool(
projectPath: string,
knowledgeEnabled = true,
client?: Client,
searchConfig?: LoreConfig["search"],
): ReturnType<typeof tool> {
return tool({
description:
"Search your persistent memory for this project. Your visible context is a trimmed window — older messages, decisions, and details may not be visible to you even within the current session. Use this tool whenever you need information that isn't in your current context: file paths, past decisions, user preferences, prior approaches, or anything from earlier in this conversation or previous sessions. Always prefer recall over assuming you don't have the information. Searches long-term knowledge, distilled history, and raw message archives.",
@@ -206,84 +215,103 @@ async execute(args, context) {
async execute(args, context) {
const scope = args.scope ?? "all";
const sid = context.sessionID;
const limit = searchConfig?.recallLimit ?? 10;

// If the query is all stopwords / single chars, short-circuit with guidance
if (ftsQuery(args.query) === EMPTY_QUERY) {
return "Query too vague — try using specific keywords, file names, or technical terms.";
}

// Run scored searches across all sources
const knowledgeResults: ltm.ScoredKnowledgeEntry[] = [];
if (knowledgeEnabled && scope !== "session") {
// Optional query expansion: generate alternative phrasings via LLM
let queries = [args.query];
if (searchConfig?.queryExpansion && client && sid) {
try {
knowledgeResults.push(
...ltm.searchScored({
query: args.query,
projectPath,
limit: 10,
}),
);
queries = await expandQuery(client, args.query, sid);
} catch (err) {
log.error("recall: knowledge search failed:", err);
log.info("recall: query expansion failed, using original:", err);
}
}

const distillationResults: ScoredDistillation[] = [];
if (scope !== "knowledge") {
try {
distillationResults.push(
...searchDistillationsScored({
projectPath,
query: args.query,
sessionID: scope === "session" ? sid : undefined,
limit: 10,
}),
);
} catch (err) {
log.error("recall: distillation search failed:", err);
// Run scored searches for each query variant; the original query is always
// first. Every variant's results feed the same RRF fusion, so items matched
// by several phrasings rank higher.
const allRrfLists: Array<{ items: TaggedResult[]; key: (r: TaggedResult) => string }> = [];

for (const query of queries) {
const knowledgeResults: ltm.ScoredKnowledgeEntry[] = [];
if (knowledgeEnabled && scope !== "session") {
try {
knowledgeResults.push(
...ltm.searchScored({
query,
projectPath,
limit,
}),
);
} catch (err) {
log.error("recall: knowledge search failed:", err);
}
}
}

const temporalResults: temporal.ScoredTemporalMessage[] = [];
if (scope !== "knowledge") {
try {
temporalResults.push(
...temporal.searchScored({
projectPath,
query: args.query,
sessionID: scope === "session" ? sid : undefined,
limit: 10,
}),
);
} catch (err) {
log.error("recall: temporal search failed:", err);
const distillationResults: ScoredDistillation[] = [];
if (scope !== "knowledge") {
try {
distillationResults.push(
...searchDistillationsScored({
projectPath,
query,
sessionID: scope === "session" ? sid : undefined,
limit,
}),
);
} catch (err) {
log.error("recall: distillation search failed:", err);
}
}

const temporalResults: temporal.ScoredTemporalMessage[] = [];
if (scope !== "knowledge") {
try {
temporalResults.push(
...temporal.searchScored({
projectPath,
query,
sessionID: scope === "session" ? sid : undefined,
limit,
}),
);
} catch (err) {
log.error("recall: temporal search failed:", err);
}
}

allRrfLists.push(
{
items: knowledgeResults.map((item) => ({
source: "knowledge" as const,
item,
})),
key: (r) => `k:${r.item.id}`,
},
{
items: distillationResults.map((item) => ({
source: "distillation" as const,
item,
})),
key: (r) => `d:${r.item.id}`,
},
{
items: temporalResults.map((item) => ({
source: "temporal" as const,
item,
})),
key: (r) => `t:${r.item.id}`,
},
);
}

// Fuse results using Reciprocal Rank Fusion
const fused = reciprocalRankFusion<TaggedResult>([
{
items: knowledgeResults.map((item) => ({
source: "knowledge" as const,
item,
})),
key: (r) => `k:${r.item.id}`,
},
{
items: distillationResults.map((item) => ({
source: "distillation" as const,
item,
})),
key: (r) => `d:${r.item.id}`,
},
{
items: temporalResults.map((item) => ({
source: "temporal" as const,
item,
})),
key: (r) => `t:${r.item.id}`,
},
]);
// Fuse results using Reciprocal Rank Fusion across all query variants
const fused = reciprocalRankFusion<TaggedResult>(allRrfLists);

return formatFusedResults(fused, 20);
},