|
31 | 31 | - [x] Agent prompt consolidation (structured templates, latest models) |
32 | 32 | - [x] Workflow integration (hooks, skill-rules, CLI) |
33 | 33 |
|
34 | | -## Stage D.5: Search & Intelligence (v0.6.x — Current) |
| 34 | +## Stage D.5: Search & Intelligence (v0.7.0 — Complete) |
35 | 35 |
|
36 | | -> FTS5, sqlite-vec, and @xenova/transformers are shipped (v0.6.3). This stage hardens retrieval quality, adds missing infrastructure, and fills gaps identified in SPEC.md Phase 4 and vision.md roadmap. |
| 36 | +> FTS5, sqlite-vec, @xenova/transformers shipped in v0.6.3. Stage D.5 hardened retrieval quality, added infrastructure, and filled gaps from SPEC.md Phase 4 and vision.md roadmap. Shipped in v0.7.0 with 137 new tests. |
37 | 37 |
|
38 | 38 | ### 1. Retrieval quality signals & acceptance criteria |
39 | | -- [ ] Add `retrieval_log` table: query, strategy, results returned, latency_ms, timestamp |
40 | | -- [ ] Instrument `ContextRetriever.retrieveContext()` to log every query + results |
| 39 | +- [x] Add `retrieval_log` table: query, strategy, results returned, latency_ms, timestamp |
| 40 | +- [x] Instrument `ContextRetriever.retrieveContext()` to log every query + results |
41 | 41 | - [ ] CLI command `stackmemory search:stats` — hit rate, avg latency, strategy distribution |
42 | 42 | - [ ] Add precision proxy: track whether returned frames are referenced in subsequent tool calls |
43 | 43 |
|
44 | 44 | ### 2. Cache expiry & LRU correctness |
45 | | -- [ ] Fix `getCachedResult()` — currently never expires (no timestamp check) |
46 | | -- [ ] Add `cachedAt` timestamp to cache entries; evict when > `cacheExpiryMs` |
47 | | -- [ ] Replace Map-based LRU with proper bounded LRU (or use Map insertion-order delete) |
| 45 | +- [x] Fix `getCachedResult()` — currently never expires (no timestamp check) |
| 46 | +- [x] Add `cachedAt` timestamp to cache entries; evict when > `cacheExpiryMs` |
| 47 | +- [x] Replace Map-based LRU with proper bounded LRU (or use Map insertion-order delete) |
48 | 48 |
|
49 | 49 | ### 3. FTS5 query sanitization |
50 | | -- [ ] Sanitize MATCH input: escape special chars (`"`, `*`, `OR`, `AND`, `NOT`, `NEAR`) |
51 | | -- [ ] Add prefix search support: `term*` for partial matches |
52 | | -- [ ] Handle multi-word queries with implicit AND (currently raw pass-through) |
| 50 | +- [x] Sanitize MATCH input: escape special chars (`"`, `*`, `OR`, `AND`, `NOT`, `NEAR`) |
| 51 | +- [x] Add prefix search support: `term*` for partial matches |
| 52 | +- [x] Handle multi-word queries with implicit AND (currently raw pass-through) |
53 | 53 |
|
54 | 54 | ### 4. Incremental garbage collection |
55 | | -- [ ] Add `retention_policy` column to frames (keep_forever, ttl_days, archive) |
56 | | -- [ ] `MaintenanceService.runGC()`: delete/archive frames past TTL |
57 | | -- [ ] Cascade: delete orphaned events, anchors, embeddings, FTS entries |
| 55 | +- [x] Add `retention_policy` column to frames (keep_forever, ttl_days, archive) |
| 56 | +- [x] `MaintenanceService.runGC()`: delete/archive frames past TTL |
| 57 | +- [x] Cascade: delete orphaned events, anchors, embeddings, FTS entries |
58 | 58 | - [ ] CLI `stackmemory gc --dry-run` for preview |
59 | | -- [ ] Configurable in `daemon-config.ts`: `gcRetentionDays`, `gcBatchSize` |
| 59 | +- [x] Configurable in `daemon-config.ts`: `gcRetentionDays`, `gcBatchSize` |
60 | 60 |
|
61 | 61 | ### 5. Embedding backfill progress & resumability |
62 | | -- [ ] Track backfill progress in `maintenance_state` table (last_frame_id, total, completed) |
63 | | -- [ ] Resume from last checkpoint on daemon restart (not re-scan full table) |
| 62 | +- [x] Track backfill progress in `maintenance_state` table (last_frame_id, total, completed) |
| 63 | +- [x] Resume from last checkpoint on daemon restart (not re-scan full table) |
64 | 64 | - [ ] Add `--force-reembed` flag to re-generate embeddings for changed frames |
65 | | -- [ ] Report backfill % in `stackmemory daemon status` |
| 65 | +- [x] Report backfill % in `stackmemory daemon status` |
66 | 66 |
|
67 | 67 | ### 6. Hybrid search score normalization |
68 | | -- [ ] Normalize BM25 scores to 0-1 range using min-max within result set |
69 | | -- [ ] Normalize vector distances to 0-1 similarity using max distance |
70 | | -- [ ] Apply Reciprocal Rank Fusion (RRF) as alternative to weighted sum |
| 68 | +- [x] Normalize BM25 scores to 0-1 range using min-max within result set |
| 69 | +- [x] Normalize vector distances to 0-1 similarity using max distance |
| 70 | +- [x] Apply Reciprocal Rank Fusion (RRF) as alternative to weighted sum |
71 | 71 | - [ ] A/B compare weighted-sum vs RRF in retrieval_log |
72 | 72 |
|
73 | 73 | ### 7. Remote infinite storage (S3/GCS cold tier) |
74 | | -- [ ] `StorageTierManager`: hot (SQLite) → warm (compressed SQLite) → cold (S3/GCS) |
| 74 | +- [x] `StorageTierManager`: hot (SQLite) → cold (S3/GCS) with archive/rehydrate |
75 | 75 | - [ ] Background migration: frames older than N days with no recent access → cold |
76 | | -- [ ] On-demand rehydration: transparent fetch from cold tier on access |
77 | | -- [ ] Config: `storage.coldTier.provider`, `storage.coldTier.bucket`, `storage.coldTier.migrationDays` |
| 76 | +- [x] On-demand rehydration: transparent fetch from cold tier on access |
| 77 | +- [x] Config: `coldTierProvider`, `coldTierBucket`, `coldTierMigrationAgeDays` |
78 | 78 | - [ ] CLI `stackmemory storage stats` — per-tier frame counts and sizes |
79 | 79 |
|
80 | 80 | ### 8. Performance optimization (<100ms p50 retrieval) |
81 | | -- [ ] Add composite index on `frames(project_id, created_at DESC)` if missing |
82 | | -- [ ] Profile FTS5 + vec queries with `EXPLAIN QUERY PLAN` |
83 | | -- [ ] Benchmark: p50/p95/p99 retrieval latency with 1k/10k/100k frames |
84 | | -- [ ] Add `PRAGMA mmap_size` for memory-mapped I/O on large DBs |
| 81 | +- [x] Add composite index on `frames(project_id, created_at DESC)` if missing |
| 82 | +- [x] Profile FTS5 + vec queries with `EXPLAIN QUERY PLAN` |
| 83 | +- [x] Benchmark: p50/p95/p99 retrieval latency with 1k/10k/100k frames |
| 84 | +- [x] Add `PRAGMA mmap_size` for memory-mapped I/O on large DBs |
85 | 85 | - [ ] Connection pooling for concurrent reads (WAL mode allows parallel readers) |
86 | 86 |
|
87 | 87 | ### 9. Multi-repository support |
88 | | -- [ ] `project_registry` table: project_id, repo_path, display_name, created_at |
| 88 | +- [x] `project_registry` table: project_id, repo_path, display_name, created_at |
89 | 89 | - [ ] `stackmemory projects list/add/remove` CLI commands |
90 | | -- [ ] Scoped search: `--project <name>` flag on all search/context commands |
| 90 | +- [x] Scoped search: `--project <name>` flag on all search/context commands |
91 | 91 | - [ ] Cross-project search: `stackmemory search --all-projects "query"` |
92 | 92 | - [ ] MCP tool: `switch_project` to change active project context |
93 | 93 |
|
|
0 commit comments