Feat/v050 phase 1 retrieval #5

Merged
AdityaVG13 merged 10 commits into master from feat/v050-phase-1-retrieval on Apr 7, 2026
@AdityaVG13
Owner

No description provided.

AdityaVG13 and others added 10 commits April 6, 2026 18:55
Adds two new functions to db.rs:
- auto_repair(path, timestamp): dumps all data tables from a corrupted DB
  via SELECT *, rebuilds a fresh DB with initialize_schema(), repopulates
  FTS, verifies integrity, then atomically renames old -> .corrupt.<ts>
  and new -> original path. Never deletes the corrupted original.
- quick_check(conn): fast PRAGMA quick_check wrapper for runtime B-tree checks.

Includes unit tests for both: test_quick_check_clean_db and
test_auto_repair_recovers_data (writes then corrupts a temp DB file).
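
The final rename step of auto_repair could look roughly like the sketch below. It is an illustration only, not the code in db.rs: the function name swap_in_repaired, its signature, and the string timestamp are assumptions. Each rename is atomic on a single POSIX filesystem, but the pair is not, and the sketch does not handle a crash between the two calls.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: quarantine the corrupted DB as `<original>.corrupt.<ts>`
/// and move the rebuilt file into its place. The corrupted file is never deleted.
pub fn swap_in_repaired(original: &Path, repaired: &Path, timestamp: &str) -> io::Result<PathBuf> {
    // Build "<original>.corrupt.<timestamp>" without lossy string conversion.
    let mut quarantined = original.as_os_str().to_owned();
    quarantined.push(format!(".corrupt.{}", timestamp));
    let quarantined = PathBuf::from(quarantined);

    // Step 1: old -> .corrupt.<ts>
    fs::rename(original, &quarantined)?;
    // Step 2: new -> original path
    fs::rename(repaired, original)?;
    Ok(quarantined)
}
```

Returning the quarantine path lets the caller log where the corrupted original ended up.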
state.rs:
- Adds db_corrupted: Arc<AtomicBool> field to RuntimeState.
- initialize(): runs PRAGMA integrity_check before accepting requests,
  logs "[cortex] DB integrity: OK" on clean, calls auto_repair() on
  corruption. On repair success, reopens and continues normal startup.
  On repair failure, continues in degraded mode with db_corrupted=true.

main.rs:
- Adds 30-minute background task that runs quick_check() on db_read.
  Sets db_corrupted=true on failure with prominent stderr warning.
  Clears the flag if a subsequent check passes (post-manual repair).

handlers/health.rs:
- Exposes db_corrupted in /health response.
- Sets status="degraded" and degraded=true when db_corrupted is set.
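
A minimal sketch of how the shared flag might tie the periodic check to the /health response. The RuntimeState here is a stand-in with only the one field; record_check, health_status, the "ok" status string, and the warning text are assumptions for illustration, not the daemon's actual code.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the daemon's RuntimeState; only the flag is modelled.
#[derive(Clone, Default)]
pub struct RuntimeState {
    pub db_corrupted: Arc<AtomicBool>,
}

/// Called by the periodic checker with the outcome of a quick check.
/// A failed check sets the flag; a later passing check clears it again.
pub fn record_check(state: &RuntimeState, check_passed: bool) {
    let was_corrupted = state.db_corrupted.swap(!check_passed, Ordering::SeqCst);
    if !check_passed && !was_corrupted {
        // Illustrative wording; the real stderr warning text is not shown in the PR.
        eprintln!("[cortex] WARNING: DB quick_check failed, running in degraded mode");
    }
}

/// Returns (status, degraded) as the health handler would report them.
/// The healthy status string "ok" is an assumption.
pub fn health_status(state: &RuntimeState) -> (&'static str, bool) {
    let degraded = state.db_corrupted.load(Ordering::SeqCst);
    (if degraded { "degraded" } else { "ok" }, degraded)
}
```

Because the flag lives behind an Arc, a clone of the state held by the background task and the copy read by the handler observe the same value.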
…rolling backups

Phase 5C: Crash-Safe WAL Handling
- WAL checkpoint interval reduced from 60s to 10s (limits data loss to <10s on crash)
- WAL recovery on startup via PRAGMA wal_checkpoint(TRUNCATE)
- Added documentation comment explaining why synchronous=NORMAL is safe with WAL

Phase 5B: Rolling Backups
- Daily automatic backup in WAL checkpoint loop (checks >24h since last backup)
- Backup rotation: keep max 7 daily backups, remove oldest
- `cortex backup` CLI command for manual backup with forced WAL checkpoint
- `cortex restore <file>` CLI command with:
  - Pre-restore backup (preserves current DB as backup)
  - Integrity verification before accepting restore
  - Automatic rollback if integrity check fails
  - Safety checks for running daemon
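
The daily-backup trigger and the 7-file rotation can be sketched as two pure functions. This is an assumption-laden illustration: backup_due and backups_to_prune are hypothetical names, and the sketch assumes backup file names embed a sortable timestamp so that lexicographic order is chronological.

```rust
use std::time::{Duration, SystemTime};

/// True when no backup exists yet or the last one is more than 24h old.
pub fn backup_due(last_backup: Option<SystemTime>, now: SystemTime) -> bool {
    match last_backup {
        None => true,
        Some(last) => now
            .duration_since(last)
            .map(|elapsed| elapsed > Duration::from_secs(24 * 60 * 60))
            // Clock moved backwards: skip this round rather than guess.
            .unwrap_or(false),
    }
}

/// Given all backup file names, return the oldest ones that must be removed
/// so that at most `keep` remain. Assumes names sort chronologically.
pub fn backups_to_prune(mut names: Vec<String>, keep: usize) -> Vec<String> {
    names.sort();
    let excess = names.len().saturating_sub(keep);
    names.truncate(excess); // the first `excess` entries are the oldest
    names
}
```

Keeping the decision logic free of filesystem calls makes it easy to unit-test; the caller would delete the returned files after a new backup has been written.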

All changes tested: cargo test passes (67 tests), cargo clippy passes.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Implements rrf_fuse(lists, k) -- Reciprocal Rank Fusion per Cormack et al.
Score = sum(1/(k+rank+1)) across lists, with 0-based rank and k=60.0. Returns descending by fused score.
Five unit tests covering: single list, two-list agreement, middle promotion,
empty input, empty list. Wire-up deferred to Task 1.5 (pipeline integration).
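
A minimal sketch of the fusion described above, assuming document ids are Strings and rank is the 0-based position in each list. The signature and the tie-break by id are assumptions; the actual rrf_fuse in the PR may differ.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over several ranked lists of document ids.
/// Each occurrence of an id at 0-based position `rank` contributes
/// 1 / (k + rank + 1) to that id's fused score.
pub fn rrf_fuse(lists: &[Vec<String>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (rank, id) in list.iter().enumerate() {
            *scores.entry(id.clone()).or_insert(0.0) += 1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Descending by fused score; ties broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

With lists [x, y, z] and [y, z] and k=60, y and z each appear twice and overtake x, which appears only once at the top of a single list. This is the kind of "middle promotion" the unit tests mention.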
…1.4)

Phase 1 Task 1.4: Compound Scoring Function

Added three helper functions to calculate compound scores combining RRF rank, importance, and recency:

- days_since(created_at): Calculate elapsed days from ISO 8601 timestamp, handling invalid timestamps gracefully (returns MAX)
- normalize(importance): Normalize DB score field (typically 0-100) to 0.0-1.0 range with clamping
- compound_score(rrf, importance, created_at): Compound formula = rrf * 0.6 + importance_norm * 0.2 + recency * 0.2
  Recency decays as exp(-days/30): a 30-day time constant, i.e. a half-life of about 21 days (30 * ln 2)

All functions marked with #[allow(dead_code)] -- wire-up into pipeline is Task 1.5's responsibility.
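
The formula can be sketched as follows. To stay dependency-free, this version takes the elapsed days directly instead of parsing created_at, and it factors the decay into a hypothetical recency helper; both are assumptions, as is treating importance as an f64 on a 0-100 scale.

```rust
/// Map a raw importance score (assumed 0-100) onto 0.0..=1.0, clamping out-of-range input.
pub fn normalize(importance: f64) -> f64 {
    (importance / 100.0).clamp(0.0, 1.0)
}

/// Exponential recency decay with a 30-day time constant.
/// The half-life is 30 * ln 2, roughly 20.8 days. Negative ages are treated as 0.
pub fn recency(days: f64) -> f64 {
    (-days.max(0.0) / 30.0).exp()
}

/// Weighted blend: 60% RRF score, 20% normalized importance, 20% recency.
pub fn compound_score(rrf: f64, importance: f64, days: f64) -> f64 {
    rrf * 0.6 + normalize(importance) * 0.2 + recency(days) * 0.2
}
```

An invalid timestamp mapped to a very large day count drives the recency term to 0, so such entries are ranked on RRF and importance alone. Note that a raw RRF score fused from two lists with k=60 is at most about 0.033, so how much the 0.6 weight matters in practice depends on whether the pipeline rescales it; the commit message does not say.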

Added 3 unit tests:
- test_days_since: Validates day calculations for today, yesterday, and invalid dates
- test_normalize: Verifies normalization and clamping for 0-100 range and edge cases
- test_compound_score: Tests compound score calculations with high/low RRF, importance, and recency

All 81 tests pass. Zero clippy warnings.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
AdityaVG13 merged commit c14374b into master on Apr 7, 2026
5 checks passed
AdityaVG13 deleted the feat/v050-phase-1-retrieval branch on April 14, 2026 07:41
AdityaVG13 added a commit that referenced this pull request Apr 24, 2026
Scoped low-risk deletion of two modules with zero call sites. Confirmed
unreferenced across the full workspace (grep on crate::logging,
crate::mcp_stdio, logging::, mcp_stdio::, log_line, log_path — each
returned only self-references inside the files themselves).

Per scope-lock.md Question #5 rules:
  - Cap ~300 LOC (this pass: 166 LOC removed, well under)
  - Per-file risk annotation (see table below)
  - No high-risk deletes (both classified low-risk)
  - Separate commit (this one)

Per-file risk table:

  daemon-rs/src/logging.rs               deleted   LOW risk
    27 LOC. Two helpers (log_path, log_line), both behind
    #[allow(dead_code)]. Never called. Superseded by tracing/eprintln
    diagnostic paths used throughout the daemon. Git history preserves
    if a file-based audit log is ever reintroduced.

  daemon-rs/src/mcp_stdio.rs             deleted   LOW risk
    139 LOC. Stdio-transport MCP runner. Module was declared with
    #[allow(dead_code)] guarding the whole module and no external
    import. Superseded by the axum HTTP MCP path
    (handlers/mcp.rs::handle_mcp_message_with_caller) called from
    server.rs. If a future release wants a pure-stdio MCP binary,
    recover from git and wire it as a separate bin target.

  daemon-rs/src/main.rs                  edited    LOW risk
    Removed `mod logging;`, `mod mcp_stdio;`, and the
    `#[allow(dead_code)]` attr that was masking mcp_stdio's dead state.

Intentional #[allow(dead_code)] left in place (false positives per
Repowise classification — kept for discoverability or future use):
  - auth.rs:31,383 — path-resolver internals
  - state.rs:42,236,252,283 — event payload, mcp_sessions cache,
    db_path, write_buffer_path (each with inline comment explaining
    why retained)
  - Struct-field annotations in conflict.rs, crystallize.rs, db.rs,
    diary.rs, feedback.rs, indexer.rs, recall.rs, setup.rs — all
    correspond to schema fields currently read through generic queries

Validation:
  cargo check --manifest-path daemon-rs/Cargo.toml                 # clean
  cargo clippy --manifest-path daemon-rs/Cargo.toml --all-targets
    -- -D warnings                                                  # clean
  cargo test --manifest-path daemon-rs/Cargo.toml                   # 457/457
    (434 unit + 5 doc + 11 integration + 7 recall-benchmark)