20 changes: 20 additions & 0 deletions .claude/rules/auto-review.md
@@ -0,0 +1,20 @@
# Automatic Code Reviews

After completing a coding task (or a phase within a multi-phase plan), automatically run these reviews before reporting completion to the user:

1. **Code review** — launch the `pr-review-toolkit:code-reviewer` agent, focused on the files changed in the task/phase
2. **Test coverage** — follow the test coverage workflow rule: identify the primary use case and edge cases, classify tests by type, and present a severity-prioritized table for approval

Run step 1, then step 2. Present combined results to the user.

## When to trigger

- After finishing a single coding task
- After each phase of a multi-phase plan (not just at the end)
- After fixing issues found by a previous review round

## When NOT to trigger

- Config-only changes (settings.json, CLAUDE.md, rule files)
- Pure research / read-only tasks with no code changes
- When the user explicitly says to skip reviews
17 changes: 17 additions & 0 deletions .claude/rules/jni-boundary.md
@@ -0,0 +1,17 @@
# JNI Boundary

Java and Rust have clear responsibilities. Don't cross the boundary unnecessarily — every JNI call has overhead.

**Java side** handles:
- I/O orchestration (S3 downloads/uploads, file management)
- Process management (merge binary extraction, lifecycle)
- Configuration and API surface
- Cache coordination (SplitCacheManager)

**Rust side** handles:
- Search, indexing, and merging (Tantivy core operations)
- Disk cache storage (LRU, compression, manifest)
- Batch retrieval optimization
- Aggregation computation

When adding new functionality, ask: "Does this need Tantivy/native performance, or is it orchestration?" Put it on the right side.
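As a rough illustration, a search path might cross the boundary exactly once: the Java side resolves and caches the split locally, then hands a local path to Rust. The class and method names below (`SplitSearcher`, `nativeSearch`, `ensureCached`) are hypothetical sketches, not the project's real API:

```java
// Hypothetical sketch of the JNI boundary; SplitSearcher, nativeSearch, and
// ensureCached are illustrative names, not the project's real API.
public class SplitSearcher {

    // Rust side (via JNI): Tantivy search over an already-local split file.
    // Declared for illustration only; nothing below actually invokes it.
    private static native byte[] nativeSearch(String localPath, String query, int limit);

    // Java side: pure orchestration. Resolve the cache, download if needed,
    // then cross the JNI boundary exactly once with a local path.
    public byte[] search(String s3Uri, String query, int limit) {
        String localPath = ensureCached(s3Uri); // S3 download + cache coordination
        return nativeSearch(localPath, query, limit);
    }

    // Placeholder standing in for a SplitCacheManager lookup / S3 fetch.
    static String ensureCached(String s3Uri) {
        return "/tmp/splits/" + Integer.toHexString(s3Uri.hashCode()) + ".split";
    }
}
```

The point of the shape: per-query work happens in one native call, rather than many fine-grained crossings, each of which pays JNI overhead.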
12 changes: 12 additions & 0 deletions .claude/rules/memory-constants.md
@@ -0,0 +1,12 @@
# Memory Constants

Always use `Index.Memory` constants for heap allocation. Never hardcode byte values.

| Constant | Size | Use when |
|----------|------|----------|
| `Memory.MIN_HEAP_SIZE` | 15MB | Minimum required by Tantivy for any operation |
| `Memory.DEFAULT_HEAP_SIZE` | 50MB | Standard usage |
| `Memory.LARGE_HEAP_SIZE` | 128MB | Bulk operations, heavy indexing |
| `Memory.XL_HEAP_SIZE` | 256MB | Very large indices, high-performance scenarios |

Hardcoded heap sizes caused "memory arena needs to be at least 15000000" errors across 39+ test files. The constants were introduced to eliminate this entire class of bugs.
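For illustration, a stand-in class mirroring the table above. The real constants live on `Index.Memory`; the exact byte values here are assumptions inferred from the `15000000` error message, so check the real class before relying on them:

```java
// Hypothetical stand-in for Index.Memory, mirroring the table above.
// Byte values are assumptions (decimal MB, inferred from the
// "memory arena needs to be at least 15000000" error), not verified.
final class Memory {
    static final long MIN_HEAP_SIZE     =  15_000_000L; // Tantivy minimum
    static final long DEFAULT_HEAP_SIZE =  50_000_000L; // standard usage
    static final long LARGE_HEAP_SIZE   = 128_000_000L; // bulk operations
    static final long XL_HEAP_SIZE      = 256_000_000L; // very large indices

    private Memory() {} // constants holder, not instantiable
}
```

Call sites then pass e.g. `Index.Memory.DEFAULT_HEAP_SIZE` instead of a literal like `50000000`, so the Tantivy minimum is enforced in one place.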
11 changes: 11 additions & 0 deletions .claude/rules/research-before-code.md
@@ -0,0 +1,11 @@
# Research Before Code

When discussing any code change, NEVER start coding immediately. Always follow this sequence:

1. **Research** — Read the relevant files, understand the current implementation, check how similar things are done elsewhere in the codebase
2. **Architecture** — Consider how the change fits into the existing architecture, what it touches, what it could break
3. **Options** — Propose 2-3 approaches with pros/cons for each
4. **Recommendation** — State which option you'd pick and why
5. **Wait** — Get explicit approval before writing any code

This applies to bug fixes, new features, refactors, and infrastructure changes. The only exceptions are trivial typo fixes or config value changes where there's only one obvious approach.
28 changes: 28 additions & 0 deletions .claude/rules/test-coverage-workflow.md
@@ -0,0 +1,28 @@
# Test Coverage Workflow

When writing or reviewing tests for a code change, follow this sequence:

1. **Understand the primary use case** — Ask what this code is supposed to do in production. What's the happy path?
2. **Identify edge cases** — Null/empty inputs, boundary values, concurrent access, error conditions, malformed input
3. **Classify tests by type:**
- **Unit tests** — isolated logic, mocked dependencies, fast
- **Integration tests** — real dependencies (S3, Kafka, Spark), tagged separately
- **E2E tests** — full pipeline validation, run last
4. **Review coverage once** — don't write tests piecemeal. Compile the full list first.
5. **Present to user** — Show a table with each test, its type, and severity before writing any test code:

| Test | Type | Severity | Why |
|------|------|----------|-----|
| `search_withNullQuery_throwsIllegalArgument` | Unit | Critical | Null queries crash in production |
| `search_emptyIndex_returnsZeroResults` | Unit | High | Common cold-start scenario |
| `search_withKafkaTimeout_retriesAndSucceeds` | Integration | Medium | Transient failures in prod |

Severity levels: **Critical** (data loss, crash), **High** (wrong results, degraded UX), **Medium** (edge case, unlikely path), **Low** (cosmetic, defensive)

6. **Wait for approval** — User may reprioritize or cut scope before you start writing tests.
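Once approved, each table row becomes one focused test. A minimal self-contained sketch of the first two rows; `Searcher` here is a placeholder class, and the null-check contract is an assumption, not the project's actual API:

```java
import java.util.Collections;
import java.util.List;

// Placeholder implementation so the sketch is self-contained; the real
// searcher and its contract live elsewhere in the project.
class Searcher {
    List<String> search(String query) {
        if (query == null) {
            throw new IllegalArgumentException("query must not be null");
        }
        return Collections.emptyList(); // empty index: zero results
    }
}

class SearcherSketchTest {
    // search_withNullQuery_throwsIllegalArgument (Unit, Critical)
    static void nullQueryThrows() {
        try {
            new Searcher().search(null);
            throw new AssertionError("expected IllegalArgumentException");
        } catch (IllegalArgumentException expected) {
            // pass: null queries are rejected at the API boundary
        }
    }

    // search_emptyIndex_returnsZeroResults (Unit, High)
    static void emptyIndexReturnsZero() {
        if (!new Searcher().search("anything").isEmpty()) {
            throw new AssertionError("expected zero results on empty index");
        }
    }
}
```

Writing all approved rows in one pass like this, rather than piecemeal, is what step 4 above asks for.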

## When to trigger

- After completing any coding task (as part of the auto-review flow)
- When the user asks for tests explicitly
- After each phase of a multi-phase plan