Skip to content

Rewrite CLAUDE.md and add Claude Code rules#154

Open
tlee732 wants to merge 2 commits into
indextables:mainfrom
tlee732:cleanup/claude-md
Open

Rewrite CLAUDE.md and add Claude Code rules#154
tlee732 wants to merge 2 commits into
indextables:mainfrom
tlee732:cleanup/claude-md

Conversation

@tlee732
Copy link
Copy Markdown
Contributor

@tlee732 tlee732 commented Apr 4, 2026

Summary

  • Rewrote CLAUDE.md from 1,690 lines to 75 lines — replaced cumulative changelog/feature diary with a concise project guide
  • Added 5 Claude Code rules in .claude/rules/ for consistent contributor behavior

Methodology

We ran the /claude-md-management:claude-md-improver audit skill, which:

  1. Discovered all CLAUDE.md files in the repo
  2. Evaluated each against a quality rubric (commands, architecture clarity, non-obvious patterns, conciseness, currency, actionability)
  3. Produced a scored quality report (see table below)
  4. Recommended targeted updates based on gaps found

The before/after scores below come directly from that audit. The rewrite was then reviewed interactively — we discussed what to keep from the old version (known limitations, debug env var, Python parity context), what to extract into rules (memory constants, JNI boundary), and what to drop entirely (changelog entries, stale benchmarks, local machine paths).

CLAUDE.md Quality Assessment

Criterion Before (F: 25/100) After (B+: 85/100)
Commands/workflows 5/20 — no build/test commands 18/20 — all make targets documented
Architecture clarity 8/20 — buried under "BREAKTHROUGH" headers 17/20 — directory map, API package table
Non-obvious patterns 2/15 — hard to find in 1,690 lines 13/15 — known limitations, debug env var
Conciseness 0/15 — 1,690 lines of feature announcements 14/15 — 75 lines, information-dense
Currency 5/15 — dates spanning Aug 2025 - Jan 2026, never pruned 13/15 — no stale status tracking
Actionability 5/15 — no "how to build/test/contribute" 10/15 — copy-paste ready commands

What was removed

  • Cumulative feature announcements with emoji headers ("REVOLUTIONARY BREAKTHROUGH", "MISSION ACCOMPLISHED")
  • Hardcoded test counts and coverage percentages that go stale
  • Azure OAuth setup guide (belongs in docs/)
  • Performance benchmark numbers (go stale)
  • References to local machine paths (file:/Users/schenksj/...)

What was added

  • Build & test commands (from Makefile)
  • Project structure directory map
  • Java API package reference table
  • Known limitations section (RangeAggregation not implemented, MillionRecordBulkRetrievalTest issue)
  • 5 rules in .claude/rules/:
    • memory-constants.md — always use Index.Memory constants, never hardcode heap sizes
    • jni-boundary.md — Java handles I/O, Rust handles search/indexing
    • auto-review.md — automatic code review + test coverage after every coding task
    • research-before-code.md — research and propose options before writing code
    • test-coverage-workflow.md — structured test planning with severity-prioritized table

Rationale

CLAUDE.md is re-read on every Claude Code turn. A 1,690-line changelog wastes context window on every message and makes it harder for Claude to find actionable information. The rewrite optimizes for "what do I need to know to work in this repo" rather than "what features have been shipped."

🤖 Generated with Claude Code

Replace 1,690-line cumulative changelog with 75-line project guide
covering build/test commands, directory structure, API packages,
known limitations, and architecture notes.

Add 5 Claude Code rules: memory constants, JNI boundary, auto-review,
research-before-code, and test coverage workflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@schenksj schenksj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Please address the conflicts and we can merge!

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@schenksj schenksj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall

Big improvement in signal-to-noise. The old CLAUDE.md is ~1,690 lines of cumulative "REVOLUTIONARY BREAKTHROUGH" changelog entries dated across Aug 2025 – Jan 2026 — dense with things that are either stale, duplicated in docs/, or feature announcements rather than actionable guidance. Trimming to ~75 lines is the right call, and since this file is re-read every turn the context savings compound.

The new structure (Build & Test → Project Structure → Java API Packages → Key Concepts → Known Limitations → Architecture Notes) is the right skeleton, and the rule files are a good extraction point for reusable guidance.

Verified accurate

  • Makefile targetscompile, test, test-cloud, test-all, package, clean, setup all exist.
  • Rust directory map — matches reality.
  • Most Java API packagescore, split, query, aggregation, batch, iceberg, delta, parquet, memory, config all exist.
  • Memory constants (MIN=15MB, DEFAULT=50MB, LARGE=128MB, XL=256MB) — match the codebase.
  • Debug env var TANTIVY4JAVA_DEBUG=1 — correct.
  • Package name io.indextables.tantivy4java — correct.

Issues to fix before merge

1. (Blocking) The RangeAggregation "known limitation" is factually wrong

The new CLAUDE.md says:

RangeAggregation — not implemented in the native Rust layer. Returns empty results.

This is incorrect. RangeAggregation is fully implemented:

  • native/src/split_searcher/aggregation_arrow_ffi.rs:412range_to_record_batch produces a real Arrow RecordBatch with key/doc_count/from/to columns plus sub-agg metrics, driven by BucketResult::Range { buckets } from Tantivy.
  • src/test/java/io/indextables/tantivy4java/RangeAggregationTest.java has hard assertions that would fail if results were empty — e.g. assertEquals(4L, rangeCounts.get("cheap")), assertEquals(3L, rangeCounts.get("budget")), total-doc-count checks, etc.

There is a real nearby limitation: extract_inner_buckets (aggregation_arrow_ffi.rs:603) returns Vec::new() for BucketResult::Range { .. }, meaning Range cannot be used as a nested inner bucket under another bucket aggregation for flattening purposes (the comment says "Terms only for now"). If this is what was meant, the correct phrasing is something like:

RangeAggregation as a nested sub-bucket — flattening Range buckets inside a parent bucket aggregation is not supported (see extract_inner_buckets in aggregation_arrow_ffi.rs). Top-level RangeAggregation works correctly.

Publishing "Returns empty results" is worse than not listing it at all — it will mislead contributors into avoiding a working path, and it directly contradicts a passing test suite.

2. (Minor) Missing packages in the Java API table

The table lists 10 packages; the actual layout also has:

  • result — SearchResult, Hit, aggregation results (user-facing, should be listed)
  • util — utilities
  • filter — filter APIs
  • split/merge — merge subpackage

examples is fine to omit. I'd add result at minimum since it's part of the user-facing API surface, and filter if it's non-trivial.

On the rules files

All five read well and are the right size (each <30 lines).

  • memory-constants.md — exact, grounded in a real historical incident. Good.
  • jni-boundary.md — captures a non-obvious decision rule. Good.
  • auto-review.md — This is an opt-in behavior change: every coding task will now auto-spawn a pr-review-toolkit:code-reviewer agent and run the test-coverage workflow. That's a meaningful tool/context cost on every task, not a free convention. Worth confirming the repo owner actually wants that default. The "When NOT to trigger" list is reasonable but doesn't include "small single-line fixes," which may be the highest-frequency case.
  • research-before-code.md — Strong rule. "NEVER start coding immediately" + "Propose 2-3 approaches with pros/cons" means a one-line helper fix turns into a research memo. Consider softening to "For non-trivial changes…" and leaning on the existing trivial-fix exception at the bottom.
  • test-coverage-workflow.md — The severity-prioritized table format is genuinely useful. Minor: "Review coverage once — don't write tests piecemeal" will bite in interactive debugging loops where the test set naturally evolves.

Copy link
Copy Markdown
Collaborator

@schenksj schenksj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please see comments in the conversation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants