docs: post-sprint sweep — analyzer docs, installation, semconv, release-testing playbooks #67

Merged
anilmurty merged 4 commits into main from task-5-docs-sweep on May 28, 2026

@anilmurty
Contributor

@anilmurty anilmurty commented May 28, 2026

Cross-cutting docs pass after the May-26 strategy-pivot sprint (PR #66) landed. Covers everything the new code surfaces but the original PR didn't document yet — plus the housekeeping that fell out of the sprint queue.

Commits

4 commits, no code changes. All four are docs-only:

| Commit | What |
| --- | --- |
| 0d2895d | Task-5 consistency sweep across docs/ |
| 2b3037c | CLAUDE.md "Further Reading" expansion + remove sprint task queue |
| 0ae522d | Drop sprint-internal v1.1 spec; keep docs/internal/specs/ for future canonical specs |
| c4d2fea | Rewrite manual release-testing playbooks for v0.3.x |

What landed

New docs

  • docs/installation.md — base install vs optional-extras matrix. Documents tokenjam[bloat] (~2GB LLMLingua/torch extra used by the Trim analyzer) with the rationale for keeping it out of base; lists all framework adapter extras plus [mcp] / [dev]. New first-run guide.
  • docs/optimize/downsize.md — the one optimize product that didn't have its own page. Documents the model-downgrade analyzer with the candidate-pair table, plan-tier-aware rendering branches, and the mandatory caveat.
  • docs/backfill/helicone.md and docs/backfill/otlp.md — match the langfuse.md structure (modes / field mapping / idempotency / limitations).
  • docs/architecture.md → "OTel semconv extensions" section — full derivation rules for tokenjam.billing_account (span attribute) + tokenjam.plan_tier (session-level), pricing_mode mapping, and why plan_tier lives on SessionRecord rather than each span.
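
The session-level placement of plan_tier described in the last bullet can be pictured with a minimal sketch. Only the two attribute keys come from this PR; the `Span` class, the simplified `SessionRecord`, and the `pricing_mode` mapping are hypothetical placeholders, and the real derivation rules are the ones documented in docs/architecture.md.

```python
from dataclasses import dataclass, field
from typing import Optional

# Attribute keys named in this PR description.
BILLING_ACCOUNT_KEY = "tokenjam.billing_account"  # span attribute
PLAN_TIER_KEY = "tokenjam.plan_tier"              # session-level


@dataclass
class Span:
    """Hypothetical stand-in for one span; not the real model."""
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class SessionRecord:
    """Hypothetical, simplified stand-in for the SessionRecord model."""
    session_id: str
    plan_tier: Optional[str] = None  # stored once per session, not per span
    spans: list = field(default_factory=list)


def pricing_mode(session: SessionRecord) -> str:
    """Placeholder mapping, assumed for illustration only."""
    if session.plan_tier is None:
        return "unknown"
    return "api" if session.plan_tier == "api" else "subscription"
```

In this shape the renderer reads the tier once from the session, while each span only carries the account it was billed to.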

Updated docs

  • docs/backfill/overview.md — lists all four sources (claude-code / langfuse / helicone / otlp) with partnership-posture framing.
  • docs/optimize/cache.md / script.md / trim.md — consistent treatment: heading leads with the product name, CLI flag example up front, "See also" cross-links to the other three product pages.
  • CHANGELOG.md — reorganised ## Unreleased as a coherent release narrative (headline → product groupings → Changed / Internal). Facts unchanged.
  • CLAUDE.md
    • Expanded "Further Reading" from a single architecture.md pointer into a structured list pointing at every new docs page.
    • Cross-reference from the core/models.py Key-Modules entry to the new "OTel semconv extensions" section in architecture.md.
    • Drop the bullet referencing docs/internal/specs/v1.1-honest-output.md (file removed; see below).

Sprint-internal cleanup

  • Removed docs/internal/tasks/ (14 files: the sprint's per-task specs + decisions-locked.md). The sprint is merged via #66 (Strategy pivot: Layer-9 cost-optimization analyzers + ingest adapters); the per-task specs were ephemeral by design and would risk drifting out of date if kept.
  • Removed docs/internal/specs/v1.1-honest-output.md (the spec is now captured in the merged code + docs/optimize/downsize.md + CHANGELOG). Kept docs/internal/specs/ (as .gitkeep) so future features can drop canonical specs there when warranted.

Manual release-testing playbooks rewritten

tests/manual-pre-release-testing.md and tests/manual-new-release-tests.md covered none of the features added in the sprint. Both were rewritten:

  • Pre-release — restructured into 14 numbered sections. New blocks for:
    • Cost-optimization analyzers (one block per product — Downsize / Cache / Script / Trim — covering prereqs, prereq-missing fallback paths, caveat enforcement)
    • Plan-tier-aware rendering (verify subscription headers don't contain "spend" against a dollar figure; --reconfigure; unknown-tier suppression)
    • Backfill adapters (against committed fixtures so the smoke is hermetic — no live API keys needed for langfuse / helicone / otlp; verifies idempotency on re-run)
    • Period comparison (--compare keywords + explicit ranges)
    • Config export (--export-config claude-code verifies caveat comments + absence of --apply)
    • Policy list
    • Trimmed: web-UI section (was 5 numbered subsections with sidebar/theme/CSS bullets — now one spot-check block, since those are one-time UI work) and redundant tj uninstall && rm -rf && onboard cycles between steps.
  • Post-release smoke — tightened into a true smoke check. Every analyzer is exercised (not deep behavior — that lived in pre-release). All three backfill adapters smoke against fixtures. One canonical "pass criteria" table at the bottom mapped to step numbers.
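
The idempotency check the backfill blocks rely on can be sketched as follows. This is a minimal illustration, not the adapters' actual logic: the `(source, id)` key and the in-memory `store` dict are assumptions standing in for whatever stable identifier and storage the real adapters use.

```python
from typing import Iterable


def backfill(store: dict, records: Iterable[dict]) -> int:
    """Insert records keyed by a stable id; return how many were new."""
    added = 0
    for record in records:
        key = (record["source"], record["id"])  # hypothetical stable key
        if key not in store:
            store[key] = record
            added += 1
    return added


def check_idempotent(records: list) -> bool:
    """True if a second run over the same fixture adds nothing."""
    store: dict = {}
    first = backfill(store, records)
    second = backfill(store, records)
    unique = len({(r["source"], r["id"]) for r in records})
    return first == unique and second == 0
```

Running the same committed fixture twice and asserting that the second pass adds zero rows is what keeps the smoke hermetic and repeatable.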

Honest gap noted at handoff: the playbooks reliably exercise Downsize + Cache-efficacy with positive findings, but Cache-recommend / Script / Trim mostly hit their negative / fallback paths because the example data doesn't satisfy their prereqs (capture.prompts off; <20 identical sessions; bloat extra not installed). A synthetic optimize demo to fill that gap will land in a separate PR.
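
The three prerequisites named above can be expressed as a small check, sketched here under assumptions. The pairing of each prerequisite with an analyzer follows the order of the sentence above, the function name is hypothetical, and `llmlingua` is assumed to be the import name the bloat extra provides.

```python
import importlib.util

MIN_IDENTICAL_SESSIONS = 20  # threshold quoted in this PR description


def missing_prereqs(capture_prompts: bool, identical_sessions: int,
                    bloat_module: str = "llmlingua") -> dict:
    """Map each analyzer to the reason it would take its fallback path."""
    reasons = {}
    if not capture_prompts:
        reasons["cache-recommend"] = "capture.prompts is off"
    if identical_sessions < MIN_IDENTICAL_SESSIONS:
        reasons["script"] = f"fewer than {MIN_IDENTICAL_SESSIONS} identical sessions"
    # find_spec returns None when a top-level module is not importable.
    if importlib.util.find_spec(bloat_module) is None:
        reasons["trim"] = "bloat extra not installed"
    return reasons
```

An empty result means the example data would reach the positive paths; a synthetic demo dataset would need to make this return `{}`.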

Honesty audit

Clean. cmd_optimize.py scanned — no quality-equivalence claims, "spend" only appears in the api pricing-mode branch, mandatory caveat preserved across all render modes.
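
The "spend" rule audited above lends itself to a mechanical check. The sketch below is an assumption about how one might test rendered headers; it is not taken from cmd_optimize.py, and the function name and the "subscription" mode label are hypothetical.

```python
import re

DOLLAR_FIGURE = re.compile(r"\$\s?\d")
SPEND_WORD = re.compile(r"\bspend\b", re.IGNORECASE)


def violates_spend_rule(header: str, pricing_mode: str) -> bool:
    """True if a non-api header pairs the word 'spend' with a dollar figure."""
    if pricing_mode == "api":
        return False  # "spend" is allowed in the api pricing-mode branch
    return bool(SPEND_WORD.search(header) and DOLLAR_FIGURE.search(header))
```

For example, `violates_spend_rule("Monthly spend: $42.10", "subscription")` is flagged, while the same header under `"api"` is not.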

Test plan

  • No code changes — automated test suite unaffected
  • All doc cross-links resolve (verified by inspection)
  • git rm cleanups don't break anything: docs/internal/specs/ survives via .gitkeep; no references to removed files left in CLAUDE.md or docs/
  • Manual: run through manual-pre-release-testing.md end-to-end on a fresh checkout of main after merge
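
The cross-link and removed-file checks above were done by inspection; a small script could automate them. This is a sketch under assumptions: it only handles inline `[text](target)` Markdown links, and `broken_links` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path

LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
URL_SCHEME = re.compile(r"[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def broken_links(root: Path) -> list:
    """Return (file, target) pairs for relative links that do not resolve."""
    broken = []
    for md in sorted(root.rglob("*.md")):
        for target in LINK.findall(md.read_text(encoding="utf-8")):
            if URL_SCHEME.match(target) or target.startswith("#"):
                continue  # skip URLs (http:, mailto:, ...) and in-page anchors
            path = target.split("#", 1)[0]  # drop any #fragment
            if not (md.parent / path).exists():
                broken.append((str(md.relative_to(root)), target))
    return broken
```

Calling `broken_links(Path("docs"))` on a checkout would list any link still pointing at a removed file such as the dropped v1.1 spec.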

🤖 Generated with Claude Code

anilmurty and others added 4 commits May 28, 2026 10:58
…rative

Cross-cutting docs pass after Waves 1-3 + Task 4 work landed in the
"Strategy pivot" commit. No code changes.

Backfill:
- Update docs/backfill/overview.md to list all four sources (claude-code,
  langfuse, helicone, otlp), add partnership-posture framing, drop stale
  "Wave 3 lands later" line.
- Add docs/backfill/helicone.md and docs/backfill/otlp.md matching the
  langfuse.md structure (modes, field mapping, idempotency, limitations).

Installation:
- Add docs/installation.md documenting the tokenjam[bloat] optional extra
  (~2GB torch + transformers) plus the full optional-extras matrix.

Architecture:
- Add "OTel semconv extensions: billing_account and plan_tier" section
  with the pricing_mode derivation rules and rationale for the session-
  level (not span-level) storage of plan_tier.

Optimize product docs (four-product naming consistency):
- Tighten cache.md / trim.md / script.md headings to lead with product
  name (Cache / Trim / Script) + CLI flag examples.
- Add docs/optimize/downsize.md to round out the set (was previously
  only documented inline in the analyzer).
- Add "See also" cross-links to all four product pages.

CHANGELOG narrative:
- Reorganise ## Unreleased as a coherent release story (headline, then
  product groupings: analyzers, plan-tier rendering, backfill, other
  surfaces, docs; then Changed / Internal). Facts unchanged.

Honesty audit: clean. cmd_optimize.py scanned — no quality-equivalence
claims; "spend" only appears in the api pricing-mode branch; mandatory
caveat preserved across all render modes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CLAUDE.md:
- Expand "Further Reading" from a single architecture.md pointer into a
  structured list covering the new docs landed in the Task-5 consistency
  sweep: installation.md (optional extras matrix incl. tokenjam[bloat]),
  configuration.md (content-capture privacy section), the four optimize
  product pages (downsize / cache / script / trim), backfill/overview.md,
  policy/overview.md, and docs/internal/specs/v1.1-honest-output.md.
- Cross-reference architecture.md's new "OTel semconv extensions" section
  from the core/models.py description in Key Modules so readers find the
  pricing_mode derivation rules from the entry point they're most likely
  to land on first.

Remove docs/internal/tasks/ — the May 26 sprint's per-task specs +
decisions-locked.md served their purpose during the sprint and are
captured in the merged PR (#66) history. Keeping the v1.1 honest-output
spec under docs/internal/specs/ because Wave 1 / the production code
still references it as the canonical source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…future use

The v1.1 honest-output spec served its purpose during the May 26 sprint
(Wave 1 / cmd_optimize.py implementation). The behavior is now captured
in the merged code, in docs/optimize/downsize.md, and in CHANGELOG.md.
Keeping the spec file would just create drift risk.

Preserve the docs/internal/specs/ folder (via .gitkeep) so future
features can drop a canonical spec there when one is warranted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The May-26 sprint added six new analyzer-level surfaces (Downsize / Cache×2
/ Script / Trim), three new backfill adapters (Langfuse / Helicone / OTLP),
a period-comparison flag, config export, plan-tier-aware rendering, and
the policy preview. None of that was covered in the existing playbooks.

manual-pre-release-testing.md:
- Restructured into numbered sections (Install / Auto tests / Onboard /
  Populate / CLI smoke / Analyzers / Plan-tier rendering / Backfill /
  Compare / Export / Policy / Server / UI / Cleanup).
- New "Cost-optimization analyzers" section with per-analyzer smoke
  blocks. Each covers prereqs, prereq-missing fallback (cache-recommend
  hint when capture.prompts off; Trim install hint when extra not
  installed), and caveat verification.
- New "Plan-tier-aware rendering" section exercising the v1.1 honest-
  output reframing — subscription headers must never contain "spend"
  against a dollar figure.
- New "Backfill adapters" section uses the committed test fixtures so
  the smoke is hermetic (no live API keys needed for langfuse/helicone/
  otlp). Verifies idempotency on re-run.
- New "Period comparison" section verifying ▲/▼ diff rendering and the
  --compare last-month / explicit-range branches.
- New "Config export" section verifying caveat comments are baked into
  the JSONC snippet and confirming the absence of --apply.
- New "Policy list" section.
- Trimmed the web-UI section from 5 numbered checks with bullet-level
  theme/sidebar/CSS verification down to a single spot-check block.
  Those are one-time UI checks, not per-release regression items.
- Trimmed redundant tj uninstall && rm -rf && onboard cycles between
  steps; kept one canonical clean-slate at the top.

manual-new-release-tests.md:
- Tightened into a true post-release smoke check. Verifies every analyzer
  runs (not deep behavior — that lived in pre-release), checks the
  caveat in JSON output, runs all three backfill adapters against the
  committed fixtures, and confirms the policy preview renders.
- Cut the duplicated "what to look for" table back to one summary at the
  bottom that maps cleanly to step numbers.
- Kept the Claude Code / Codex integration smoke (these user-facing
  flows have historically regressed silently).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anilmurty anilmurty changed the title from "docs: Task 5 consistency sweep — backfill, installation, semconv, nar…" to "docs: post-sprint sweep — analyzer docs, installation, semconv, release-testing playbooks" on May 28, 2026
@anilmurty anilmurty merged commit 7f5929b into main May 28, 2026
4 checks passed
@anilmurty anilmurty deleted the task-5-docs-sweep branch May 28, 2026 19:31