
feat: v2-aware explain with compact transcript support#864

Open
pfleidi wants to merge 48 commits into main from feat/checkpoints-v2-explain

Conversation

@pfleidi (Contributor) commented Apr 7, 2026

Summary

The explain command had no v2 checkpoint awareness — it only read from the v1 metadata branch. With checkpoints_v2 enabled, checkpoints are dual-written to both stores, and explain needs to read from v2 to take advantage of the compact transcript format and to remain functional as v2 becomes the primary store.

  • v2/v1 fallback resolution: explain --checkpoint resolves checkpoints from v2 first (when preferred), falling back to v1. Listing merges both stores so pre-v2 checkpoints remain discoverable by prefix during the transition period.
  • Compact transcript display: v2 stores a normalized transcript.jsonl alongside the raw transcript. explain now prefers this compact format for human-readable output (default/verbose modes), producing cleaner intent extraction and transcript formatting. --full retains raw semantics. The compact override is applied after --generate so the summarizer always receives the raw transcript it expects.
  • Dual-write summary generation: explain --generate writes summaries to both v1 (required) and v2 (best-effort). This required implementing V2GitStore.UpdateSummary.
  • Compact transcript parsing: Centralized into transcript/compact/parse.go so both formatTranscriptBytes and extractPromptsFromTranscript can fall back to compact format when raw parsing fails.
  • Prompt attribution consistency: PendingPromptAttribution (set during UserPromptSubmit but only moved to PromptAttributions during SaveStep) is now included in persisted prompt_attributions metadata, matching what calculateSessionAttributions actually uses. Without this, mid-turn commits (e.g., Codex) had inconsistent diagnostic metadata.
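The merged listing from the first bullet can be sketched roughly as follows. This is only an illustration of the "start with v2, add v1-only entries" idea: `CheckpointInfo` and `mergeCommitted` are hypothetical names, not the types or functions used in this PR.

```go
package main

import "fmt"

// CheckpointInfo is a hypothetical stand-in for one committed-checkpoint
// listing entry. Source exists only to make the example output readable.
type CheckpointInfo struct {
	ID     string
	Source string
}

// mergeCommitted keeps every v2 entry and appends v1 entries whose IDs were
// not seen yet, so pre-v2 checkpoints stay discoverable by prefix.
func mergeCommitted(v2, v1 []CheckpointInfo) []CheckpointInfo {
	seen := make(map[string]struct{}, len(v2))
	merged := make([]CheckpointInfo, 0, len(v2)+len(v1))
	for _, c := range v2 {
		seen[c.ID] = struct{}{}
		merged = append(merged, c)
	}
	for _, c := range v1 {
		if _, ok := seen[c.ID]; ok {
			continue // already listed from v2 or earlier in v1
		}
		seen[c.ID] = struct{}{}
		merged = append(merged, c)
	}
	return merged
}

func main() {
	v2 := []CheckpointInfo{{ID: "b0cb4216f6bc", Source: "v2"}}
	v1 := []CheckpointInfo{
		{ID: "b0cb4216f6bc", Source: "v1"}, // dual-written, v2 copy wins
		{ID: "3790cba265e6", Source: "v1"}, // pre-v2, only in v1
	}
	for _, c := range mergeCommitted(v2, v1) {
		fmt.Println(c.ID, c.Source)
	}
}
```

Keying on the checkpoint ID means a dual-written checkpoint appears once, with the v2 entry taking precedence.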

Test plan

  • mise run fmt && mise run lint && mise run test:ci pass
  • Unit tests for v2 explain resolution, compact transcript parsing, merged listing, summary dual-write, prompt attribution persistence
  • Integration tests for v2-enabled fallback to v1 and v2-preferred dual-write checkpoint explain
  • E2E canary (vogon + roger-roger) all green

Note

Medium Risk
Changes the explain --checkpoint data source selection and transcript rendering paths to prefer v2 with fallback to v1; mistakes could cause missing/incorrect output or regress summary persistence during the v2 migration.

Overview
Makes explain --checkpoint v2-aware by resolving committed checkpoints via a new CommittedReader abstraction and ResolveCommittedReaderForCheckpoint, preferring v2 when checkpoints_v2 is enabled but falling back to v1 on missing/corrupt v2 data (plus a similar resolver for raw transcript bytes).

Adds v2 read capabilities needed by explain: V2GitStore.ListCommitted, /main-only reads via ReadSessionMetadataAndPrompts and ReadSessionCompactTranscript, and v2 summary persistence via V2GitStore.UpdateSummary (with --generate persisting to v1 and best-effort to v2).

Improves default explain output for v2 by using the compact transcript.jsonl from /main (avoiding /full/* raw transcript fetch/rotation issues), and introduces a shared compact transcript parser (transcript/compact) as a fallback for intent extraction and transcript formatting. Also fixes persisted prompt attribution diagnostics to include PendingPromptAttribution for mid-turn commits.

Adds extensive unit + integration coverage for v2/v1 listing+resolution behavior, compact transcript parsing, and summary/attribution persistence.

Reviewed by Cursor Bugbot for commit 7119e1f.

peyton-alt and others added 30 commits March 30, 2026 18:45
Pre-session dirty files (CLI config files from `entire enable`, leftover
changes from previous sessions) were incorrectly counted as human
contributions, deflating agent percentage.

Root cause: PA1 (first prompt attribution) captures worktree state at
session start. This data was used to correct agent line counts (correct)
but also added to human contributions (wrong).

Fix:
- Split prompt attributions into baseline (PA1) and session (PA2+)
- PA1 data still subtracted from agent work (correct agent calc)
- PA1 contributions excluded from relevantAccumulatedUser
- PA1 removals excluded from totalUserRemoved
- Include PendingPromptAttribution during condensation for agents
  that skip SaveStep (e.g., Codex mid-turn commits)
- Add .entire/ filter to attribution calc (matches existing PA filter)
- Fix wrapcheck lint errors in updateCombinedAttributionForCheckpoint

Verified end-to-end: 100% agent with config files committed alongside.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: b0cb4216f6bc
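The baseline split described in this commit can be illustrated with a simplified sketch. The `PromptAttribution` fields and both helper functions are hypothetical; the real calculation also uses PA1 to correct agent line counts, which is not modeled here.

```go
package main

import "fmt"

// PromptAttribution is a hypothetical, simplified record of user line changes
// seen in the worktree when a prompt was submitted.
type PromptAttribution struct {
	UserAdded   int
	UserRemoved int
}

// splitBaseline separates the first attribution (PA1, the pre-session
// worktree state) from those captured during the session (PA2+).
func splitBaseline(pas []PromptAttribution) (baseline, session []PromptAttribution) {
	if len(pas) == 0 {
		return nil, nil
	}
	return pas[:1], pas[1:]
}

// humanTotals sums only the session attributions, so pre-session dirty files
// are not counted as human contributions.
func humanTotals(session []PromptAttribution) (added, removed int) {
	for _, pa := range session {
		added += pa.UserAdded
		removed += pa.UserRemoved
	}
	return added, removed
}

func main() {
	pas := []PromptAttribution{
		{UserAdded: 10, UserRemoved: 2}, // PA1: leftovers from before the session
		{UserAdded: 3, UserRemoved: 1},  // PA2
		{UserAdded: 4, UserRemoved: 0},  // PA3
	}
	_, session := splitBaseline(pas)
	added, removed := humanTotals(session)
	fmt.Println("human added:", added, "human removed:", removed)
}
```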
…ibution

Checkpoint package changes required by the attribution baseline fix:
- PromptAttributionsJSON field on WriteCommittedOptions and CommittedMetadata
- UpdateCheckpointSummary method on GitStore for multi-session aggregation
- CombinedAttribution field on CheckpointSummary
- Preserve existing CombinedAttribution during summary rewrites

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: b8963737336c
…arentCommitHash

Fixes all 4 issues from Copilot and Cursor Bugbot review:

1. Precompute parentCommitHash on postCommitActionHandler struct
   using ParentHashes[0] (avoids extra object read, no silent error)
2. Remove duplicated 6-line parentCommitHash computation from
   HandleCondense and HandleCondenseIfFilesTouched
3. Thread parentTree through condenseOpts/attributionOpts and use it
   for non-agent file line counting — ensures diffLines uses parent→HEAD
   (consistent with parentCommitHash file scoping) instead of
   sessionBase→HEAD which over-counted intermediate commit changes
4. Add ParentTreeForNonAgentLines test proving the fix (TDD verified:
   HumanAdded=8 without fix → HumanAdded=3 with fix)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 12f5c4373467
Three fixes for multi-session attribution:

1. Cross-session file exclusion: Thread allAgentFiles (union of all
   sessions' FilesTouched) through the attribution pipeline. Files
   created by other agent sessions are no longer counted as human work.

2. Exclude .entire/ from commit session fallback: When the commit
   session has no FilesTouched and falls back to all committed files,
   filter out .entire/ metadata created by `entire enable`.

3. PA1 baseline uses base tree for new sessions: New sessions
   (StepCount == 0) always diff against the base commit tree, not
   the shared shadow branch which may contain other sessions' state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 3790cba265e6
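A minimal sketch of the first two fixes, assuming file paths are plain strings: build the union of every session's touched files, then drop those files and anything under .entire/ before counting the rest as non-agent work. Both function names are hypothetical and the real pipeline threads this data through more layers.

```go
package main

import (
	"fmt"
	"strings"
)

// unionFilesTouched builds the set of files touched by any agent session.
func unionFilesTouched(sessions [][]string) map[string]struct{} {
	all := make(map[string]struct{})
	for _, files := range sessions {
		for _, f := range files {
			all[f] = struct{}{}
		}
	}
	return all
}

// nonAgentFiles returns committed files that no agent session touched,
// skipping .entire/ metadata paths.
func nonAgentFiles(committed []string, allAgentFiles map[string]struct{}) []string {
	var out []string
	for _, f := range committed {
		if strings.HasPrefix(f, ".entire/") {
			continue // metadata, never attributed to a human
		}
		if _, ok := allAgentFiles[f]; ok {
			continue // created or edited by some agent session
		}
		out = append(out, f)
	}
	return out
}

func main() {
	sessions := [][]string{{"a.go"}, {"b.go"}}
	committed := []string{"a.go", "b.go", "notes.md", ".entire/settings.json"}
	fmt.Println(nonAgentFiles(committed, unionFilesTouched(sessions)))
}
```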
Add a committed-reader resolver that probes v2 first and falls back to v1 when metadata is missing, matching existing resume/rewind behavior. Include focused tests to lock in selection semantics and clearer store naming.

Entire-Checkpoint: bbae6269bfe3
Add a checkpoint helper that resolves raw session logs with v2-first and v1 fallback semantics. Include focused tests for v2 hit, v2 miss fallback, and v2-disabled behavior.

Entire-Checkpoint: afdf6344056a
Route explain checkpoint reads through shared v2-first, v1-fallback helpers to align with resume/rewind behavior. Add v2-only explain tests and v2 ListCommitted support for checkpoint prefix resolution.

Entire-Checkpoint: 96f2eec40ea4
Use v2 transcript.jsonl for explain formatting and intent extraction while keeping --raw-transcript on raw full.jsonl with v2->v1 fallback. Add checkpoint test coverage for compact-transcript intent behavior.

Entire-Checkpoint: e92f790fc080
When checkpoints v2 is enabled, explain now merges v2 and v1 checkpoint lists so prefix resolution can still find v1-only checkpoints. Also prefer compact transcript.jsonl for explain intent/formatting while preserving raw transcript behavior.

Entire-Checkpoint: 84583d0326b0
This reverts commit 9d1f9d6.

Entire-Checkpoint: 6961155c2f6e
This reverts commit a218fd9.

Entire-Checkpoint: 351b67e79b90
This reverts commit 8c3c6c3.

Entire-Checkpoint: afbbb2fd44ff
Route explain checkpoint reads through shared v2-first, v1-fallback helpers to align with resume/rewind behavior. Add v2-only explain tests and v2 ListCommitted support for checkpoint prefix resolution.

Entire-Checkpoint: a20bfb7a992c
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tering

- Test AllAgentFiles cross-session exclusion in CalculateAttributionWithAccumulated
- Test committedFilesExcludingMetadata filters .entire/ paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The combined_attribution field now diffs parent→HEAD once and classifies
files as agent vs human based on the union of sessions with real
checkpoints (SaveStep ran). Filters .entire/ and .claude/ config paths.

Also adds ReadSessionMetadata for lightweight per-session metadata reads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mmit-inflation

Fix attribution inflation from intermediate commits
don't show multiple spaces for codex single line start message rendering
Move compact transcript read parsing into transcript/compact and reuse it from explain output formatting. Add direct v2 compact transcript read tests and compact parser tests while keeping raw transcript behavior unchanged.

Entire-Checkpoint: c512171cfbcb
Add integration tests that verify explain falls back to v1 for legacy checkpoints and prefers v2 data when dual-write checkpoints exist. Remove redundant v2-only path case and focus coverage on real supported modes.

Entire-Checkpoint: 82a312067746
marshalPromptAttributions was inserted between buildSessionMetrics'
doc comment and its declaration, causing Go to attach the wrong
doc comment to both functions.

Entire-Checkpoint: ad7ea548d575
pfleidi added 3 commits April 7, 2026 15:02
Copy state.PromptAttributions before appending PendingPromptAttribution
to avoid mutating the original slice when it has spare capacity.
Also fix stale test comment to reflect dual-write behavior.

Entire-Checkpoint: cddcccd67704
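The aliasing problem this commit avoids is general Go slice behavior: `append` writes into the existing backing array whenever there is spare capacity. The sketch below uses strings instead of the real attribution type, and `withPending` is a hypothetical helper name.

```go
package main

import "fmt"

// withPending returns attrs plus pending without writing into attrs'
// backing array, even when attrs has spare capacity.
func withPending(attrs []string, pending string) []string {
	out := make([]string, len(attrs), len(attrs)+1)
	copy(out, attrs)
	return append(out, pending)
}

func main() {
	// Naive append: the result shares the original's backing array.
	shared := make([]string, 2, 4)
	shared[0], shared[1] = "pa1", "pa2"
	naive := append(shared, "pending")
	fmt.Printf("naive: %d entries, backing slot 2 now holds %q\n", len(naive), shared[:3][2])

	// Copy first: the original's spare capacity is left untouched.
	safe := make([]string, 2, 4)
	safe[0], safe[1] = "pa1", "pa2"
	all := withPending(safe, "pending")
	fmt.Printf("copied: %d entries, backing slot 2 still holds %q\n", len(all), safe[:3][2])
}
```

Allocating with capacity `len(attrs)+1` lets the single append proceed without a second allocation.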
Summarization now always receives the raw transcript instead of the
compact format, which uses a different schema incompatible with
GenerateFromTranscript.

Entire-Checkpoint: 20e4ee566aef
Wrap errors from external packages, handle json.Unmarshal return
values, suppress ireturn for intentional interface return, and
suppress staticcheck for backward-compat deprecated field usage.

Entire-Checkpoint: d92141715d60
Copilot AI review requested due to automatic review settings April 7, 2026 23:31
@pfleidi pfleidi requested a review from a team as a code owner April 7, 2026 23:31

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Silent v1 error drops pre-v2 checkpoints from listing
    • Added a v1 error check in the v2-success merge path so v1 listing failures now surface instead of silently dropping pre-v2 checkpoints.

Preview (d94b37b032)
diff --git a/cmd/entire/cli/explain.go b/cmd/entire/cli/explain.go
--- a/cmd/entire/cli/explain.go
+++ b/cmd/entire/cli/explain.go
@@ -368,6 +368,9 @@
 		}
 		return v1Committed, nil
 	}
+	if v1Err != nil {
+		return nil, fmt.Errorf("listing checkpoints: %w", v1Err)
+	}
 
 	// Merge: start with v2, add v1-only entries so pre-v2 checkpoints
 	// remain visible during the transition period.

Copilot AI (Contributor) left a comment

Pull request overview

Updates the explain command to be checkpoint v2-aware, including v2→v1 fallback resolution and preferential use of v2’s compact transcript format to keep explain functional and readable during the v1→v2 transition.

Changes:

  • Added v2-first (with v1 fallback) checkpoint resolution and merged v1/v2 committed listing for prefix discovery.
  • Prefer v2 transcript.jsonl (compact) for human-readable output while keeping --full and --raw-transcript on raw semantics.
  • Added best-effort dual-write summary persistence (v1 required + v2 optional) and persisted pending prompt attribution diagnostics.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| cmd/entire/cli/transcript/compact/parse.go | Adds compact transcript parsing and condensed entry extraction. |
| cmd/entire/cli/transcript/compact/parse_test.go | Unit tests for compact transcript parsing/condensing. |
| cmd/entire/cli/strategy/manual_commit_test.go | Tests for persisting prompt attributions including pending attribution. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Minor parent commit detection simplification. |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Persists pending prompt attribution in diagnostic metadata. |
| cmd/entire/cli/integration_test/explain_test.go | Integration coverage for v2-enabled explain fallback/preference behavior. |
| cmd/entire/cli/explain.go | Core v2-aware explain resolution, compact transcript preference, dual-write summary update, compact parsing fallback. |
| cmd/entire/cli/explain_test.go | Unit tests for v2-only/dual-write explain behavior and merged listing. |
| cmd/entire/cli/checkpoint/v2_read.go | Adds v2 ListCommitted and ReadSessionCompactTranscript. |
| cmd/entire/cli/checkpoint/v2_read_test.go | Tests for reading compact transcripts and v2 summary update behavior. |
| cmd/entire/cli/checkpoint/v2_committed.go | Implements V2GitStore.UpdateSummary for v2 metadata updates. |
| cmd/entire/cli/checkpoint/committed_reader_resolve.go | Adds a shared v1/v2 committed reader resolver and raw log resolver. |
| cmd/entire/cli/checkpoint/committed_reader_resolve_test.go | Tests for committed reader and raw log resolution behavior. |

pfleidi added 2 commits April 7, 2026 16:41
The test validates that v2-only checkpoints fail at the v1 persistence
step, but it can't reach that code path when the claude CLI isn't
installed (e.g., CI). Skip gracefully instead of failing on the
unrelated summarizer error.

Entire-Checkpoint: c0fb2e7d2b82
When v2 listing succeeds but v1 fails, log the error and return
v2-only results instead of silently merging with a nil v1 slice.

Entire-Checkpoint: 1d4d5607ad1e
computermode previously approved these changes Apr 8, 2026
pfleidi added 2 commits April 7, 2026 17:37
Use v1Store directly for GetCheckpointAuthor so author info is
available for v2-resolved checkpoints during dual-write.

Entire-Checkpoint: c14b4c2515e7
Update doc comment to accurately describe the nil-summary and
error-based fallback conditions matching the v2 ReadCommitted contract.

Entire-Checkpoint: 48043fcee4e7
pfleidi added 5 commits April 8, 2026 11:24
…xplain

# Conflicts:
#	cmd/entire/cli/strategy/manual_commit_condensation.go
#	cmd/entire/cli/strategy/manual_commit_test.go
…eaderForCheckpoint

Entire-Checkpoint: 947ee58c91ad
gofmt strips nolint directives placed as standalone comment lines
above multi-line function signatures. Place it on the func line
itself so it survives formatting.

Entire-Checkpoint: 5a9d0bbf320e
Widen the v2→v1 fallback in ResolveCommittedReaderForCheckpoint and
ResolveRawSessionLogForCheckpoint to fall back to v1 for any v2 error
except context cancellation. Previously only ErrCheckpointNotFound and
ErrNoTranscript triggered fallback, so a corrupt v2 metadata.json would
block access to a valid v1 copy.

For default explain display modes (not --full/--generate/--raw-transcript),
read v2 checkpoint content exclusively from /main (metadata, prompts,
compact transcript). The raw transcript on /full/* is never needed for
human-readable output and may be unavailable due to rotation or fetch
state.

Add V2GitStore.ReadSessionMetadataAndPrompts to read metadata and prompts
from /main without requiring raw transcript from /full/* refs.

Entire-Checkpoint: 8e486d2d6c5c
@pfleidi (Contributor, Author) commented Apr 8, 2026

Bugbot run

pfleidi added 2 commits April 8, 2026 13:55
…s, restore ireturn nolint

Read compact transcript from the same session tree in
ReadSessionMetadataAndPrompts to avoid a duplicate tree walk.
Restore inline ireturn nolint directive on ResolveCommittedReaderForCheckpoint.

Entire-Checkpoint: 022555aeb816
Replace git.PlainInit with testutil.InitRepo + git.PlainOpen in the 8
test functions added on this branch. InitRepo configures git user and
disables GPG signing, which is needed for reliable CI execution.

Entire-Checkpoint: 79a063fd8968
@pfleidi (Contributor, Author) commented Apr 8, 2026

Bugbot run


@cursor cursor bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit 7119e1f.

computermode previously approved these changes Apr 8, 2026
@pfleidi pfleidi dismissed computermode’s stale review April 8, 2026 22:44

The merge-base changed after approval.

computermode previously approved these changes Apr 8, 2026
@pfleidi pfleidi dismissed computermode’s stale review April 8, 2026 22:50

The merge-base changed after approval.


6 participants