Optimize VCS diff loading to be up to 98% faster #2586

Open
justsomelegs wants to merge 16 commits into pingdotgg:main from justsomelegs:t3code/9f93372b

Conversation

justsomelegs (Contributor) commented May 8, 2026

What Changed

  • Moved checkpoint filesystem operations behind a generic VCS checkpoint capability instead of scripting Git directly in CheckpointStore.
  • Reworked VcsProcess.run and processRunner onto a shared capturedProcess primitive for one-shot collected process execution.
  • Optimized checkpoint diff generation by:
    • removing redundant checkpoint preflight checks on the diff path
    • resolving the VCS driver once per checkpoint operation
    • using explicit commit peeling (^{commit}) for checkpoint diff refs
  • Added a narrower full-thread diff lookup path and supporting tests.
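For readers unfamiliar with the idea of a one-shot collected process primitive, here is a minimal sketch of what such a runner could look like in Node. It is not the PR's capturedProcess implementation: the names `runCaptured`, `createCappedCollector`, and the option fields are hypothetical, and the output cap and SIGTERM-then-SIGKILL timeout are modeled on the summary note further down in this conversation. The Windows process-tree kill is left out.

```typescript
import { spawn } from "node:child_process";

// Hypothetical result shape; the PR's real types are not shown in this thread.
interface CapturedResult {
  stdout: string;
  stderr: string;
  exitCode: number | null;
  timedOut: boolean;
  truncated: boolean;
}

// Collects chunks up to maxBytes and remembers whether anything was dropped.
// The cut is made on a byte boundary, so a multi-byte UTF-8 character at the
// cut point may decode as a replacement character.
function createCappedCollector(maxBytes: number) {
  const chunks: Buffer[] = [];
  let size = 0;
  let truncated = false;
  return {
    push(chunk: Buffer): void {
      const room = maxBytes - size;
      if (chunk.length > room) truncated = true;
      if (room <= 0) return;
      const kept = chunk.subarray(0, room);
      chunks.push(kept);
      size += kept.length;
    },
    text: (): string => Buffer.concat(chunks).toString("utf8"),
    isTruncated: (): boolean => truncated,
  };
}

// Runs a command once, collects capped stdout/stderr, and escalates from
// SIGTERM to SIGKILL if the process outlives its timeout.
function runCaptured(
  command: string,
  args: string[],
  opts: { maxBytes: number; timeoutMs: number; killGraceMs: number },
): Promise<CapturedResult> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: ["ignore", "pipe", "pipe"] });
    const out = createCappedCollector(opts.maxBytes);
    const err = createCappedCollector(opts.maxBytes);
    let timedOut = false;
    let killTimer: ReturnType<typeof setTimeout> | undefined;

    const termTimer = setTimeout(() => {
      timedOut = true;
      child.kill("SIGTERM");
      killTimer = setTimeout(() => child.kill("SIGKILL"), opts.killGraceMs);
    }, opts.timeoutMs);

    const clearTimers = (): void => {
      clearTimeout(termTimer);
      clearTimeout(killTimer);
    };

    child.stdout.on("data", (chunk: Buffer) => out.push(chunk));
    child.stderr.on("data", (chunk: Buffer) => err.push(chunk));
    child.on("error", (error) => {
      clearTimers();
      reject(error);
    });
    child.on("close", (code) => {
      clearTimers();
      resolve({
        stdout: out.text(),
        stderr: err.text(),
        exitCode: code,
        timedOut,
        truncated: out.isTruncated() || err.isTruncated(),
      });
    });
  });
}
```

A caller would await something like `runCaptured("git", ["--version"], { maxBytes: 1_000_000, timeoutMs: 5_000, killGraceMs: 1_000 })` and check `truncated` and `timedOut` before trusting `stdout`.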

Why

Opening diffs and switching turns was spending too much time in process execution overhead and checkpoint diff orchestration.

This branch improves that by:

  • reducing per-command process overhead in the hot path
  • moving checkpoint behavior behind the VCS abstraction instead of hand-assembling Git commands in the store
  • reducing Git revision-resolution cost for checkpoint refs in repos with large checkpoint ref namespaces
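To make the commit-peeling point concrete: in Git's revision syntax, `<rev>^{commit}` dereferences the named object until a commit is reached. The sketch below shows one way the diff arguments could be assembled with peeled refs. The helper names and the example ref paths are hypothetical, and the PR does not show the exact command it runs, so treat this as an illustration of the syntax only.

```typescript
// Hypothetical helpers; only the ^{commit} suffix comes from the PR description.
function peelToCommit(ref: string): string {
  // Avoid double-peeling if the caller already appended the suffix.
  return ref.endsWith("^{commit}") ? ref : `${ref}^{commit}`;
}

// Builds the argument list for a `git diff` between two checkpoint refs.
function buildCheckpointDiffArgs(fromRef: string, toRef: string): string[] {
  return ["diff", "--no-color", peelToCommit(fromRef), peelToCommit(toRef)];
}

// Example ref names are made up for illustration.
const args = buildCheckpointDiffArgs(
  "refs/example/checkpoints/thread-1/turn/0",
  "refs/example/checkpoints/thread-1/turn/3",
);
console.log(args.join(" "));
```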

Benchmark Summary

Synthetic workload used for all numbers below:

  • 24 changed files
  • 690,422 patch bytes
  • 26,137 patch lines

Benchmarks were run sequentially to avoid contention between benchmark loops.

Mean Latency

| Metric | upstream/main | Current branch |
| --- | --- | --- |
| checkpointStore.diffCheckpoints | 1179.81ms | 72.84ms |
| checkpointDiffQuery.getTurnDiff | 1649.32ms | 52.05ms |
| parseTurnDiffFilesFromUnifiedDiff | 12.42ms | 11.97ms |

Tail Latency (p99)

| Metric | upstream/main | Current branch |
| --- | --- | --- |
| checkpointStore.diffCheckpoints | 2262.90ms | 355.04ms |
| checkpointDiffQuery.getTurnDiff | 2937.21ms | 191.04ms |
| parseTurnDiffFilesFromUnifiedDiff | 16.43ms | 19.63ms |

Key Measurements

Compared directly with upstream/main:

  • checkpointStore.diffCheckpoints: 1179.81ms -> 72.84ms mean (16.20x faster, 93.8% lower)
  • checkpointDiffQuery.getTurnDiff: 1649.32ms -> 52.05ms mean (31.69x faster, 96.8% lower)
  • checkpointStore.diffCheckpoints: 2262.90ms -> 355.04ms p99 (6.37x faster, 84.3% lower)
  • checkpointDiffQuery.getTurnDiff: 2937.21ms -> 191.04ms p99 (15.37x faster, 93.5% lower)
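For anyone checking the arithmetic, the "x faster" figure is the ratio before / after, and the "% lower" figure is (1 - after / before) * 100. A small sketch that reproduces the mean-latency numbers above (the function name `compareLatency` is just for illustration):

```typescript
// Derives the two summary figures used in the list above from raw latencies.
function compareLatency(beforeMs: number, afterMs: number): { speedup: string; percentLower: string } {
  const speedup = beforeMs / afterMs;
  const percentLower = (1 - afterMs / beforeMs) * 100;
  return {
    speedup: `${speedup.toFixed(2)}x`,
    percentLower: `${percentLower.toFixed(1)}%`,
  };
}

// checkpointStore.diffCheckpoints mean: speedup "16.20x", percentLower "93.8%"
console.log(compareLatency(1179.81, 72.84));
// checkpointDiffQuery.getTurnDiff mean: speedup "31.69x", percentLower "96.8%"
console.log(compareLatency(1649.32, 52.05));
```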

Before

before.vcs.optimisation.mp4

After

vsc.after.optimisations.mp4

Checklist

  • This PR is small and focused
  • I explained what changed and why
  • I included before/after screenshots for any UI changes
  • I included a video for animation/interaction changes

Note

Speed up VCS diff loading by computing diffs from canonical turn 0 checkpoints

  • Refactors CheckpointStore to delegate all checkpoint operations (capture, restore, diff, delete) to the active VCS driver via a new checkpoints interface, removing hardcoded git invocations from the store layer.
  • Adds checkpoint support to GitVcsDriver using git plumbing commands, including diffCheckpoints with a configurable max output byte limit (CHECKPOINT_DIFF_MAX_OUTPUT_BYTES).
  • getFullThreadDiff in CheckpointDiffQuery now uses a dedicated full-thread context lookup (getFullThreadDiffContext) that diffs from turn 0 to the requested turn rather than chaining per-turn diffs, which is the source of the claimed speedup.
  • Adds runCapturedProcess / runCapturedProcessPromise as a new process execution layer with size-limited stdout/stderr, staged timeout termination (SIGTERM then SIGKILL), and Windows process-tree kill via taskkill; runProcess is reimplemented on top of it.
  • Risk: diffCheckpoints now enforces a hard output byte cap; diffs exceeding it are truncated or error depending on configuration, which is a behavioral change for large diffs.
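The difference between chaining per-turn diffs and diffing once from turn 0 can be sketched as follows. This is a reading of the summary above, not code from the PR: the ref naming scheme, `checkpointRef`, `chainedRanges`, and `fullThreadRange` are all hypothetical. The point it illustrates is only the number of diff invocations needed for a full-thread view.

```typescript
// Hypothetical ref naming scheme, for illustration only.
function checkpointRef(threadId: string, turn: number): string {
  return `refs/example/checkpoints/${threadId}/turn/${turn}`;
}

interface DiffRange {
  fromRef: string;
  toRef: string;
}

// Chained approach: one diff per turn, so turn N needs N diff invocations.
function chainedRanges(threadId: string, toTurn: number): DiffRange[] {
  const ranges: DiffRange[] = [];
  for (let turn = 1; turn <= toTurn; turn++) {
    ranges.push({
      fromRef: checkpointRef(threadId, turn - 1),
      toRef: checkpointRef(threadId, turn),
    });
  }
  return ranges;
}

// Full-thread approach: a single diff from the turn 0 checkpoint to the
// requested turn, regardless of how many turns lie in between.
function fullThreadRange(threadId: string, toTurn: number): DiffRange {
  return {
    fromRef: checkpointRef(threadId, 0),
    toRef: checkpointRef(threadId, toTurn),
  };
}

console.log(chainedRanges("thread-1", 5).length); // 5 diff invocations
console.log(fullThreadRange("thread-1", 5)); // 1 diff invocation
```

With this shape, the cost of a full-thread diff stops growing with the number of turns in terms of process launches, although the size of the resulting patch still depends on how much changed.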

Macroscope summarized 2fb898b.

coderabbitai Bot commented May 8, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d8319b2c-e217-4e9c-8320-0eab2bc933aa

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

github-actions Bot added the labels vouch:trusted (PR author is trusted by repo permissions or the VOUCHED list) and size:XXL (1,000+ changed lines, additions + deletions) on May 8, 2026

macroscopeapp Bot commented May 8, 2026

Approvability

Verdict: Needs human review

This PR introduces a significant refactor that restructures checkpoint operations, adds a new process execution abstraction, and creates new query paths - changes that go beyond simple optimization and warrant human review. An unresolved comment also raises architectural concerns about the approach.

Comment thread apps/server/src/capturedProcess.ts
Comment thread apps/server/src/stream/collectUint8StreamText.ts Outdated
};
}

export async function runProcess(

Member

Can we get rid of this function and use runCapturedProcess in all consumers? Or at least remove the async wrapper and make it effectful e2e?

Contributor Author

Yeah, sorry, got busy. I was in the middle of doing that so that we use the Effect e2e process instead of the one I came up with. I'll get it done now and remove the async wrappers too.

3 participants