Skip to content

feat(diff): investigate row windowing for large single-file reviews#237

Open
benvinegar wants to merge 7 commits intomainfrom
bentlegen/row-windowing-poc
Open

feat(diff): investigate row windowing for large single-file reviews#237
benvinegar wants to merge 7 commits intomainfrom
bentlegen/row-windowing-poc

Conversation

@benvinegar
Copy link
Copy Markdown
Member

@benvinegar benvinegar commented May 7, 2026

Summary

  • enable row windowing by default for mounted diff sections
  • slice planned file rows down to the visible file-body range while preserving total body height with spacer rows
  • add component and PTY coverage for distant-hunk navigation inside large single-file diffs

Why

Large single-file diffs were still expensive even with file-level windowing, because mounting one visible file could still mount thousands of diff rows. This change virtualizes rows within a mounted file section so Hunk keeps the same review-stream geometry while rendering only the visible row slice for that file body.

What changed

  • DiffPane now translates the global review viewport into file-local body bounds for mounted files
  • PierreDiffView now renders only the planned rows that overlap that visible body slice
  • rowWindowing.ts owns the row-slice math and spacer-height preservation
  • tests now cover distant-hunk visibility and PTY navigation across far-apart hunks in one large file

Measurements

Using a synthetic large single-file diff with many distant hunks:

  • row windowing off
    • steady-state ] avg: 39.7ms
    • steady-state ] p95: 50.1ms
  • row windowing on
    • steady-state ] avg: 19.5ms
    • steady-state ] p95: 23.2ms

Using the isolated large-file renderer earlier in investigation:

  • full planned rows: 4793
  • mounted row slice near the viewport: about 140

Notes

  • This PR now enables row windowing by default instead of keeping it behind an env flag.
  • I tried keeping adjacent hunks mounted too, but the measured benefit was modest for keyboard hunk jumps and essentially neutral for wheel scrolling, so that extra complexity was removed.
  • The remaining mouse-wheel “chug” appears more related to selection/reveal behavior than raw row mounting.

Validation

  • bun run typecheck
  • bun test
  • focused PTY and component coverage for distant-hunk navigation and visibility

This PR description was generated by Pi using OpenAI GPT-5

@benvinegar benvinegar marked this pull request as ready for review May 8, 2026 02:46
@benvinegar benvinegar changed the title Improve large single-file diff performance feat(diff): investigate row windowing for large single-file reviews May 8, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 8, 2026

Greptile Summary

This PR introduces two performance improvements for large single-file diffs: a highlight policy that routes generated lockfiles to a plain-text or no-op highlight path, and an env-gated row-windowing proof-of-concept that slices the mounted diff-row tree down to a visible body window.

  • highlightPolicy.ts: New module classifies generated dependency manifests by basename and applies a line-count threshold to choose between \"full\", \"text\", or \"none\" highlight modes; integrated into pierre.ts, useHighlightedDiff.ts, and the prefetch path.
  • rowWindowing.ts: resolveVisiblePlannedRowWindow slices plannedRows to the range that intersects a VisibleBodyBounds window, inserting top/bottom spacers to preserve total body height; gated behind HUNK_ROW_WINDOWING_POC=1.
  • DiffPane.tsx / DiffSection.tsx / PierreDiffView.tsx: Wire the computed visibleBodyBounds per file through to the render layer, with a safe fallback to full row rendering when geometry or bounds are absent.

Confidence Score: 4/5

The lockfile highlight change and the env-gated row-windowing POC both degrade gracefully to full rendering when geometry or bounds are absent, making this safe to merge.

The two feature tracks are independent and both have targeted tests. The geometry/row-count alignment check in resolveVisiblePlannedRowWindow provides a safe fallback against stale data. All findings are naming clarity, a redundant condition, and an undocumented per-side vs. total-line footprint trade-off.

The overscanRows constant in DiffPane.tsx and the diffLineFootprint threshold logic in highlightPolicy.ts are the two spots most likely to need adjustment as the POC matures.

Important Files Changed

Filename Overview
src/ui/diff/rowWindowing.ts New helper that slices planned rows to a visible body window; logic is correct and well-tested, with a minor asymmetry in how offscreen height is distributed between the two spacers.
src/ui/diff/highlightPolicy.ts New policy module correctly routes lockfiles to cheap highlight paths; diffLineFootprint uses max over sides rather than sum, which keeps near-threshold symmetric diffs on the "text" path — worth documenting.
src/ui/components/panes/DiffPane.tsx Adds visibleBodyBoundsByFile memo that correctly derives per-file visible windows from scroll viewport and section geometry; overscanRows naming is slightly misleading as it is a body-coordinate offset rather than a logical row count.
src/ui/diff/PierreDiffView.tsx Wires visiblePlannedRowWindow into render; correctly falls back to full row list when geometry or bounds are absent, and sandwiches the windowed slice between spacer boxes.
src/ui/diff/useHighlightedDiff.ts Correctly short-circuits loading and caching for "none" mode files; highlightMode === "none" guard in the effect is redundant since appearanceCacheKey is already null in that case.
src/ui/components/panes/DiffSection.tsx Pass-through props for sectionGeometry and visibleBodyBounds added correctly, including updated memo comparator.
src/ui/diff/pierre.ts Integrates resolveDiffHighlightMode into loadHighlightedDiff; early return for "none" mode and metadata override for "text" mode are both correct.
src/ui/diff/rowWindowing.test.ts Tests cover the three key slicing scenarios: partial overlap, zero-height row attachment, and fully offscreen collapse; all assertions trace correctly.
src/ui/diff/highlightPolicy.test.ts Focused tests validate "full", "text", and "none" highlight modes; correct path-prefix agnostic behavior is implicitly covered by using bare lockfile filenames.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[DiffPane
scrollViewport + sectionGeometry] -->|computes per-file
visibleBodyBounds| B[visibleBodyBoundsByFile Map]
    B --> C[DiffSection
visibleBodyBounds prop]
    C --> D[PierreDiffView
visibleBodyBounds + sectionGeometry]
    D -->|rowWindowingPocEnabled?| E{POC enabled?}
    E -->|No / missing bounds| F[Render all plannedRows
topSpacer=0, bottomSpacer=0]
    E -->|Yes| G[resolveVisiblePlannedRowWindow]
    G --> H[Slice plannedRows to
visible index range]
    H --> I[topSpacer box + sliced rows + bottomSpacer box]
    J[DiffFile] --> K[resolveDiffHighlightMode]
    K -->|full| L[prepareHighlighter language]
    K -->|text| M[prepareHighlighter text]
    K -->|none| N[Return empty highlighted arrays]
    L --> O[renderDiffWithHighlighter]
    M --> O
Loading

Comments Outside Diff (2)

  1. src/ui/diff/useHighlightedDiff.ts, line 190 (link)

    P2 The highlightMode === "none" condition is redundant here. When highlightMode is "none", appearanceCacheKey is already null (see line 184–185), so !appearanceCacheKey is already true. The check adds noise and could confuse a reader into thinking appearanceCacheKey can be non-null while highlightMode is "none".

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/ui/diff/useHighlightedDiff.ts
    Line: 190
    
    Comment:
    The `highlightMode === "none"` condition is redundant here. When `highlightMode` is `"none"`, `appearanceCacheKey` is already `null` (see line 184–185), so `!appearanceCacheKey` is already true. The check adds noise and could confuse a reader into thinking `appearanceCacheKey` can be non-null while `highlightMode` is `"none"`.
    
    
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. src/ui/diff/highlightPolicy.ts, line 26-28 (link)

    P2 diffLineFootprint uses Math.max — a large symmetric diff stays on the "text" path

    Using the larger of the two sides means a lockfile with 2 499 deletions and 2 499 additions (total ~5 000 changed lines) scores a footprint of 2 499 and stays on the "text" highlight path rather than "none". Whether per-side render cost (the highlighter processes each array separately) or total diff size is the right budget unit is worth a comment here, since the choice affects how aggressively the fast path kicks in.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/ui/diff/highlightPolicy.ts
    Line: 26-28
    
    Comment:
    **`diffLineFootprint` uses `Math.max` — a large symmetric diff stays on the "text" path**
    
    Using the larger of the two sides means a lockfile with 2 499 deletions and 2 499 additions (total ~5 000 changed lines) scores a footprint of 2 499 and stays on the `"text"` highlight path rather than `"none"`. Whether per-side render cost (the highlighter processes each array separately) or total diff size is the right budget unit is worth a comment here, since the choice affects how aggressively the fast path kicks in.
    
    How can I resolve this? If you propose a fix, please make it concise.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
src/ui/components/panes/DiffPane.tsx:599
**Misleading variable name: `overscanRows` is a coordinate offset, not a row count**

The name `overscanRows` suggests it controls how many logical diff rows are kept in the render window, but it's actually a body-coordinate offset added to `sectionLayout.bodyTop`-relative values. Since each terminal row is 1 unit in this coordinate space the semantics aren't broken, but a future contributor tuning overscan might adjust this thinking they're working in logical diff-row units (where a wrapped line can be multiple terminal rows). A name like `overscanPx` or `overscanTerminalRows` would avoid the confusion.

### Issue 2 of 4
src/ui/diff/useHighlightedDiff.ts:190
The `highlightMode === "none"` condition is redundant here. When `highlightMode` is `"none"`, `appearanceCacheKey` is already `null` (see line 184–185), so `!appearanceCacheKey` is already true. The check adds noise and could confuse a reader into thinking `appearanceCacheKey` can be non-null while `highlightMode` is `"none"`.

```suggestion
    if (!file || !appearanceCacheKey) {
```

### Issue 3 of 4
src/ui/diff/highlightPolicy.ts:26-28
**`diffLineFootprint` uses `Math.max` — a large symmetric diff stays on the "text" path**

Using the larger of the two sides means a lockfile with 2 499 deletions and 2 499 additions (total ~5 000 changed lines) scores a footprint of 2 499 and stays on the `"text"` highlight path rather than `"none"`. Whether per-side render cost (the highlighter processes each array separately) or total diff size is the right budget unit is worth a comment here, since the choice affects how aggressively the fast path kicks in.

### Issue 4 of 4
src/ui/diff/rowWindowing.ts:71-76
**Offscreen-file spacer layout always assigns all height to `bottomSpacerHeight`**

When no row intersects the visible range (e.g. the file body is entirely scrolled past or not yet reached), the function returns `topSpacerHeight: 0, bottomSpacerHeight: bodyHeight`. Total height is preserved correctly, but the distribution is asymmetric: the rendered body is `[spacer=0][rows=none][spacer=bodyHeight]`. For a file entirely *above* the viewport it would be more semantically accurate to put the full height in `topSpacerHeight`. This is not a correctness bug today, but it could become one if downstream code assumes `topSpacerHeight` represents the distance from the body top to the first rendered row.

Reviews (1): Last reviewed commit: "refactor(diff): focus the perf branch on..." | Re-trigger Greptile

return next;
}

const overscanRows = Math.max(24, scrollViewport.height * 2);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Misleading variable name: overscanRows is a coordinate offset, not a row count

The name overscanRows suggests it controls how many logical diff rows are kept in the render window, but it's actually a body-coordinate offset added to sectionLayout.bodyTop-relative values. Since each terminal row is 1 unit in this coordinate space the semantics aren't broken, but a future contributor tuning overscan might adjust this thinking they're working in logical diff-row units (where a wrapped line can be multiple terminal rows). A name like overscanPx or overscanTerminalRows would avoid the confusion.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ui/components/panes/DiffPane.tsx
Line: 599

Comment:
**Misleading variable name: `overscanRows` is a coordinate offset, not a row count**

The name `overscanRows` suggests it controls how many logical diff rows are kept in the render window, but it's actually a body-coordinate offset added to `sectionLayout.bodyTop`-relative values. Since each terminal row is 1 unit in this coordinate space the semantics aren't broken, but a future contributor tuning overscan might adjust this thinking they're working in logical diff-row units (where a wrapped line can be multiple terminal rows). A name like `overscanPx` or `overscanTerminalRows` would avoid the confusion.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +71 to +76
firstVisibleIndex = index;
}
lastVisibleIndex = index;
}

if (firstVisibleIndex < 0 || lastVisibleIndex < 0) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Offscreen-file spacer layout always assigns all height to bottomSpacerHeight

When no row intersects the visible range (e.g. the file body is entirely scrolled past or not yet reached), the function returns topSpacerHeight: 0, bottomSpacerHeight: bodyHeight. Total height is preserved correctly, but the distribution is asymmetric: the rendered body is [spacer=0][rows=none][spacer=bodyHeight]. For a file entirely above the viewport it would be more semantically accurate to put the full height in topSpacerHeight. This is not a correctness bug today, but it could become one if downstream code assumes topSpacerHeight represents the distance from the body top to the first rendered row.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ui/diff/rowWindowing.ts
Line: 71-76

Comment:
**Offscreen-file spacer layout always assigns all height to `bottomSpacerHeight`**

When no row intersects the visible range (e.g. the file body is entirely scrolled past or not yet reached), the function returns `topSpacerHeight: 0, bottomSpacerHeight: bodyHeight`. Total height is preserved correctly, but the distribution is asymmetric: the rendered body is `[spacer=0][rows=none][spacer=bodyHeight]`. For a file entirely *above* the viewport it would be more semantically accurate to put the full height in `topSpacerHeight`. This is not a correctness bug today, but it could become one if downstream code assumes `topSpacerHeight` represents the distance from the body top to the first rendered row.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant