Skip to content

fix: Fix graph path lookup for review and impact tools#469

Open
hy2850 wants to merge 1 commit into
tirth8205:mainfrom
hy2850:feature/graph-lookup-path-resolution
Open

fix: Fix graph path lookup for review and impact tools#469
hy2850 wants to merge 1 commit into
tirth8205:mainfrom
hy2850:feature/graph-lookup-path-resolution

Conversation

@hy2850
Copy link
Copy Markdown

@hy2850 hy2850 commented May 11, 2026

Summary

This PR fixes graph lookups for review and impact tools when the file paths passed by the user do not exactly match the file paths stored in the graph database.

The issue showed up while comparing git diff <sha>^ <sha> with get_review_context(changed_files=...) for eval test commits. Even when the graph DB was populated, get_review_context often returned:

0 changed nodes / 0 impacted nodes / 0 edges

That made the graph payload look empty and caused review context to fall back mostly to source snippets.

Root Cause

get_review_context and related tools converted changed files into absolute paths before graph lookup:

abs_files = [str(root / f) for f in changed_files]
impact = store.get_impact_radius(abs_files, max_depth=max_depth)

However, existing graph DBs may store file paths in different formats depending on how they were built, for example:

  • repo-relative paths
  • absolute paths
  • cwd-relative paths such as evaluate/test_repos/fastapi/fastapi/routing.py

Because get_nodes_by_file() requires an exact file path match, a valid changed file like:

fastapi/routing.py

could fail to match both:

/code-review-graph/evaluate/test_repos/fastapi/fastapi/routing.py

and the stored graph path:

evaluate/test_repos/fastapi/fastapi/routing.py

So the graph store had nodes, but the lookup path format prevented them from being found.

Fix

This PR adds path resolution before graph lookup.

The new resolver maps user-facing changed file paths to the actual path format stored in the graph by checking:

  • the raw input path
  • absolute path variants
  • repo-relative variants
  • graph paths matched by suffix

The resolver is then used by:

  • get_review_context
  • get_impact_radius
  • query_graph(..., pattern="file_summary")

This makes the tools robust to graph DBs built with different path styles.

Evidence

Before this fix, graph lookup returned zero graph payload for many eval commits:

repo SHA before after path resolution
express f41d09a3 0/0/0 209/473/510
express cec5780d 0/0/0 200/472/425
fastapi 22381558 0/0/0 162/500/494
fastapi 749cefde 0/0/0 161/500/527
flask c2705ffd 0/0/0 386/403/1106
flask 0ec7f713 0/0/0 195/388/741
gin 0a192fb0 0/0/0 877/252/1964
gin ac0ad2fe 0/0/0 577/255/1415
httpx 8e36f2bc 0/0/0 484/482/1284
httpx ee37a762 0/0/0 119/500/701
nextjs d81d5ab7 0/0/0 1192/500/3067
nextjs d86e1977 0/0/0 1154/500/2469

Format: changed nodes / impacted nodes / edges.

(One commit, gin 0feaf8cb, still resolves to 0/0/0 because it mostly
removes or splits example files that are not represented as current graph nodes)

Why This Matters

Without this fix, get_review_context can incorrectly report that a change has no graph impact even when the graph contains matching nodes.

This affects:

  • review context quality
  • impact radius results
  • file summary queries
  • token efficiency analysis that compares graph context with raw diffs

After this fix, graph-based review tools can use the existing graph data instead of silently falling back to empty graph payloads.

Test Plan

  • Added regression coverage for graph DBs that store cwd-relative file paths.
  • Verified get_review_context resolves repo-relative changed files to stored
    graph paths.
  • Verified get_impact_radius resolves changed file paths before graph lookup.
  • Verified query_graph(..., pattern="file_summary") resolves file paths
    consistently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant