Skip to content

feat(mt#1343): reviewThreads field in session_pr_review_context#896

Merged
edobry merged 5 commits into
mainfrom
task/mt-1343
May 1, 2026
Merged

feat(mt#1343): reviewThreads field in session_pr_review_context#896
edobry merged 5 commits into
mainfrom
task/mt-1343

Conversation

@minsky-ai
Copy link
Copy Markdown
Contributor

@minsky-ai minsky-ai Bot commented Apr 30, 2026

Summary

Adds a reviewThreads field to SessionPrReviewContextResult so the reviewer can see existing thread state on round N+1 instead of going in blind and duplicating findings. Implements mt#1343 in the mt#1335 in-band-review-semantics cluster. Foundation for mt#1345 (reply + auto-resolve loop) and a precondition for any iterative review to converge.

Motivation & Context

ADR-008 (PR #838) called out that without thread state in the reviewer's context, every review round restarts from scratch and duplicates findings — the dominant cause of multi-round burn observed in mt#1110_calibration (PR #763: 7 rounds, 3 distinct legit findings). REST get_review_comments returns flat comments without isResolved/isOutdated; the GraphQL pullRequest.reviewThreads query is the only way to get the structured thread state.

Design / Approach

  • GraphQL fetch with pagination. pullRequest.reviewThreads(first: 50, after: $cursor) paged through until hasNextPage is false. Capped at 200 threads with truncated: true signaled in the result.
  • Parallel composition. The new fetch runs alongside the existing pr / checks / diff / parsedDiff / taskSpec parallel fetches in pr-review-context-subcommand.ts.
  • Non-fatal failure. getPRReviewThreads never throws on network/auth/GraphQL failures — returns { threads: [], truncated: false } and logs at debug level. Token acquisition failures degrade the same way.
  • Identity flow. Routes through tokenProvider.getServiceToken(owner/repo) — matching the scoped pattern in asks-github-client.ts. FallbackTokenProvider ignores the scope and returns the user PAT directly; GitHubAppTokenProvider issues a per-repo installation token.
  • DI test seam. getPRReviewThreads accepts an optional octokitOverride parameter — same pattern landed in mt#1346 (PR feat(mt#1346): GitHub Check Run annotations as parallel review surface #889). The project's no-global-module-mocks rule forbids mock.module(), so DI is canonical.

Key Changes

  • src/domain/repository/github-pr-operations.ts — new types (ReviewThread, ReviewThreadComment, ReviewThreadsResult), GraphQL query constant, and getPRReviewThreads(gh, prNumber, octokitOverride?) exported function. Error handling changed to graceful degradation: token and GraphQL failures return empty result instead of rethrowing.
  • src/domain/session/commands/pr-review-context-subcommand.tsSessionPrReviewContextResult gains reviewThreads: ReviewThread[] and reviewThreadsTruncated: boolean. fetchReviewThreads uses tokenProvider.getServiceToken(owner/repo) scoped to the repository. Wired into the parallel-fetch Promise.all.
  • src/domain/repository/github-pr-operations-review-threads.test.ts — 12 unit tests.

Testing

12 unit tests covering all spec-required cases:

  • Single-page mapping (id, path, line, isResolved).
  • Comment field mapping (author, body, createdAt; null author).
  • Outdated thread (line null, isOutdated true).
  • Multi-line range (startLine populated).
  • Single-line absent-startLine (field omitted from result).
  • Pagination (cursor passed correctly across multiple pages).
  • Truncation at 200 (truncated true, capped count, 5 graphql calls).
  • Empty PR (empty array, not undefined).
  • Shape conformance to exported ReviewThread interface.
  • truncatedComments true/false boundary (totalCount > 10 vs <= 10).
  • Recoverable GraphQL error returns empty result instead of throwing.
  • Token acquisition failure returns empty result (graceful degradation).

Tests inject a mock Octokit via the octokitOverride parameter — no module-level mocking.

Out of scope

  • Reviewer prompt update to consume reviewThreads (mt#1345).
  • Resolve / unresolve mutations (mt#1342, separate task).

Test plan

  • 12/12 unit tests pass (bun test src/domain/repository/github-pr-operations-review-threads.test.ts).
  • Lint clean.
  • Typecheck clean.
  • Reviewer to verify ADR-008 alignment + GraphQL query shape against GitHub API docs.

(Had Claude implement this — turn-limit cutoff before tests landed; main-agent recovery added the test suite using the DI seam pattern from mt#1346.)

edobry added 2 commits April 30, 2026 03:10
Adds a reviewThreads field to SessionPrReviewContextResult so the
reviewer can see existing thread state on round N+1 instead of going
in blind and duplicating findings. Foundation for mt#1345 (reply +
auto-resolve loop) and a precondition for any iterative review to
converge.

Per spec:
- New field reviewThreads: ReviewThread[] on SessionPrReviewContextResult.
- Each ReviewThread carries id, path, line, optional startLine,
  isResolved, isOutdated, isCollapsed, and a comments array with
  author, body, createdAt.
- New top-level reviewThreadsTruncated boolean signals when more than
  200 threads exist (the cap; fetched in pages of 50 via GraphQL).
- Outdated threads (anchor line removed from diff) come back with
  line: null and isOutdated: true. Reviewer must filter these.
- Identity routes through gh.getToken() like other PR operations;
  uses octokit.graphql for the call.

The fetch runs in parallel with the existing pr/checks/diff/taskSpec
fetches in pr-review-context-subcommand.ts. Failure is non-fatal: a
broken review-threads fetch returns an empty result rather than
blocking the rest of the context payload.

Files added:
- src/domain/repository/github-pr-review-threads.test.ts (9 tests)

Files modified:
- src/domain/repository/github-pr-operations.ts (new types and
  getPRReviewThreads forge function with octokitOverride DI seam)
- src/domain/session/commands/pr-review-context-subcommand.ts
  (SessionPrReviewContextResult extension, parallel fetch, fetchReviewThreads
  helper that reads the GitHub config to construct gh)

Tests cover:
- Single-page mapping (id, path, line, isResolved).
- Comment field mapping (author, body, createdAt; null author).
- Outdated thread (line null, isOutdated true).
- Multi-line range (startLine populated).
- Single-line absent-startLine (field omitted from result).
- Pagination (cursor passed correctly across multiple pages).
- Truncation at 200 (truncated true, capped count).
- Empty PR (empty array, not undefined).
- Shape conformance to exported ReviewThread interface.

Tests inject a mock Octokit via the octokitOverride parameter on
getPRReviewThreads rather than mock.module() — DI seam pattern landed
in mt#1346 / PR #889.

Out of scope:
- Reviewer prompt update to consume reviewThreads (mt#1345).
- Resolve / unresolve mutations (mt#1342, separate task).
@minsky-ai minsky-ai Bot added the authorship/co-authored Co-authored by human and AI agent label Apr 30, 2026
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


The reviewer ran but produced no findings. This is not an approval — the model emitted no submit_finding, submit_inline_comment, or conclude_review calls.

mt#1342 (PR #898) creates src/domain/repository/github-pr-review-threads.test.ts
for resolveReviewThread / unresolveReviewThread mutation tests. This task
created a file with the same name for getPRReviewThreads fetch tests, which
would conflict on merge.

Renamed to github-pr-operations-review-threads.test.ts to better reflect
that the function under test (getPRReviewThreads) lives in
github-pr-operations.ts, and to avoid the filename collision.

All 9 tests still pass under the new name.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


⚠️ Reviewer did not emit a conclude_review call. Event derived from severity counts: REQUEST_CHANGES (4 BLOCKING / 2 NON-BLOCKING / 0 PRE-EXISTING findings). Executive summary unavailable.

Findings

  • [BLOCKING] src/domain/repository/github-pr-operations.ts:722 — GraphQL limits comments to first 10 per thread with no pagination or truncation signal — silently drops older comments
    In REVIEW_THREADS_QUERY (src/domain/repository/github-pr-operations.ts:722), the query requests comments(first: 10) but there is no pagination loop to fetch remaining comments and no per-thread truncated indicator in the returned shape. The exported ReviewThread interface and mapping (lines ~780–810) imply comments is the ordered list for the thread, but with this query it will include at most 10 items, potentially omitting older comments on long threads. This is a silent data loss that can mislead reviewers and break iterative review convergence. Please either: (a) page through all comments for each thread (or request a sufficiently high cap and document any truncation), and/or (b) surface a per-thread truncation flag so consumers can detect incomplete data. Update tests to cover >10-comment threads and ordering guarantees.
  • [BLOCKING] src/domain/session/commands/pr-review-context-subcommand.ts:146 — Token provider uses user token unconditionally for service token, diverging from stated identity flow and potentially bypassing app auth
    In fetchReviewThreads (src/domain/session/commands/pr-review-context-subcommand.ts:133–170), the code constructs a token provider with const userToken = cfg.github?.token ?? ""; const tokenProvider = createTokenProvider(cfg.github ?? {}, userToken); and then calls getToken: () => tokenProvider.getServiceToken(). This appears to route the service token through a provider initialized with the user PAT value, potentially returning the user token rather than the app installation token. The PR description claims identity flow routes through gh.getToken() "using the same TokenProvider path as other PR operations," but other code paths typically use gh.getToken() from the backend’s configured provider (app token). This change risks privilege escalation or permission mismatch and contradicts "Non-fatal failure"/"Identity flow" design. Please: (1) confirm createTokenProvider semantics; (2) if service token should be app token, initialize the provider exactly as elsewhere (pull from session/backend rather than ad hoc config), or add getUserToken fallback explicitly like other ops; (3) add tests for missing/invalid tokens.
  • [BLOCKING] src/domain/repository/github-pr-operations.ts:812 — Error handling contradicts "Non-fatal failure" requirement; throws MinskyError instead of returning empty result
    getPRReviewThreads (src/domain/repository/github-pr-operations.ts:770–820) wraps the entire fetch in a try/catch and rethrows a new MinskyError on any non-Minsky error. The PR description and subcommand’s fetchReviewThreads emphasize that review-thread fetch should be non-fatal and degrade to empty results so the rest of review context proceeds. Exposing a throwing function here causes upstream callers that don’t handle it to fail hard. Either (a) make getPRReviewThreads return { threads: [], truncated: false } on recoverable GraphQL/token/network errors, logging at debug, or (b) clearly document that callers must catch and swallow errors (and enforce via types/tests). Currently, only the subcommand’s wrapper swallows; other potential call sites could inadvertently crash.
  • [BLOCKING] src/domain/repository/github-pr-operations-review-threads.test.ts:1 — Spec/tests file path mismatch with PR description; repository contains different test filename
    The diff adds tests at src/domain/repository/github-pr-operations-review-threads.test.ts, but the PR description and Testing section refer to src/domain/repository/github-pr-review-threads.test.ts. This mismatch suggests either stale documentation or an incorrect test file path. Ensure naming is consistent and aligned with project conventions (other tests under this directory), and CI config picks up the intended file. If a previous file exists with the other name, de-duplicate or remove to avoid confusion.
  • [NON-BLOCKING] src/domain/repository/github-pr-operations.ts:748 — Missing explicit comment ordering; code assumes "oldest first" but GraphQL default may differ
    ReviewThread.comments is documented as "Ordered list of comments in the thread (oldest first)" (src/domain/repository/github-pr-operations.ts:748), but the GraphQL query does not specify an orderBy for comments. Depending on GitHub’s default, the order may be newest-first or unspecified. If consumers rely on chronological order, specify ordering via GraphQL (if supported) or explicitly sort by createdAt after mapping. Add a test covering two comments with out-of-order timestamps to assert the intended order.
  • [NON-BLOCKING] src/domain/repository/github-pr-operations.ts:706 — Stop condition incurs extra page fetch at exact cap; can avoid one network call
    The truncation logic fetches the next page even when allThreads.length has already reached MAX_REVIEW_THREADS, as noted in tests. While acceptable, you can check if (!hasNextPage || allThreads.length >= MAX_REVIEW_THREADS) before issuing the next GraphQL call to spare one request on large PRs. Add a comment or a small optimization.

Inline comments

  • src/domain/repository/github-pr-operations.ts:697 — Nit: Consider exporting MAX_REVIEW_THREADS or documenting its rationale (200) alongside ADR-008 reference. If downstream consumers need to present a notice, they may also want the cap value for messaging.

BLOCKING 1: Add truncatedComments test coverage - add totalCount to
GraphQLNode test interface and three new tests: truncatedComments true/false
boundary (totalCount > 10), recoverable GraphQL error returns empty result,
token acquisition failure returns empty result.

BLOCKING 2: Fix fetchReviewThreads token provider pattern - scope the
getServiceToken call to the specific repository using owner/repo string,
matching the pattern from asks-github-client.ts. FallbackTokenProvider
ignores the scope, so this is safe in both user-PAT and GitHub App paths.

BLOCKING 3: Make getPRReviewThreads non-fatal - replace the MinskyError
rethrow with graceful degradation: token failures and GraphQL errors each
return empty result and log at debug level. Update JSDoc to document this
contract. Only the token-acquisition and per-page GraphQL call are wrapped;
iteration logic errors remain unexpected.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


Good progress adding reviewThreads with pagination and DI, plus tests. However, I’m blocking on three issues: (1) the GraphQL response is dereferenced without null-guards, which can throw and violate the non-fatal contract; (2) per-thread comment truncation (comments(first: 10)) isn’t exposed to the session result API, creating silent data loss for long threads; and (3) the session result shape adds required fields without versioning or opt-in, risking downstream breakage. Non-blocking notes: clarify ADR/schema coverage and consider a minor pagination optimization to avoid an extra fetch at the cap. Address the blockers and this should be mergeable.

Findings

  • [BLOCKING] src/domain/repository/github-pr-operations.ts:820 — Possible runtime TypeError on null GraphQL subfields — violates non-fatal contract
    After the GraphQL call, the code unconditionally dereferences response.repository.pullRequest.reviewThreads to assign const page = response.repository.pullRequest.reviewThreads;. If the repository or pullRequest node is null (e.g., 403/404, cross-repo permissions, or PR not found in scope), Octokit will return data with null subfields rather than throwing. This will cause a runtime TypeError during property access, escaping your try/catch (which only wraps the graphql call) and breaking the "non-fatal failure" guarantee documented in the function header. Add a null-guard immediately after the call and return the emptyResult when response.repository?.pullRequest?.reviewThreads is absent.
  • [BLOCKING] src/domain/repository/github-pr-operations.ts:760 — Silent comment truncation without explicit contract propagation to caller
    The GraphQL query requests comments(first: 10) for each thread (line ~760), but ReviewThread.comments is documented as "Ordered list of comments ... (up to 10)" and a boolean truncatedComments is added. However, the public-facing SessionPrReviewContextResult in src/domain/session/commands/pr-review-context-subcommand.ts does not surface truncatedComments per thread; it only exposes reviewThreads and a top-level reviewThreadsTruncated. Downstream consumers cannot detect per-thread truncation to display a "more comments" notice as your comment implies. Either (a) include truncatedComments in the session result shape or (b) fetch all comments (paginate) so per-thread truncation never occurs. As-is this is a silent behavior change that will mislead reviewers on long threads.
  • [BLOCKING] src/domain/session/commands/pr-review-context-subcommand.ts:27 — Session result shape change is not versioned — backwards compatibility risk
    SessionPrReviewContextResult gains two required fields: reviewThreads and reviewThreadsTruncated (lines 27–37). This is a breaking change for any consumer expecting the previous shape. The PR description mentions reviewer prompt changes are out of scope, but adding required fields without a version bump or defaulting in the transport/protocol can break existing clients. Make these fields optional, gate behind a feature flag, or rev the API contract so downstream consumers can roll forward safely.
  • [NON-BLOCKING] src/domain/repository/github-pr-operations.ts:742 — Spec/ADR alignment unclear — missing schema link or field coverage (e.g., thread state)
    The PR description cites ADR-008 and GitHub's GraphQL pullRequest.reviewThreads as the canonical source for thread state. The current query does not request fields like isResolved, isOutdated, isCollapsed (good — included), but it omits other potentially relevant state such as isPending or viewerCanResolve/viewerCanUnresolve, and it does not capture per-comment metadata like updatedAt, url, or system notes. If ADR-008 expects full parity with GitHub UI state used later for auto-resolve flows (mt#1345), consider documenting the chosen subset and justifying exclusions in-code with comments and a link to the schema docs to ease future maintenance.
  • [NON-BLOCKING] src/domain/repository/github-pr-operations.ts:900 — Pagination loop may perform an unnecessary extra fetch at exactly 200 threads
    When allThreads.length reaches MAX_REVIEW_THREADS (200) during processing of a page, you set truncated = true and hasNextPage = false, but this happens inside the node loop for the already-fetched page. In the common case of exactly 200 total threads (e.g., 4 full pages of 50), you will still have fetched the 4th (final) page even if the 3rd page already completed the cap. A micro-optimization: compute remaining capacity before pushing nodes and early-break without fetching the subsequent page. Tests already mention this, but if API quotas matter, consider short-circuiting earlier to reduce one GraphQL call.

Inline comments

  • src/domain/repository/github-pr-operations-review-threads.test.ts:210 — Nice that you called out the extra page fetch in the comment. If you do adopt the earlier short-circuit in production code, please update this assertion to the new expected number of calls (4) and keep the explanatory note in sync.

@edobry
Copy link
Copy Markdown
Owner

edobry commented Apr 30, 2026

@minsky-reviewer please re-review — fixes pushed addressing prior round's findings.

Three round-2 BLOCKINGs on commit a1b3d49; addressing the two real
ones, documenting the false-positive third.

1. Null-guard on response.repository.pullRequest.reviewThreads
   (github-pr-operations.ts getPRReviewThreads).
   Bot correctly flagged that GitHub returns data with null subfields
   (rather than throwing) when the repository or pullRequest is
   inaccessible — cross-repo permissions, PR not found in scope, 403/404.
   Without a guard, the dereference would throw a TypeError that escapes
   the per-call try/catch and breaks the documented non-fatal contract.
   Added optional-chaining + null-check; returns the empty result with
   a debug log when the subfield is null. Added a unit test that
   simulates a null repository field and asserts the empty result is
   returned (not a TypeError).

2. Backward-compatibility: marked SessionPrReviewContextResult fields
   reviewThreads and reviewThreadsTruncated as optional (added ?).
   Pre-mt#1343 callers that read the result shape but did not reference
   these fields will continue to compile. Updated JSDoc to reflect the
   optional status and to point at the per-thread truncatedComments
   flag (which IS surfaced — see the false-positive note below).

3. False positive on truncatedComments not surfaced (BLOCKING 2 from
   the bot review). The bot claimed downstream consumers cannot detect
   per-thread truncation. Verified: the ReviewThread interface returned
   in the reviewThreads array DOES carry truncatedComments per thread,
   set from node.comments.totalCount > 10 in the mapping at line 866
   of github-pr-operations.ts. The session result returns
   reviewThreads: ReviewThread[] so each thread element exposes its
   own truncatedComments flag. Documented this explicitly in the new
   reviewThreads JSDoc so the next reviewer round does not re-flag it.

Skipping the NON-BLOCKINGs (additional GraphQL fields, micro-pagination
optimization).
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


The reviewer ran but produced no findings. This is not an approval — the model emitted no submit_finding, submit_inline_comment, or conclude_review calls.

@edobry edobry merged commit 52f8374 into main May 1, 2026
2 checks passed
@edobry edobry deleted the task/mt-1343 branch May 1, 2026 05:04
edobry added a commit that referenced this pull request May 1, 2026
Resolves the merge conflict between this branch and main. Main has
advanced with PRs #896 (mt#1343 review threads), #898 (mt#1342 thread
resolve), #906 (mt#1341 suggestion blocks), and #907 (mt#1348 review
state labels). All four touch overlapping surface (github-pr-review.ts,
github.ts, repository index, session-parameters, workflow-commands,
session.ts, pr-subcommands) — and #898 also added new files this branch
needs (pr-review-thread-resolve subcommand and adapter).

This commit copies main main content for the new and updated files so
the in-progress submitCheckRun additions in this branch can co-exist
without the no-marker abort path of session_update (mt#1303).

Files added or replaced:
- diff-anchor-validator.ts (new from #887 mt#1347)
- review-state-labels.ts (new from #907 mt#1348)
- github-pr-review.ts (updated by #898 #906 #907)
- github-pr-operations.ts (updated by #896)
- pr-review-context-subcommand.ts (updated by #896)
- pr-review-thread-resolve-subcommand.ts (new from #898)
- pr-review-thread-resolve-command.ts (new from #898)
- github-pr-review.test.ts (updated by #906)
- github-pr-operations-review-threads.test.ts (new from #896)
- github-pr-review-threads.test.ts (new from #898)
- review-state-labels.test.ts (new from #907)

Files merged in place to keep this branch additions plus main additions:
- github.ts (review wrapper now has resolveReviewThread plus unresolveReviewThread plus submitCheckRun)
- repository index.ts (ReviewOperations interface adds the same three optional methods)
- session-parameters.ts (PrReviewSubmit comments schema gains startLine startSide suggestion; adds PrReviewThreadResolveCommandParams alongside PrCheckRunSubmitCommandParams)
- session.ts (registers PrReviewThreadResolve and PrCheckRunSubmit)
- workflow-commands.ts (re-exports the same)
- pr-subcommands.ts (barrel exports the same)

Typecheck clean. 123 of 123 tests pass across the 6 reviewer-output
test files. Next step is session_update to fast-forward this branch
to main now that all conflicts are pre-resolved by content match.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authorship/co-authored Co-authored by human and AI agent

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant