Skip to content

docs(mt#1109): E2E smoke artifact — add v1-deployed status to reviewer README#703

Open
minsky-ai[bot] wants to merge 2 commits into
mainfrom
task/mt-1109
Open

docs(mt#1109): E2E smoke artifact — add v1-deployed status to reviewer README#703
minsky-ai[bot] wants to merge 2 commits into
mainfrom
task/mt-1109

Conversation

@minsky-ai
Copy link
Copy Markdown
Contributor

@minsky-ai minsky-ai Bot commented Apr 22, 2026

Purpose

This PR is a deliberate minimal-change artifact. The goal is to exercise the minsky-reviewer App end-to-end now that mt#1100's getAppIdentity fix is deployed — not to add anything meaningful to the codebase.

If all infrastructure is working, this PR will receive a review from minsky-reviewer[bot] (distinct App identity from the PR author minsky-ai[bot]), generated by GPT-5 via the Railway-hosted webhook service, posted automatically without any main-agent involvement.

Success criteria (observational)

  • Railway logs show event=review_result status=reviewed for this PR
  • PR's reviews list shows a review authored by minsky-reviewer[bot]
  • Review body begins with "Independent adversarial review" per the Critic Constitution
  • Review event is APPROVE / REQUEST_CHANGES / COMMENT reflecting actual analysis

Scope

Change: one line added to services/reviewer/README.md documenting the v1 deploy status and linking to mt#1099.

Related

  • mt#1099 — parent task (E2E smoke test)
  • mt#1083 — reviewer scaffold
  • mt#1100 — getAppIdentity fix (deployed, unblocking this test)
  • mt#1107 — filing for Railway auto-deploy (so future changes deploy without manual railway up)

…moke artifact

Minimal-diff PR whose purpose is to exercise the reviewer App end-to-end after mt#1100 deployed the getAppIdentity fix. Webhook should fire pull_request.opened, service should post an adversarial review under minsky-reviewer[bot] via GPT-5.

Success criteria for this task are observational, not about the diff itself.
@minsky-ai minsky-ai Bot added the authorship/co-authored Co-authored by human and AI agent label Apr 22, 2026
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: 2


Independent adversarial review

Findings

  • [BLOCKING] Spec-diff mismatch: promised “linking to mt#1099” but added plain text only

    • File: services/reviewer/README.md:4
    • The PR scope states “Change: one line added … linking to mt#1099.” The inserted line reads “Status: v1 deployed 2026-04-22; end-to-end smoke test tracked in mt#1099.” with “mt#1099” as plain text, not a Markdown link. If “mt#1099” is intended to be an actionable reference, this does not fulfill the stated scope. Please convert to a proper link (e.g., mt#1099) or adjust the PR text to remove the claim about linking.
  • [NON-BLOCKING] Ambiguity of “v1” and deployment context

    • File: services/reviewer/README.md:4
    • “v1 deployed 2026-04-22” lacks context (which environment on Railway, GitHub App slug, release tag/commit). This can lead to confusion and drift. Suggest adding either the environment name (e.g., prod), a link to the Railway service/environment, and/or a release tag/commit hash.
  • [NON-BLOCKING] Staleness risk of hard-coded date

    • File: services/reviewer/README.md:4
    • A static “deployed 2026-04-22” will become stale quickly. Consider “As of 2026-04-22” plus a link to a living source of truth (deployment status dashboard, task, or changelog), or move this to a Status section that is expected to change.
  • [NON-BLOCKING] Style/placement consistency

    • File: services/reviewer/README.md:4
    • The italicized top-of-file status line is a different pattern than the later “Part of Sprint A (mt#1083) under mt#1073.” Consider consolidating into a dedicated “Status” or “Project tracking” section for readability and consistency.
  • [PRE-EXISTING] Unlinked task references elsewhere

    • File: services/reviewer/README.md:6 (existing line)
    • The README already references “mt#1083” and “mt#1073” without links. If the desired convention is to make these clickable, consider normalizing the entire file. If plain “mt#” tokens are the standard, clarify that in contributing/docs to avoid future ambiguity.

Spec verification

Note: The PR has no formal task spec; the description lists observational success criteria that are out-of-band from the code diff. These cannot be verified within this review.

  • Railway logs show event=review_result status=reviewed for this PR — N/A (out-of-band; cannot be verified in code review)
  • PR’s reviews list shows a review authored by minsky-reviewer[bot] — N/A (depends on this very event; not verifiable prior to posting)
  • Review body begins with “Independent adversarial review” — Met (this review begins with that string)
  • Review event is APPROVE / REQUEST_CHANGES / COMMENT reflecting actual analysis — Met (this review is REQUEST_CHANGES with concrete findings)

Documentation impact

  • This PR is itself a docs change. No additional architecture notes required.
  • To align with the PR’s stated scope, add an actual hyperlink target for mt#1099 or revise the text to not claim “linking.”
  • Optional: add environment/release links for traceability; consider a “Status” section for ongoing updates.

Event: REQUEST_CHANGES

The reviewer flagged a real spec-diff mismatch on PR #703 — the PR description said linking to mt#1099 but the actual diff had mt#1099 as plain text. Fix: converted the reference to a proper Markdown link. Also disambiguated the environment with Railway production so a future reader knows which deploy target the v1 line describes.

Demonstrates the Chinese-wall review feedback loop end-to-end: GPT-5-via-reviewer-App flagged a spec-diff mismatch that a rubber-stamp or helpfulness-biased reviewer would have passed over. First live cross-provider acid-test hit.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: 2


Independent adversarial review

Findings

  • [BLOCKING] services/reviewer/README.md:5 — The link to the tracking issue is malformed: https://github.com/edobry/minsky/issues/mt-1099. GitHub issue URLs are numeric (e.g., issues/1099). As written, issues/mt-1099 will 404. This defeats the purpose of documenting the smoke test linkage. Please update to the correct issue URL or a valid alternative (e.g., discussions or a label link).
  • [NON-BLOCKING] services/reviewer/README.md:5 — The hardcoded deploy status (“v1 deployed 2026-04-22 to Railway production”) is likely to drift. Consider linking to a tag/release or commit SHA and, if relevant, a status badge (or environment URL) to keep this accurate without manual edits.
  • [NON-BLOCKING] services/reviewer/README.md:5 — “v1” is undefined in this README. If you’re introducing a versioning concept, consider referencing the corresponding tag/release or adding a brief note in a CHANGELOG to avoid ambiguity about what “v1” maps to.
  • [PRE-EXISTING] services/reviewer/README.md:3 — The Notion link for the Structural Review position paper may be inaccessible to external contributors. Consider mirroring or adding a public reference. This is pre-existing and not introduced by this PR.

Spec verification table

  • Scope: “one line added to services/reviewer/README.md documenting the v1 deploy status and linking to mt#1099” — Not Met (line added as described, but the link target appears invalid)
  • Railway logs show event=review_result status=reviewed — N/A (operational criterion, not verifiable in code review)
  • PR’s reviews list shows a review by minsky-reviewer[bot] — N/A (operational)
  • Review body begins with “Independent adversarial review” — N/A (operational)
  • Review event reflects actual analysis — N/A (operational)

Documentation impact

  • This PR itself changes docs. If keeping a status line in the README, consider:
    • Linking to a stable artifact (release/tag or commit) and the production environment URL or a status badge to minimize drift.
    • Defining “v1” in a CHANGELOG or Releases page to ground the version reference.

Event: REQUEST_CHANGES

@edobry
Copy link
Copy Markdown
Owner

edobry commented Apr 22, 2026

Closing as successful smoke artifact for mt#1099 (Sprint A E2E verification).

What this PR confirmed (the signal we wanted):

  • minsky-reviewer[bot] posted independent reviews via GPT-5 on the Railway webhook, with a distinct App identity from the author minsky-ai[bot]
  • mt#1100's getAppIdentity fix is live in production
  • End-to-end: PR create → tier extraction → model dispatch → posted review works

Why not merge: the reviewer's second finding was actually correct — the /issues/mt-1099 link target would 404 since Minsky tasks aren't GitHub issues. Merging would ship a stale-dated README line with a broken link.

What was also confirmed (the calibration problem): on a 1-line README PR the reviewer REQUEST_CHANGES'd twice with valid-but-pedantic findings. This is the opposite-failure-mode the Structural Review paper predicted (0% approval on trivial PRs vs. rubber-stamp). Tracked as mt#1110 (Sprint B).

(had Claude draft this on my behalf)

edobry added a commit that referenced this pull request May 11, 2026
…073 audit-traceability)

## Summary

Closes the audit-traceability residual of mt#1511 (parent: mt#1073 Sprint A epic). The substantive Chinese-wall isolation work shipped via **mt#1083** (DONE — minsky-reviewer Railway service) and the existing `.claude/agents/reviewer.md` `tools:` allowlist; this PR adds explicit mt#1073 design-constraint citations to the two load-bearing files so the structural enforcement is traceable from code back to the originating design requirement.

## Status before this PR

4 of 5 of mt#1511's original success criteria were already met:

- Reviewer subagent invoked in fresh session with no inherited implementer context — enforced by Claude Code's Agent tool (local surface) and Railway process separation (cloud surface).
- Tool allowlist excludes write/mutation tools — `tools:` frontmatter in `.claude/agents/reviewer.md`.
- Reviewer receives exactly diff + spec + read-only codebase — `session_pr_review_context` (Mode 2), diff-file-path (Mode 1), and `fetchPullRequestContext` + `resolveTaskSpec` + `tools.ts` read-only allowlist (cloud).
- `.claude/agents/reviewer.md` reflects allowed tool set and context contract — cousin-surface block + Mode 1/Mode 2 protocol.

The only residual was the 5th criterion: explicit mt#1073 cite in code for audit-traceability. This PR closes it.

## Key changes

- `.claude/agents/reviewer.md` — extended the cousin-surface block with:
  - explicit "(per mt#1073 design constraints 2 and 3)" cite
  - new "Isolation boundary enforcement" sub-block naming the structural mechanisms on both surfaces (harness-level fresh prompt + curated tool allowlist for local; Railway process + separate App identity for cloud)
- `services/reviewer/src/review-worker.ts` — extended file-header JSDoc to document the Chinese-wall isolation contract: separate process, separate identity (mt#1083 + ADR-006), diff + spec + read-only access only. Names the cousin local surface and cross-references mt#1083/mt#1511.

## Out of scope

- `services/reviewer/README.md` — originally in-scope but collides with stale **open PR #703** (mt#1109, 2026-04-22). The README isolation-boundary section is informational; the agent file + worker file together satisfy audit-traceability. Recover when PR #703 resolves.

## Testing

No behavioral changes — pure documentation / code-comment edits. CI checks (TypeScript, lint, format) cover the file edits.

(Had Claude look into this — AI-assisted change.)

Co-Authored-By: minsky-ai[bot] <minsky-ai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authorship/co-authored Co-authored by human and AI agent

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant