fix(#1839): add technical doc accuracy to correctness sub-agent by fullsend-ai-coder[bot] · Pull Request #1840 · fullsend-ai/fullsend

fullsend-ai-coder · 2026-06-03T14:42:07Z

The correctness sub-agent was declaring "zero correctness surface area" on documentation-only PRs, even when those documents contained implementation plans with verifiable technical claims (algorithm descriptions, pseudocode, CLI flag semantics, API behavior claims). Human reviewers on PR #1804 found 9 confirmed technical accuracy issues that the bot missed.

Changes:

- Updated the correctness sub-agent definition to own technical accuracy in implementation plans and design documents, with specific evaluation guidance for algorithm logic, API/library behavior claims, design document alignment, internal consistency, and edge case correctness.
- Updated SKILL.md section 3b to classify docs/plans/ files and technical documentation as having correctness surface area, ensuring the correctness sub-agent is dispatched for such PRs.
- Added an implementation plan example to the dispatch table.

Note: make lint could not run (sandbox Go toolchain permission error unrelated to these markdown-only changes). Pre-commit encountered the same infrastructure error (exit code 3). The post-script runs authoritative pre-commit on the runner.

Closes #1839

Post-script verification

Branch is not main/master (agent/1839-correctness-technical-docs)
Secret scan passed (gitleaks — 1088f9b74b9ed046b902bf25e6ce4204339c99ee..HEAD)
Pre-commit hooks passed (authoritative run on runner)
Tests ran inside sandbox

Signed-off-by: fullsend-code <fullsend-code@users.noreply.github.com>

github-actions · 2026-06-03T14:43:50Z

Site preview

Preview: https://c7e4c249-site.fullsend-ai.workers.dev

Commit: 542ab8c498a4d488ac3b40b41087a9b13746cca8

fullsend-ai-review · 2026-06-03T14:51:40Z

Review

Findings

Medium

[edge-case] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:196 — The category table in section 3a was not updated with documentation-accuracy categories. The correctness sub-agent now owns technical accuracy in implementation plans, but the re-review routing table only lists code-oriented categories (logic-error, nil-deref, off-by-one, etc.). New categories the sub-agent may produce (e.g., algorithm-error, api-claim-incorrect, design-inconsistency) will only route correctly via the fallback rule ("to correctness as a fallback"). If the fallback rule is ever changed, these findings would be misrouted during re-reviews.
Remediation: Add documentation-accuracy categories (e.g., algorithm-error, api-claim-incorrect, design-inconsistency, edge-case-gap) to the correctness row of the category table in section 3a.
[incomplete-doc] docs/problems/code-review.md:55 — The Correctness agent section in the code-review problem doc describes the sub-agent's scope (logic errors, edge cases, test adequacy, split-payload attacks) but does not mention the new responsibility for technical documentation with correctness surface area. This doc is now stale relative to the expanded correctness.md definition.
Remediation: Add a bullet noting that technical documentation with correctness surface area (algorithm logic, API behavior claims, design documents under docs/plans/) is also reviewed by the correctness sub-agent.
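To illustrate the routing concern in the first Medium finding, here is a minimal sketch of a category table with a fallback rule. It is not the orchestrator's actual implementation: the dictionary shape, the `route` function, and the `security` sub-agent name are assumptions made for illustration. Only the category names and the "correctness as a fallback" rule come from the finding itself.

```python
# Hypothetical model of the section 3a category table. The real table lives in
# SKILL.md as markdown; this dict shape is an assumption for illustration.
CATEGORY_ROUTES = {
    "logic-error": "correctness",
    "nil-deref": "correctness",
    "off-by-one": "correctness",
}
FALLBACK_AGENT = "correctness"

# Documentation-accuracy categories named in the finding's remediation.
DOC_CATEGORIES = [
    "algorithm-error",
    "api-claim-incorrect",
    "design-inconsistency",
    "edge-case-gap",
]


def route(category, routes=CATEGORY_ROUTES, fallback=FALLBACK_AGENT):
    """Return (agent, used_fallback) for a finding category."""
    if category in routes:
        return routes[category], False
    return fallback, True


# Today: a documentation category reaches correctness only via the fallback.
print(route("algorithm-error"))

# If the fallback ever changed ("security" is a made-up agent name),
# the same finding would be misrouted.
print(route("algorithm-error", fallback="security"))

# With the remediation applied, routing no longer depends on the fallback.
explicit = {**CATEGORY_ROUTES, **{c: "correctness" for c in DOC_CATEGORIES}}
print(route("algorithm-error", routes=explicit, fallback="security"))
```

The last call shows why listing the categories explicitly is the more robust option: the result stays the same whatever the fallback is set to.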

Low

[pattern-inconsistency] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:215 — The new classification bullet uses an em dash (—) as an explanatory aside before the arrow (→), a pattern not used in the other bullets in this list. Minor stylistic inconsistency; the em dash serves a legitimate clarification purpose.
[design-direction] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:214 — The new classification criterion introduces content-based detection patterns ("algorithm descriptions, pseudocode, data structure definitions") but existing dispatch triage in section 3b is primarily domain-based (file paths, changed symbols). It is unclear whether the orchestrator should inspect file contents or rely solely on the docs/plans/ path prefix. See also: [edge-case] finding at SKILL.md:196.
[code-organization] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:253 — The new dispatch example "Implementation plan in docs/" is inserted at the top of the table. The existing table appears roughly ordered by complexity. Consider placing it after "Typo fix in README" since both are documentation-focused.
[pattern-inconsistency] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/correctness.md:14 — The Own: section now mixes parenthetical clarifying questions with a declarative addition, slightly breaking parallel structure.
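The ambiguity raised in the [design-direction] finding can be made concrete with a small sketch that contrasts the two triage strategies. Both functions and the marker list are hypothetical. The review does not say which approach the orchestrator uses, and the keywords below are only loosely based on the phrases quoted in the finding.

```python
import re

# Path-based triage: relies only on the docs/plans/ prefix named in the PR.
PLAN_PREFIX = "docs/plans/"

# Content-based triage: hypothetical keyword markers, loosely based on the
# phrases quoted in the finding. A real heuristic would need more care.
CONTENT_MARKERS = re.compile(
    r"\b(pseudocode|algorithm|data structure)\b", re.IGNORECASE
)


def has_surface_by_path(path):
    """True if the changed file sits under the plans directory."""
    return path.startswith(PLAN_PREFIX)


def has_surface_by_content(text):
    """True if the changed text mentions any of the assumed markers."""
    return bool(CONTENT_MARKERS.search(text))


# A design document outside docs/plans/ shows where the two strategies differ.
path = "docs/design/cache.md"  # made-up file name
text = "The eviction algorithm removes the least recently used entry."
print(has_surface_by_path(path), has_surface_by_content(text))
```

Under these assumptions the path check skips the file while the content check flags it, which is the gap the finding asks the skill text to resolve.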

Info

[architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:13 — Pre-existing ADR-0018 deviation. The skill already documents this as an "approved temporary exception." Not introduced by this PR.
[api-contract] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/correctness.md:22 — The guidance to "cross-check against known behavior" for API/library claims relies on model training knowledge, which has a cutoff date. For claims about internal APIs, tool-based verification against repo source would be more reliable.

github-actions deployed to site-preview · June 3, 2026 14:43

fullsend-ai-review added the requires-manual-review label ("Review requires human judgment") · Jun 3, 2026

fullsend-ai-coder[bot] wants to merge 1 commit into main from agent/1839-correctness-technical-docs
