[spec-review] Update Safe Outputs conformance checker for recent spec changes#23534

Merged
pelikhan merged 1 commit into main from spec-review/safe-outputs-conformance-2026-03-30-d0e4040490868938
Mar 30, 2026

@github-actions
Contributor

Summary

Updates the Safe Outputs conformance checker script to align with the newly added Safe Outputs specification (v1.15.0, added in commit b19fe61).

Specification Changes Reviewed

  • Commit b19fe61 (2026-03-29): Added docs/src/content/docs/reference/safe-outputs-specification.md v1.15.0 — a brand-new 4,718-line W3C-style specification covering:
    • Section 8.3: Protocol Exchange Patterns — MCE1–MCE5 requirements for MCP server constraint enforcement and dual enforcement pattern
    • Section 11: Cache Memory Integrity — CI1–CI12 requirements for integrity-aware cache branching
📋 Script Updates & Testing Details

Script Updates

New Checks Added

  • MCE-001 (Section 8.3 MCE2): Verifies that add_comment tool descriptions in pkg/workflow/js/safe_outputs_tools.json surface enforcement constraint limits to the LLM (65536 character limit, 10 mention limit, 50 link limit, and CONSTRAINTS/IMPORTANT keyword presence).

  • MCE-002 (Section 8.3 MCE4): Verifies the dual enforcement pattern — constraint limits must be enforced both at MCP gateway invocation time (safe_outputs_handlers.cjs) and at safe output processing time (add_comment.cjs). Both files must import comment_limit_helpers.cjs.

  • CI-001 (Section 11, CI6 + CI10): Verifies that both cache memory integrity scripts exist: actions/setup/sh/setup_cache_memory_git.sh and actions/setup/sh/commit_cache_memory_git.sh.

  • CI-002 (Section 11.2, CI7–CI12): Verifies that the cache memory setup script supports all four integrity levels (merged, approved, unapproved, none) and implements merge-down from higher-integrity branches (CI8), that the commit script invokes git gc --auto for compaction (CI11), and that missing .git directories are handled gracefully (CI12).
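
As an illustration only, checks in the style of MCE-002, CI-001, and CI-002 can be expressed as small grep and file tests. The function names, messages, and match patterns below are assumptions made for this sketch; they are not taken from scripts/check-safe-outputs-conformance.sh, whose contents are not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: names and patterns are assumptions, not the real script.

# MCE-002 style: every file passed in must reference the shared limit helper.
check_dual_enforcement() {
  local helper="comment_limit_helpers.cjs" file
  for file in "$@"; do
    if ! grep -qF -- "$helper" "$file" 2>/dev/null; then
      echo "FAIL: MCE-002: $file does not reference $helper"
      return 1
    fi
  done
  echo "PASS: MCE-002"
}

# CI-001 style: every path passed in must exist as a regular file.
check_scripts_exist() {
  local path
  for path in "$@"; do
    if [ ! -f "$path" ]; then
      echo "FAIL: CI-001: missing $path"
      return 1
    fi
  done
  echo "PASS: CI-001"
}

# CI-002 style: the setup script must mention all four integrity levels.
# -w matches whole words, so "unapproved" does not satisfy "approved".
check_integrity_levels() {
  local script="$1" level
  for level in merged approved unapproved none; do
    if ! grep -qwF -- "$level" "$script" 2>/dev/null; then
      echo "FAIL: CI-002: integrity level '$level' not found in $script"
      return 1
    fi
  done
  echo "PASS: CI-002"
}
```

Each function takes file paths as arguments, for example the paths to safe_outputs_handlers.cjs and add_comment.cjs for the dual enforcement check. A text match of this kind only shows that a name appears in a file; it does not prove the limit is enforced at runtime.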

Checks Modified (false positive fixes)

  • USE-001: Added skip patterns for apm_unpack and run_apm_unpack files (the APM Bundle Unpacker, which is not a safe output handler). Added a further filter: only check files that actually make octokit API calls or record safe output operations. Previously flagged 2 false positives.

  • USE-003: Narrowed the staged mode check to only flag files that reference the safe outputs-specific env var GH_AW_SAFE_OUTPUTS_STAGED (or the helper functions logStagedPreviewInfo/generateStagedPreview). Previously flagged generate_observability_summary.cjs as a false positive because it uses a generic staged field unrelated to the spec's staged mode.
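
The two narrowing rules above can be sketched as file filters. This is a minimal illustration under stated assumptions: the helper names are hypothetical, and the real script may match file names and file contents differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: helper names are assumptions, not the real script.

# USE-001 style skip: returns 0 for APM Bundle Unpacker files, matched by
# base name so the directory they live in does not matter.
should_skip_use001() {
  case "$(basename "$1")" in
    apm_unpack*|run_apm_unpack*) return 0 ;;
    *) return 1 ;;
  esac
}

# USE-003 style filter: returns 0 only when the file references the
# safe-outputs staged mode env var or one of its preview helpers.
references_staged_mode() {
  grep -qE 'GH_AW_SAFE_OUTPUTS_STAGED|logStagedPreviewInfo|generateStagedPreview' "$1" 2>/dev/null
}

# Print only the files that a staged-mode check should go on to inspect.
staged_mode_candidates() {
  local file
  for file in "$@"; do
    if references_staged_mode "$file"; then
      echo "$file"
    fi
  done
}
```

With this shape, a file that only contains a generic staged field is never printed by staged_mode_candidates, which is the behavior the USE-003 fix describes for generate_observability_summary.cjs.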

Updated Documentation

  • Updated script header to reference the specification file path and version (v1.15.0, 2026-03-29).

Testing

Ran the updated script — all 19 checks pass with zero failures:

PASS: SEC-001 through IMP-003 (14 existing checks)
PASS: MCE-001: Tool descriptions properly disclose enforcement constraints
PASS: MCE-002: Dual enforcement pattern implemented in both gateway and processor
PASS: CI-001: Cache memory integrity scripts exist
PASS: CI-002: Cache memory integrity branching properly implemented

Critical Failures: 0 | High Failures: 0 | Medium Failures: 0 | Low Failures: 0
PASS: All checks passed

Previous baseline had 3 LOW false positives (USE-001 × 2, USE-003 × 1) — all resolved.
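
For readers unfamiliar with how a summary line like the one above can be produced, here is a minimal sketch of per-severity failure counters. The variable and function names are hypothetical, and the wording of the failure message is an assumption; only the summary line format and the pass message are copied from the output shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical severity counters; the real script's variables are not shown here.
critical=0
high=0
medium=0
low=0

# Increment the counter for one failed check of the given severity.
record_failure() {
  case "$1" in
    critical) critical=$((critical + 1)) ;;
    high)     high=$((high + 1)) ;;
    medium)   medium=$((medium + 1)) ;;
    low)      low=$((low + 1)) ;;
    *)        echo "unknown severity: $1" >&2; return 2 ;;
  esac
}

# Print the tally and return non-zero when any failure was recorded.
print_summary() {
  echo "Critical Failures: $critical | High Failures: $high | Medium Failures: $medium | Low Failures: $low"
  if [ $((critical + high + medium + low)) -eq 0 ]; then
    echo "PASS: All checks passed"
  else
    # Assumed wording: the failing message is not shown in this PR.
    echo "FAIL: Some checks failed"
    return 1
  fi
}
```

Under this sketch, the previous baseline would correspond to three record_failure low calls before print_summary, and the current run to none.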

Related Files

  • Specification: docs/src/content/docs/reference/safe-outputs-specification.md
  • Conformance Script: scripts/check-safe-outputs-conformance.sh

Generated by Weekly Safe Outputs Specification Review

  • expires on Apr 6, 2026, 11:16 AM UTC

- Update header to reference the specification file and version
- Fix USE-001 false positives: skip apm_unpack and observability files;
  only check handlers that actually use octokit or record safe outputs
- Fix USE-003 false positives: only flag files using the actual staged
  mode env var (GH_AW_SAFE_OUTPUTS_STAGED), not generic 'staged' patterns
- Add MCE-001: verify tool descriptions disclose constraint limits
  (Section 8.3 MCE2 - 65536 chars, 10 mentions, 50 links for add_comment)
- Add MCE-002: verify dual enforcement of comment constraints at both
  MCP invocation time and processing time (Section 8.3 MCE4)
- Add CI-001: verify cache memory integrity scripts exist
  (setup_cache_memory_git.sh and commit_cache_memory_git.sh - CI6, CI10)
- Add CI-002: verify integrity branch support in cache scripts
  (all 4 levels, merge-down, git gc compaction, no-.git fallback - CI7-CI12)

Specification: docs/src/content/docs/reference/safe-outputs-specification.md v1.15.0

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions bot added the automation, documentation, and safe-outputs labels on Mar 30, 2026
@pelikhan merged commit f318b34 into main on Mar 30, 2026
@pelikhan deleted the spec-review/safe-outputs-conformance-2026-03-30-d0e4040490868938 branch on March 30, 2026 14:26