DEVOP-617: org-wide go.mod replace-directive audit by srt0422 · Pull Request #9 · allora-network/.github

srt0422 · 2026-05-25T22:18:25Z

Summary

Adds a weekly + manual-dispatch audit that catches the Go-side
Shai-Hulud vector: a compromised go.mod replace directive
redirecting a legitimate import to an attacker fork. Complements
DEVOP-560 (PR #8) — DEVOP-560 is the deep daily forensic clone sweep,
this is the lighter no-clone Contents-API pass that runs weekly across
every Go module in the org.

Linear: https://linear.app/alloralabs/issue/DEVOP-617

What this PR adds

.github/workflows/gomod-replace-audit.yml — weekly Mon 05:17 UTC + workflow_dispatch.
- Enumerates every org Go module via gh search code --owner allora-network 'filename:go.mod' --limit 200.
- Fetches each go.mod via gh api repos/<r>/contents/<p> (no cloning).
- Runs the canonical awk extractor from shai-hulud-defense/REFERENCE.md (handles single-line and replace (...) block form).
- Classifies each RHS:
  - legitimate-allowlisted-host — RHS host in canonical trusted-host allowlist (github.com/(allora-network|cosmos|ethereum|fluxcd) | gopkg.in | google.golang.org | go.uber.org | go.opentelemetry.io | k8s.io | sigs.k8s.io).
  - legitimate-version-pin — LHS module path == RHS module path. Structurally cannot redirect to an attacker fork; flagged only by the host filter otherwise.
  - legitimate-local-relative — ./... / ../... workspace replace.
  - investigate-absolute — /... (IOC-grade).
  - SUSPICIOUS — non-allowlisted host AND LHS != RHS (IOC-grade).
- SUSPICIOUS or investigate-absolute → rolling GitHub Issue (label gomod-replace-audit, distinct from DEVOP-560's shai-hulud-sweep) + Slack page via SLACK_SECURITY_WEBHOOK.
- Fetch failures → rolling-issue update only.
- Permissions: contents: read, issues: write. All uses: SHA-pinned (matches DEVOP-560 pins).
- actionlint clean (incl. shellcheck).
docs/security/gomod-replace-audit-2026-05-25.md — initial point-in-time audit report (executed locally before opening this PR).
docs/plans/2026-05-25-devop-617-gomod-replace-audit.md — short execution plan.
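
The classification rules listed above can be sketched as a small shell function. This is an illustrative reconstruction, not the workflow's actual code: the function name `classify_replace` is hypothetical, and the `GO_TRUSTED_HOSTS_RE` value is assembled here from the allowlist in the description plus the `(/|$)` boundary anchor mentioned later in the thread, so the committed regex may differ.

```shell
# ASSUMPTION: reconstructed allowlist regex; the committed value may differ.
GO_TRUSTED_HOSTS_RE='^(github\.com/(allora-network|cosmos|ethereum|fluxcd)|gopkg\.in|google\.golang\.org|go\.uber\.org|go\.opentelemetry\.io|k8s\.io|sigs\.k8s\.io)(/|$)'

# classify_replace LHS RHS  (hypothetical helper name)
# Prints one of the five classes from the PR description.
classify_replace() {
  lhs="$1"
  rhs="$2"
  case "$rhs" in
    ./*|../*) echo "legitimate-local-relative"; return ;;  # workspace replace
    /*)       echo "investigate-absolute"; return ;;       # absolute filesystem path
  esac
  if printf '%s\n' "$rhs" | grep -qE "$GO_TRUSTED_HOSTS_RE"; then
    echo "legitimate-allowlisted-host"
  elif [ "$lhs" = "$rhs" ]; then
    echo "legitimate-version-pin"    # same module path, only the version changes
  else
    echo "SUSPICIOUS"                # non-allowlisted host AND LHS != RHS
  fi
}
```

For example, `classify_replace github.com/cometbft/cometbft github.com/cometbft/cometbft` prints `legitimate-version-pin`, while a lookalike such as `github.com/cosmos-evil/sdk` fails the boundary anchor and is classified `SUSPICIOUS`.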

Top-line audit results (from the report)

| Metric | Count |
| --- | --- |
| Go modules scanned | 12 (across 11 repos) |
| Modules with zero replace directives | 9 |
| Total `replace` directives found | 6 |
| Allowlisted-host RHS | 2 |
| Same-path version-pin RHS (non-allowlisted host but `LHS == RHS`) | 4 |
| Local relative / absolute / SUSPICIOUS | 0 / 0 / 0 |
| Escalated to incident response | 0 |

No SUSPICIOUS findings. No escalation. Every current replace directive in the org is either:

- on an allowlisted host (cosmos), OR
- a same-path version pin (LHS module path == RHS module path, so it structurally cannot redirect).

Specifically the four non-allowlisted entries are all same-path version pins:

- allora-chain/go.mod: gin-gonic/gin v1.9.1 and syndtr/goleveldb v1.0.1-... (canonical Cosmos SDK simapp pins; adjacent comments reference cosmos/cosmos-sdk#10409).
- allora-sdk-go/go.mod + forge-v2/backend/go.mod: cometbft/cometbft v0.38.17 (canonical Cosmos BFT consensus engine).

See the full audit report for the row-by-row classification + recommendations (adding github.com/cometbft to the canonical allowlist is the obvious follow-up).

Coordination with PR #8 (DEVOP-560)

- Different branch (scott/devop-617-gomod-replace-audit), different files (workflow + plan + report all new).
- Different rolling-issue label (gomod-replace-audit vs shai-hulud-sweep) so the two pipelines don't collide.
- Cron offset (Mon 05:17 UTC vs daily 04:07 UTC) so they don't compete for the same scheduler slot.
- This workflow does NOT touch scripts/shai-hulud-ioc-sweep.sh or any of PR #8's files.

Test plan

- actionlint .github/workflows/gomod-replace-audit.yml (incl. shellcheck): clean.
- Local end-to-end audit produced 6 findings, 0 SUSPICIOUS (matches the report).
- workflow_dispatch after merge to validate the workflow run produces an artifact + (with current org state) leaves the rolling issue untouched.
- Inject a synthetic suspicious replace in a sandbox repo and re-run to confirm the rolling-issue + Slack path fires.

Made with Cursor

Summary by cubic

Adds a weekly and manual org audit that checks every go.mod replace for attacker redirects and alerts on anything outside the trusted-host allowlist. Meets DEVOP-617 acceptance criteria; improves parsing, scope checks (including partial private-coverage detection), and alerting to avoid false-clean runs and noisy pages.

New Features
- Adds .github/workflows/gomod-replace-audit.yml (Mon 05:17 UTC + workflow_dispatch).
- Scans via gh search code + gh api (no cloning); classifies replaces; SUSPICIOUS or absolute → rolling issue + Slack; fetch failures → issue only.
- Uploads audit.tsv and summary.md; SHA-pinned actions; minimal perms (contents: read, issues: write).
- Token-scope probe: fail loudly if the token can’t see private repos; require secrets.GH_ORG_READ_TOKEN or ack public-only runs via vars.ACCEPT_PUBLIC_ONLY_AUDIT=true.
- Docs added (plan + initial report). Initial audit: 12 modules, 6 replaces, 0 suspicious.
Bug Fixes
- AWK extractor: skip full-line // comments and strip trailing // ... on real directives to avoid false SUSPICIOUS.
- Contents API: normalize JSON null to empty; accept spec-valid go.mod with a module directive anywhere (leading comments allowed); otherwise treat as fetch failure.
- Slack paging: gate on success() and non-empty outputs; final run summary reflects audit failure vs clean.
- Warn when gh search code hits the 200-result cap.
- Corrected report: zero-replace modules is 9; added “modules with at least one replace” row.
- Scope probe: paginate private-repo listing to detect partial coverage (selected-repo tokens); fail unless acknowledged via vars.ACCEPT_PUBLIC_ONLY_AUDIT=true.
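
The extractor behaviour described in these fixes (single-line and block form, skipping full-line `//` comments, stripping trailing `// ...`) can be sketched roughly as follows. This is not the canonical extractor from shai-hulud-defense/REFERENCE.md, which is not reproduced in this thread; the function name `extract_replaces` and the three-column output (line number, LHS path, RHS path) are assumptions made for illustration.

```shell
# extract_replaces FILE  (hypothetical helper, not the canonical extractor)
# Prints "<line_no>\t<lhs>\t<rhs>" for each replace directive in FILE.
extract_replaces() {
  awk '
    # Print LHS (field "start") and the field right after "=>".
    function emit(start,    i) {
      for (i = start; i <= NF; i++)
        if ($i == "=>" && i > start && i < NF) {
          printf "%d\t%s\t%s\n", NR, $start, $(i + 1)
          return
        }
    }
    { sub(/[ \t]*\/\/.*$/, "") }      # drop full-line and trailing // comments
    NF == 0 { next }                  # nothing left on this line
    in_block && $1 == ")" { in_block = 0; next }
    !in_block && ($1 == "replace(" || ($1 == "replace" && $2 == "(")) { in_block = 1; next }
    !in_block && $1 == "replace" { emit(2); next }   # single-line form
    in_block { emit(1) }                             # entry inside replace ( ... )
  ' "$1"
}
```

Because comments are removed before any field is inspected, a commented-out historical replace inside a `replace (...)` block produces no output row instead of a bogus LHS of `//`.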

Written for commit be4718d. Summary will update on new commits.

Adds a weekly + manual-dispatch audit that enumerates every Go module across the allora-network org, fetches each `go.mod` via the Contents API, extracts every `replace` directive (single-line + `replace (...)` block form) with the canonical awk extractor from shai-hulud-defense REFERENCE.md, and classifies the RHS against the same trusted-host allowlist that scripts/shai-hulud-ioc-sweep.sh uses.

Findings:
- SUSPICIOUS (RHS non-allowlisted host AND LHS module path != RHS) → rolling GitHub Issue (label `gomod-replace-audit`) + Slack page via SLACK_SECURITY_WEBHOOK.
- legitimate-version-pin (LHS == RHS module path, structurally cannot redirect) → no-op.
- Fetch failures → rolling-issue update only (operational, not IOC).

Distinct rolling-issue label from DEVOP-560's `shai-hulud-sweep` so the two pipelines don't collide. SHA-pinned `uses:`. Permissions are `contents: read` + `issues: write` only.

Initial point-in-time audit committed at docs/security/gomod-replace-audit-2026-05-25.md: 12 modules scanned, 6 replace directives, 0 SUSPICIOUS findings. All current replaces are same-path version pins from the Cosmos SDK simapp pattern (gin-gonic/gin, syndtr/goleveldb, cosmos/cosmos-sdk, cometbft/cometbft).

Refs: https://linear.app/alloralabs/issue/DEVOP-617
Co-authored-by: Cursor <cursoragent@cursor.com>

cubic-dev-ai

cubic analysis

2 issues found across 3 files

Linked issue analysis

Linked issue: DEVOP-617: Audit all org go.mod files for suspicious replace directives

| Status | Acceptance criteria | Notes |
| --- | --- | --- |
| ✅ | Enumerate every org Go repo via gh search (filename:go.mod --limit 200). | The workflow's 'Enumerate org Go modules' step runs gh search code --owner "$ORG" 'filename:go.mod' --limit 200 and writes the repo+path tuples to paths.tsv. |
| ✅ | For each repo, fetch go.mod and run the replace extractor (awk) to list replace directives (single-line and replace(...) block form). | The 'Fetch and audit each go.mod' step uses gh api repos/<repo>/contents/<path> to fetch and base64-decode go.mod and runs an awk extractor that handles single-line and block-form replace directives. |
| ✅ | Cross-check each replace RHS against the canonical trusted-host allowlist. | The workflow sets GO_TRUSTED_HOSTS_RE to the canonical allowlist regex and classifies RHS using grep -qE against that variable. |
| ⚠️ | For every non-allowlisted match, document repo + go.mod line, target path, reason for the replace (commit history), and disposition (legitimate / remove / investigate). | The workflow records repo, path, line number, LHS/RHS, classification and the original line (audit.tsv and summary.md) and the docs include row-by-row classifications and dispositions, but it does not capture commit history or provenance (reason for the replace) from git metadata. |
| ✅ | Produce a final report on the ticket; escalate SUSPICIOUS findings to incident response. | The PR includes an initial point-in-time report file and the workflow uploads summary/artifacts; the workflow appends/creates a rolling issue for non-clean runs and pages Slack for SUSPICIOUS findings (intended escalation). |

Architecture diagram

sequenceDiagram
    participant GC as GitHub Cron (Mon 05:17 UTC)
    participant WFA as Workflow: gomod-replace-audit
    participant GHAPI as gh CLI (GitHub API)
    participant GS as GitHub Search (code)
    participant GCONT as GitHub Contents API
    participant AWR as awk Extractor
    participant CLS as Classifier (allowlist)
    participant ISSUE as Rolling GitHub Issue (gomod-replace-audit)
    participant SLACK as Slack Security Webhook
    participant ART as Workflow Artifacts

    Note over WFA,ART: Weekly org-wide no-clone go.mod replace audit

    alt Scheduled trigger
        GC->>WFA: cron: '17 5 * * 1'
    else Manual trigger
        WFA->>WFA: workflow_dispatch
    end

    WFA->>GHAPI: Checkout .github repo (SHA-pinned actions/checkout)
    WFA->>GHAPI: Verify gh, jq, awk, base64

    Note over WFA,GS: Step: Enumerate org Go modules

    WFA->>GS: gh search code --owner allora-network 'filename:go.mod' --limit 200
    GS-->>WFA: JSON list of {repository.nameWithOwner, path}
    WFA->>WFA: Sort unique <repo,path> tuples → paths.tsv
    WFA->>WFA: Count discovered modules

    loop For each <repo,path> in paths.tsv
        Note over WFA,GCONT: Step: Fetch and audit each go.mod

        WFA->>GCONT: gh api repos/<repo>/contents/<path> --jq '.content'
        GCONT-->>WFA: base64-encoded go.mod content
        WFA->>WFA: base64 -d → gomod file

        alt Fetch failure (missing repo, moved default branch)
            WFA->>WFA: Record in fetch-failures.tsv
            WFA->>WFA: Skip to next module
        else Success
            WFA->>AWR: awk extract replace directives (single-line + block form)
            AWR-->>WFA: Parsed {repo, path, line_no, lhs, rhs, original_line}

            loop For each extracted replace directive
                WFA->>CLS: Classify RHS module path

                alt RHS starts with ./ or ../
                    WFA->>WFA: Class = legitimate-local-relative
                else RHS starts with /
                    WFA->>WFA: Class = investigate-absolute (IOC-grade)
                else RHS matches allowlist regex
                    WFA->>WFA: Class = legitimate-allowlisted-host
                else LHS module path == RHS module path
                    WFA->>WFA: Class = legitimate-version-pin
                else RHS not allowlisted AND LHS != RHS
                    WFA->>WFA: Class = SUSPICIOUS (IOC-grade)
                end

                WFA->>WFA: Append to audit.tsv with classification
            end
        end
    end

    Note over WFA,SLACK: Step: Generate summary & escalate if needed

    WFA->>WFA: Count total directives, SUSPICIOUS, fetch failures
    WFA->>WFA: Generate summary.md (markdown report)

    alt SUSPICIOUS or investigate-absolute count > 0
        WFA->>ISSUE: Update/create rolling issue (label: gomod-replace-audit)
        Note over WFA,SLACK: Append SUSPICIOUS findings to issue body
        WFA->>SLACK: Page Slack via SLACK_SECURITY_WEBHOOK
    else Fetch failures > 0 only
        WFA->>ISSUE: Update rolling issue with fetch failures only
        Note over WFA,SLACK: No Slack page - only issue update
    else Clean run (no findings)
        WFA->>WFA: No issue update, no Slack page
    end

    WFA->>ART: Upload audit.tsv and summary.md as workflow artifacts
    ART-->>WFA: Artifacts stored (downloadable from workflow run)
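
The escalation branch at the end of the diagram reduces to a three-way decision on the run's counters. A minimal sketch, assuming the counts are already available as integers; the function name `decide_escalation` and its output labels are hypothetical and not taken from the workflow:

```shell
# decide_escalation SUSPICIOUS_COUNT ABSOLUTE_COUNT FETCH_FAILURE_COUNT
# (hypothetical helper; labels are illustrative only)
decide_escalation() {
  suspicious="$1"
  absolute="$2"
  fetch_failures="$3"
  if [ $((suspicious + absolute)) -gt 0 ]; then
    echo "issue+slack"   # IOC-grade finding: rolling issue plus Slack page
  elif [ "$fetch_failures" -gt 0 ]; then
    echo "issue-only"    # operational problem: rolling issue, no page
  else
    echo "none"          # clean run: leave the rolling issue untouched
  fi
}
```

Keeping fetch failures out of the Slack path matches the PR's stated intent that they are operational rather than indicators of compromise.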


Addresses cubic-dev-ai review on PR #9:
- P1 (.github/workflows/gomod-replace-audit.yml:60): Add a 'Probe token scope' step that fails loudly when the workflow runs with only the default GITHUB_TOKEN against an org that has private repos. Without this, the audit could silently return a false-clean result because 'gh search code' and 'gh api orgs/<org>/repos?type=private' both omit private repos under that token. Operators who consciously accept a public-only audit can ack via the ACCEPT_PUBLIC_ONLY_AUDIT org variable.
- P2 (docs/security/gomod-replace-audit-2026-05-25.md:30): Correct the zero-replace count from 8 to 9 to match the list of nine modules in the section below (12 scanned − 3 with replace directives = 9). Add a companion 'Modules with at least one replace directive' row for cross-checking.

Refs: https://linear.app/alloralabs/issue/DEVOP-617
Co-authored-by: Cursor <cursoragent@cursor.com>

cubic-dev-ai

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name=".github/workflows/gomod-replace-audit.yml">

<violation number="1" location=".github/workflows/gomod-replace-audit.yml:123">
P1: This probe treats visibility of one private repo as full private coverage, so selected-repository PAT/App tokens can still leave some private repos unaudited while the workflow reports complete coverage.</violation>
</file>


ce-correctness-reviewer + cubic-dev-ai found issues that would let the audit silently false-clean or page Slack with malformed alerts. All addressed:
- AWK extractor: skip full-line `//` comments and strip trailing `// ...` from real directives. Without this, a commented historical replace inside a `replace (...)` block (a common dependency-migration pattern, e.g. Cosmos SDK simapp) parsed as `lhs="// ...", rhs="..."` → lhs != rhs → SUSPICIOUS → false-positive Slack page on every weekly run. ce-correctness-reviewer P1 conf 85.
- Contents API: `--jq '.content // ""'` so JSON null (directory, submodule, oversized symlink) normalizes to empty string and is caught by the existing `[ ! -s ]` fetch-failure guard. Previously the literal "null" base64-decoded to 3 garbage bytes and silently produced an empty audit row. Plus a sanity check that decoded files start with `module ` before classifying. ce-correctness-reviewer P2.
- Slack-page step: gated on `success()` (not `always()`) and on the audit step producing non-empty outputs. Prevents a malformed incident-grade Slack page with no count and "summary unavailable" body when the audit step fails before emitting outputs. ce-correctness-reviewer P2.
- Final-summary step: branch on `steps.audit.outcome` so a failed audit doesn't render as "Audit clean — 0 replace directives".
- gh search code limit: warn loudly when results hit the 200 cap so a future Shai-Hulud-vector go.mod at position 201+ doesn't go silently unscanned. ce-correctness-reviewer residual.
- Token scope: pre-flight probe step that detects when the workflow has only `GITHUB_TOKEN` and the org has private repos, fails the audit loudly (or accepts a documented `ACCEPT_PUBLIC_ONLY_AUDIT` override) instead of silently false-cleaning every private Go module. cubic-dev-ai P1 conf 9/10.
- Audit report counts: fix off-by-one (8 zero-replace modules → 9; the bullet list always had 9 entries). cubic-dev-ai P2 conf 10/10.

Cross-pipeline follow-ups (shared regex file, shared awk extractor, fixture-based parser tests, lookalike-bypass regex corpus) are tracked in docs/security/gomod-replace-audit-2026-05-25.md under "Cross-pipeline follow-ups"; they require touching the sibling DEVOP-560 PR's files and stay out of scope here.

Refs: https://linear.app/alloralabs/issue/DEVOP-617
Co-authored-by: Cursor <cursoragent@cursor.com>

cubic-dev-ai

2 issues found across 2 files (changes from recent commits).


Cubic P1 (PRRT_kwDOLZ5Xss6EqQn3): the previous probe checked only the first page of org private repos with per_page=1, so a selected-repository PAT or GitHub App token granting access to ONE private repo would falsely report 'full coverage' while leaving every other private repo unaudited.

Paginate the full private-repo list, compare visible vs total_private_repos, and fail (or warn under ACCEPT_PUBLIC_ONLY_AUDIT) when the counts don't match. Preserves the existing public-only / org-object-permission-denied escape hatches.

Co-authored-by: Cursor <cursoragent@cursor.com>
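
The count comparison described in this commit can be sketched as a pure decision helper. The function name `scope_verdict` and its output labels are hypothetical, and the sketch deliberately ignores the public-only and permission-denied escape hatches the commit mentions. The commented wiring shows one plausible way to obtain the two counts with `gh api`; it is an assumption about how the step might look, not a copy of the workflow.

```shell
# scope_verdict VISIBLE_PRIVATE TOTAL_PRIVATE [ACCEPT_PUBLIC_ONLY]
# (hypothetical helper; both counts must be integers)
scope_verdict() {
  visible="$1"
  total="$2"
  accept="${3:-false}"
  if [ "$visible" -eq "$total" ]; then
    echo "full-coverage"
  elif [ "$accept" = "true" ]; then
    echo "partial-coverage-acknowledged"   # warn, but let the audit continue
  else
    echo "partial-coverage-fail"           # fail loudly instead of false-clean
  fi
}

# Possible wiring (ASSUMPTION, needs an authenticated gh CLI and $ORG set):
#   visible=$(gh api --paginate "orgs/$ORG/repos?type=private&per_page=100" \
#               --jq '.[].full_name' | wc -l)
#   total=$(gh api "orgs/$ORG" --jq '.total_private_repos')
#   scope_verdict "$visible" "$total" "${ACCEPT_PUBLIC_ONLY_AUDIT:-false}"
```

Comparing the full paginated count against the org total is what closes the selected-repository gap: a token that sees one of five private repos yields `partial-coverage-fail` rather than a pass.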

…ty check

cubic-dev-ai re-review of ba9ff05 (P2 conf 9): the previous `head -1 "$gomod" | grep -q '^module '` rejected any go.mod whose first line is a comment or blank line, even though both are spec-valid. The check would silently mark those files as fetch failures and skip the audit entirely, which is the same false-clean failure mode the sanity check was added to prevent.

Switch to `grep -qm1 '^[[:space:]]*module '` so the check accepts any spec-valid go.mod and only flags responses that contain no `module` directive anywhere (the actual binary/garbage case we want to catch).

Refs: https://linear.app/alloralabs/issue/DEVOP-617
Co-authored-by: Cursor <cursoragent@cursor.com>
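
The difference between the two checks is easy to see side by side. In the sketch below the two `grep` invocations are the ones quoted in the commit message; the wrapper function names `old_check` and `looks_like_gomod` are hypothetical.

```shell
# Previous check: only passes when the very first line is the module directive.
old_check() {
  head -1 "$1" | grep -q '^module '
}

# Revised check: passes when a module directive appears on any line,
# so leading comments or blank lines no longer cause a false fetch failure.
looks_like_gomod() {
  grep -qm1 '^[[:space:]]*module ' "$1"
}
```

A file beginning with `// generated` followed by `module example.com/m` fails `old_check` but passes `looks_like_gomod`, while a response with no `module` line at all fails both.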

srt0422 · 2026-05-28T15:30:50Z

Needs-human follow-up #3 verification done.

Re-checked the close-out concern that `aeb5cc0` swept up the 'Cross-pipeline follow-ups (deferred)' section in `docs/security/gomod-replace-audit-2026-05-25.md`:

State now: the deletion was REVERSED by `be4718d` (2026-05-26 13:09 PT, after the close-out at 10:26 PT). The deferred-follow-ups section is back in the doc (lines ~102–129 on the branch HEAD). No action needed in the PR itself.

Linear tracking: the three items in the deferred section had no dedicated Linear coverage prior to this run; they were tracked only in the at-risk doc. Filed three Low-priority follow-up tickets so the tracking is durable beyond the doc:

| Item | Ticket |
| --- | --- |
| Extract `GO_TRUSTED_HOSTS_RE` to a single committed source (so workflow + sweep + REFERENCE.md don't drift) | DEVOP-632 |
| Fixture-based parser tests for go.mod `replace` extraction (8+ edge cases + classification matrix) | DEVOP-633 |
| Regex lookalike-bypass corpus test for the `(/\|$)` boundary anchor in `GO_TRUSTED_HOSTS_RE` | DEVOP-634 |

(Note: the deferred section also lists a fourth item, 'extract the awk replace-directive extractor to scripts/extract-go-replace.awk', which I folded into DEVOP-632's acceptance criteria since extracting the shared regex and extracting the shared extractor are the same atomic 'consolidate the duplicated parsing pipeline' chunk of work. Happy to split it out if anyone disagrees.)

All three tickets link back to this PR and to the specific anchor line in the deferred section.

srt0422 added the shai-hulud (Shai-Hulud supply-chain defense work) and needs-human-review labels May 25, 2026

cubic-dev-ai Bot reviewed May 25, 2026

Comment thread .github/workflows/gomod-replace-audit.yml

Comment thread docs/security/gomod-replace-audit-2026-05-25.md Outdated

cubic-dev-ai Bot reviewed May 26, 2026

Comment thread .github/workflows/gomod-replace-audit.yml

cubic-dev-ai Bot reviewed May 26, 2026

Comment thread .github/workflows/gomod-replace-audit.yml Outdated

Comment thread docs/security/gomod-replace-audit-2026-05-25.md

srt0422 and others added 2 commits May 26, 2026 10:15

srt0422 wants to merge 5 commits into main from scott/devop-617-gomod-replace-audit