DEVOP-560: add org-wide daily Shai-Hulud IOC sweep workflow by srt0422 · Pull Request #8 · allora-network/.github

srt0422 · 2026-05-25T07:23:55Z

Summary

Adds the first workflow under .github/workflows/ in this repo — a scheduled
daily sweep for Shai-Hulud indicators of compromise across every repo in the
allora-network org, plus a rolling GitHub Issue and Slack page on findings.

Closes DEVOP-560.

What ships

- .github/workflows/shai-hulud-sweep.yml, scheduled with cron '7 4 * * *' (04:07 UTC daily, off-peak + off-minute) plus workflow_dispatch for manual runs. Permissions limited to contents: read + issues: write. Pinned SHAs for actions/checkout@v4.2.2 and actions/upload-artifact@v4.4.3, matching the convention in allora-network/ci-workflows-private.
- scripts/shai-hulud-ioc-sweep.sh: canonical detection logic, vendored verbatim from allora-network/skills@71aeefb (skills/shai-hulud-defense/scripts/). See file header for refresh procedure / pinned commit.
- docs/plans/2026-05-25-devop-560-shai-hulud-sweep.md: design notes for why we vendor the script, how the rolling issue is maintained, when Slack fires, and the GH_ORG_READ_TOKEN follow-up.

Detection coverage (per the vendored script)

- Lockfile entries (npm/pip/Go) matching .github/security/ioc-packages.txt.
- Any .js/.cjs/.mjs file ≤ 2 MB whose SHA-256 matches .github/security/ioc-hashes.txt (filename-agnostic: renaming bundle.js doesn't bypass the check).
- Persistence: */.github/workflows/shai-hulud*.{yml,yaml} at repo root.
- npm install/postinstall/preinstall lifecycle scripts matching node …bundle.js, curl|sh, wget|sh, base64 -d|--decode|-D, eval $(…), or npx … bundle.
- Go replace directives: untrusted-host RHS, absolute-path RHS, top-level-path mismatch (Scenario C in-org redirect), and local replacements (./ / ../) flagged for human review.
- Go workflow env settings (GOSUMDB=off, GONOSUMCHECK, GOINSECURE, GOFLAGS=*-insecure), direct and indirect (vars/secrets/env/inputs).
- Public exfil repos matching ^[Ss]hai-[Hh]ulud under org:allora-network AND under each org member (rate-limited).
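The filename-agnostic hash check above can be sketched roughly as follows. This is an illustration, not the vendored script: `scan_js_hashes` and its arguments are hypothetical names, the hash list is assumed to hold one lowercase SHA-256 per line, and "2 MB" is read as 2 MiB (2097152 bytes).

```shell
# Sketch only: flag JS files whose SHA-256 appears in an IOC hash list.
# scan_js_hashes is a hypothetical helper, not a function of the vendored script.
scan_js_hashes() {
  repo_dir=$1
  hashes_file=$2
  # -size -2097153c keeps files of at most 2097152 bytes (assumed 2 MiB cap).
  find "$repo_dir" -type f \( -name '*.js' -o -name '*.cjs' -o -name '*.mjs' \) \
    -size -2097153c -print |
  while IFS= read -r f; do
    sum=$(sha256sum "$f" | awk '{print $1}')
    # -F fixed string, -x whole-line match, -q quiet.
    # Header or comment lines can never equal a bare 64-character hash.
    if grep -Fxq "$sum" "$hashes_file"; then
      printf 'ioc_bundle_hash\t%s\t%s\n' "$f" "$sum"
    fi
  done
}
```

Because the match is on content rather than name, a renamed copy of a listed file still produces an `ioc_bundle_hash` row.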

Outputs

| Sweep result | Script exit | Rolling issue | Slack |
| --- | --- | --- | --- |
| Clean (no findings) | `0` | no-op | no-op |
| Operational (clone_failed / check_skipped / go_local_replace) | `2` | comment appended (or issue opened with label `shai-hulud-sweep`) | no-op |
| IOC-grade | `1` | comment appended (or issue opened) | paged via `${{ secrets.SLACK_SECURITY_WEBHOOK }}` |
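As a rough illustration of the table, the exit code could be mapped to downstream actions with a `case` statement. `dispatch_for_rc` is a hypothetical helper that only prints the decision; the handling of unexpected exit codes is an assumption, not something the PR description specifies.

```shell
# Sketch only: translate the sweep script's exit code into the actions from the table.
dispatch_for_rc() {
  case "$1" in
    0) echo "issue=no-op slack=no-op" ;;    # clean
    1) echo "issue=update slack=page" ;;    # IOC-grade
    2) echo "issue=update slack=no-op" ;;   # operational only
    # Assumption: surface any other code in the issue rather than dropping it.
    *) echo "issue=update slack=no-op unexpected_rc=$1" ;;
  esac
}
```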

Forensic evidence (clones of repos that produced IOC findings) is uploaded as
a workflow artifact for 30 days so humans can inspect the matched file without
re-cloning point-in-time evidence.

The workflow never auto-closes the rolling issue; humans drive close/reopen so
triage state survives across daily runs.

Secrets used

- SLACK_SECURITY_WEBHOOK: org secret; payload delivered only on IOC-grade findings. No-ops gracefully if unset (warning, not failure).
- GH_ORG_READ_TOKEN: optional org secret. When present, preferred over the default GITHUB_TOKEN for org-wide enumeration so private repos and the org-members exfil search are covered. When absent, member enumeration emits check_skipped operational findings: visible partial coverage, never a silent false-clean.
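A minimal sketch of the token preference, assuming both secrets are exposed to the step as environment variables of the same names. `pick_token_source` is a hypothetical helper; it prints which variable would be used rather than the secret value.

```shell
# Sketch only: prefer GH_ORG_READ_TOKEN, otherwise fall back to GITHUB_TOKEN.
pick_token_source() {
  if [ -n "${GH_ORG_READ_TOKEN:-}" ]; then
    echo "GH_ORG_READ_TOKEN"
  else
    # In the fallback path the sweep is expected to record check_skipped
    # findings for member enumeration instead of reporting a clean result.
    echo "GITHUB_TOKEN"
  fi
}
```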

Verification

- actionlint .github/workflows/shai-hulud-sweep.yml: clean (no findings).
- python3 -c 'import yaml; yaml.safe_load(...)': parses.
- Manual workflow_dispatch recommended after merge to verify the gh issue and Slack paths end-to-end against the live org.

Followups (intentionally out-of-scope for this PR)

- Provision GH_ORG_READ_TOKEN (fine-grained PAT or GitHub App token with read:org + repo:read) once org-admin signs off; the workflow already prefers it when present.
- Quarterly review of the trusted Go module-path allowlist (GO_TRUSTED_HOSTS_RE in the script) to keep the go_suspicious_replace false-positive rate low.

Made with Cursor

Summary by cubic

Adds a daily org‑wide Shai‑Hulud IOC sweep that scans all allora-network repos, updates a rolling issue, and pages Slack on incident‑grade findings with alert‑dedup and a weekly re‑page. Meets DEVOP-560 requirements: scheduled workflow at .github/workflows/shai-hulud-sweep.yml, org repo iteration, IOC list checks, member exfil search, rolling issue updates, and Slack notifications with minimal perms.

New Features
- Added .github/workflows/shai-hulud-sweep.yml (cron 7 4 * * * + workflow_dispatch; minimal perms; serialized concurrency). Pins actions/checkout@v4.2.2 and actions/upload-artifact@v4.4.3.
- Vendored scripts/shai-hulud-ioc-sweep.sh with SHA‑256 sidecar verification; reads .github/security/ioc-packages.txt and .github/security/ioc-hashes.txt (# schema:v1). Detection covers lockfiles (incl. structured package-lock.json), sub‑2MB JS SHA‑256 hashes, exact Shai‑Hulud persistence workflow filenames, suspicious npm lifecycle scripts, Go replace/path‑mismatch/unsafe‑env, and org/member public exfil search.
- Outputs: maintains a rolling issue labeled shai-hulud-sweep; Slack via SLACK_SECURITY_WEBHOOK on IOC with dedup gating; prefers GH_ORG_READ_TOKEN, else falls back to GITHUB_TOKEN and emits check_skipped; uploads only findings.json, summary.md, repos.txt for 30 days.
- Added .github/CODEOWNERS to require @allora-network/security (and @allora-network/devops for the workflow). Added plan doc with action SHA‑pin rotation and follow‑ups.
Bug Fixes
- Slack: IOC alert dedup by IOC hash‑stamp (first‑seen/changed/≥7‑day re‑page); filter dedup markers to github-actions[bot]; write the paged-at marker only after a successful Slack send; fail‑open if the stamp can’t be computed; 3‑attempt retry with backoff and Retry‑After honoring; fixed HTTP code capture.
- Rolling issue: new shared “Find rolling issue” step (oldest‑open via --search sort:created-asc) reused by dedup, updates, and paged‑marker steps; IOC comments include visible page‑decision plus hidden stamp markers; Slack still fires on IOC even if the issue update fails.
- Safety: sanitize untrusted strings in Slack and issue bodies; verify the vendored script via a locked‑path SHA‑256 sidecar before execution.
- Artifacts/robustness: restrict uploads to structured outputs; suffix artifact name with ${{ github.run_attempt }}; placeholder summary on pre‑aggregation failure; final run summary surfaces the Slack‑dedup tri‑state explicitly.
- Detection tuning: narrow persistence detection to exact IOC filenames (avoids self‑alerts); add # schema:v1 headers and assertions; extend GO_TRUSTED_HOSTS_RE to include cometbft; refresh the .sha256 sidecar.
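The sidecar integrity gate mentioned under Safety could look roughly like this. It is a sketch under assumptions: `verify_sidecar` is a hypothetical name, and the sidecar is assumed to be in `sha256sum` output format (hash first). Only the hash field is read, so the path written inside the sidecar is never trusted.

```shell
# Sketch only: succeed when the script's SHA-256 equals the hash in its sidecar.
verify_sidecar() {
  script=$1
  sidecar=$2
  # Take the first field of the first line; ignore any path recorded after it.
  expected=$(awk 'NR==1 {print $1}' "$sidecar")
  actual=$(sha256sum "$script" | awk '{print $1}')
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A caller would run `verify_sidecar scripts/shai-hulud-ioc-sweep.sh scripts/shai-hulud-ioc-sweep.sh.sha256 || exit 1` before executing the detector, so a script edit without a refreshed sidecar stops the step.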

Written for commit c10d0dc. Summary will update on new commits.

Adds the first workflow under .github/workflows/ for the allora-network/.github repo: a scheduled daily sweep (04:07 UTC) plus workflow_dispatch that scans every repo in the org for Shai-Hulud indicators of compromise, maintains a rolling GitHub issue labelled `shai-hulud-sweep`, and pages Slack via the SLACK_SECURITY_WEBHOOK secret on incident-grade findings.

Detection logic lives in scripts/shai-hulud-ioc-sweep.sh, vendored verbatim from allora-network/skills@71aeefb (skills/shai-hulud-defense). The script is vendored rather than cloned at workflow time because that repo is private and the workflow's default GITHUB_TOKEN cannot read it; vendoring also keeps the daily sweep working through upstream rename/outage. See the script header for the refresh procedure.

Key design choices documented in docs/plans/2026-05-25-devop-560-shai-hulud-sweep.md:

- IOC inputs read from .github/security/ioc-packages.txt + ioc-hashes.txt (DEVOP-561, merged in PR #2). Script validates the `# schema:v1` header before running so a silent seed-list format change fails closed.
- Rolling issue: workflow finds an existing open issue with the label and appends a comment, else opens a new one. Humans drive close/reopen so triage state survives across daily runs.
- Slack page only fires on IOC-grade findings (script exit 1). Operational findings (exit 2: clone_failed / check_skipped / go_local_replace) update the issue but do not page after-hours.
- Permissions: `contents: read` + `issues: write` only.
- Pinned SHAs for actions/checkout (v4.2.2) and actions/upload-artifact (v4.4.3) match the convention in allora-network/ci-workflows-private.
- Prefers a `GH_ORG_READ_TOKEN` secret if present (private-repo + member enumeration); falls back to GITHUB_TOKEN. In the fallback path, member enumeration emits `check_skipped` operational findings so the partial coverage is visible in the rolling issue.

Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>

cubic-dev-ai

cubic analysis

1 issue found

Linked issue analysis

Linked issue: DEVOP-560: Create org-wide daily IOC sweep workflow in .github repo

| Status | Acceptance criteria | Notes |
| --- | --- | --- |
| ✅ | .github/workflows/shai-hulud-sweep.yml exists in this repo | PR adds the workflow file at .github/workflows/shai-hulud-sweep.yml. |
| ✅ | Cron: daily at an off-peak local time, off-minute (e.g. '7 4 * * *') | Workflow schedule is set to '7 4 * * *' and includes workflow_dispatch for manual runs. |
| ✅ | Iterate all org repos via gh api orgs/allora-network/repos --paginate | The vendored script enumerates repos using gh api --paginate /orgs/$ORG/repos and the workflow runs that script. |
| ✅ | For each repo: shallow clone, run sweep checks, report findings | Script uses git clone --depth 1 to shallow-clone each repo, runs the collection/checks, and aggregates findings; workflow uploads artifacts and updates the rolling issue. |
| ✅ | Compare against IOC lists at .github/security/ioc-packages.txt and .github/security/ioc-hashes.txt | Workflow passes those two paths to the script; the script validates the schema header and uses both lists in its matching logic. |
| ⚠️ | Search the GitHub API for public repos under org members named ^[Ss]hai-[Hh]ulud | Script implements org-scoped and per-member searches and records findings, but per-member enumeration can be limited without a provisioned GH_ORG_READ_TOKEN and will emit check_skipped; member-side search is rate-limited with sleeps. Implementation exists but effective coverage depends on token provisioning and rate limits. |
| ✅ | Maintain a single rolling GitHub Issue in .github repo; append new findings | Workflow finds an open issue with label shai-hulud-sweep (oldest first), appends a comment if present, or creates a labeled rolling issue otherwise. |
| ⚠️ | If any new finding fires, post to a Slack incoming webhook (SLACK_SECURITY_WEBHOOK org secret) | Workflow posts to the Slack webhook, but only for IOC-grade runs (script exit code 1). Operational findings (exit 2) update the rolling issue but do not page Slack. The step also no-ops if SLACK_SECURITY_WEBHOOK is unset. |
| ✅ | Sweep checks: Lockfile entries matching ioc-packages.txt (name@version) | Script builds per-ecosystem needle lists and performs structured and substring matching against npm/pip/Go lockfiles. |
| ✅ | Sweep checks: bundle.js files anywhere with SHA-256 matching ioc-hashes.txt | Script collects .js/.cjs/.mjs files ≤ 2 MB and compares SHA-256 against the provided hashes list, emitting ioc_bundle_hash findings. |
| ✅ | Sweep checks: .github/workflows/shai-hulud*.yaml / shai-hulud-workflow.yml persistence detection | Script scopes workflow-file scanning to the repo-root .github/workflows directory and flags shai-hulud*.{yml,yaml} as persistence_workflow findings. |
| ✅ | Sweep checks: Postinstall patterns: node bundle.js, curl\|sh, wget\|sh, base64-decode chains in package.json scripts | Script inspects package.json scripts for install/postinstall/preinstall and matches a broad regex covering node …bundle.js, curl\|sh, wget\|sh, base64 -d/--decode/-D, eval $(…), npx … bundle patterns. |
| ❌ | Sweep checks: The webhook.site exfil URL substring | The acceptance criteria list webhook.site substring matching, but I cannot find a specific check for the webhook.site URL substring in the vendored script or workflow; other exfil detection (public-exfil repo name, suspicious curl targets) exists, but no explicit webhook.site pattern match is present. |
| ✅ | Workflow uses minimal permissions: contents: read, issues: write | Workflow permissions block sets contents: read and issues: write and no broader permissions are granted. |
| ❌ | PR merged (closure of DEVOP-560) | Acceptance requires merging the PR; the PR is open (this is the review), so the 'merged' criterion is not yet satisfied by the current state. |

Architecture diagram

sequenceDiagram
    participant Sched as Cron Schedule (04:07 UTC)
    participant Action as GitHub Actions Workflow
    participant Script as shai-hulud-ioc-sweep.sh
    participant GHAPI as GitHub API (REST/GraphQL)
    participant Repos as Org Repos (clone targets)
    participant Issue as Rolling Issue (.github repo)
    participant Slack as Slack Security Webhook
    participant Artifact as Uploaded Artifact

    Note over Sched,Artifact: NEW: Daily org-wide IOC sweep

    alt Scheduled trigger (cron '7 4 * * *')
        Sched->>Action: Trigger workflow
    else Manual trigger (workflow_dispatch)
        Action->>Action: Manual run started
    end

    Action->>Action: concurrency serialization (no cancel-in-progress)
    Action->>Action: Set GH_TOKEN (GH_ORG_READ_TOKEN || GITHUB_TOKEN)
    Action->>Action: Checkout .github repo (IOC lists + script)
    Action->>Script: Run with ORG, IOC package/hash files

    Note over Script: Detection logic (vendored from allora-network/skills)

    Script->>GHAPI: List all repos in org (public + private if token allows)
    GHAPI-->>Script: Repo list

    Script->>GHAPI: List org members (for exfil repo search)
    alt GH_TOKEN has read:org
        GHAPI-->>Script: Member list
    else GH_TOKEN lacks read:org
        Script->>Script: Emit check_skipped operational finding
    end

    loop Per repo in org
        Script->>Repos: git clone (via gh auth credential helper)
        alt Clone succeeds
            Script->>Script: Scan for IOC patterns
            alt IOC package match found
                Script->>Script: finding() - ioc_package_match
            end
            alt IOC hash match found (.js/.cjs/.mjs <= 2MB)
                Script->>Script: finding() - ioc_bundle_hash
            end
            alt Suspicious workflow files found
                Script->>Script: finding() - persistence_workflow
            end
            alt Suspicious npm lifecycle scripts found
                Script->>Script: finding() - suspicious_lifecycle_script
            end
            alt Go replace directives anomalies found
                Script->>Script: finding() - go_suspicious_replace
            end
            alt Go unsafe CI env vars found
                Script->>Script: finding() - go_unsafe_env
            end
            opt Finding detected (non-operational)
                Script->>Script: Preserve clone as forensic evidence
                Script->>Script: Append to dirty-repos list
            end
        else Clone fails
            Script->>Script: finding() - clone_failed (operational)
        end
    end

    loop Check public exfil repos (org scope)
        Script->>GHAPI: Search repos matching ^[Ss]hai-[Hh]ulud under org
        GHAPI-->>Script: Exfil repo list
        alt Exfil repos found
            Script->>Script: finding() - public_exfil_repo
        end
    end

    loop Check public exfil repos (member scope)
        alt GH_TOKEN supports member search
            Script->>GHAPI: Search repos per member
            GHAPI-->>Script: Exfil repo list per member
            alt Exfil repos found
                Script->>Script: finding() - public_exfil_repo_member
            end
        else Token lacks permission
            Script->>Script: finding() - check_skipped (operational)
        end
    end

    Script->>Script: Aggregate findings to findings.ndjson + summary.md
    alt Exit code 0 (clean)
        Script-->>Action: rc=0
    else Exit code 1 (IOC findings)
        Script-->>Action: rc=1
    else Exit code 2 (operational only)
        Script-->>Action: rc=2
    end

    Action->>Artifact: Upload sweep output + forensic evidence clones (30-day retention)

    alt rc == 1 or rc == 2 (findings exist)
        Action->>Issue: Find open issue with label "shai-hulud-sweep"
        alt Existing issue found
            Action->>Issue: Append comment with run summary
        else No existing issue
            Action->>Issue: Create new issue with label + summary
        end
    end

    alt rc == 1 (IOC-grade findings only)
        alt SLACK_SECURITY_WEBHOOK is set
            Action->>Slack: POST run summary (capped at ~2.8 KB)
            Slack-->>Action: 200 OK
        else Webhook not set
            Action->>Action: Log warning, skip (no failure)
        end
    end


Four mechanical fixes flagged as safe_auto by the ce-code-review headless pass on PR #8:

- Add `--connect-timeout 5 --max-time 15` to the Slack webhook curl so a hung incoming-webhook endpoint cannot stall the workflow up to the 60-minute job timeout.
- Gate the Page-Slack step with `always() && rc == '1'` so an IOC-grade finding still pages Slack when the preceding Update-rolling-issue step failed (the rolling issue is a redundant channel; Slack is the primary page).
- Suffix the upload-artifact name with `${{ github.run_attempt }}` so a re-run does not 409 on actions/upload-artifact@v4's unique-name rule.
- Hoist the `output_dir` GITHUB_OUTPUT write to immediately after `mkdir -p "$OUTPUT_DIR"` so the upload-artifact step's `always() && output_dir != ''` gate survives a mid-sweep crash (script abort would otherwise leave output_dir unset and skip evidence upload).

Re-validated with `actionlint`. No behavior change beyond the four hardening fixes; the rc-based control flow and rolling-issue / Slack semantics are unchanged.

Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>

… and silent failures (DEVOP-560)

Apply six ce-code-review findings to the daily Shai-Hulud IOC sweep:

- Finding B (P1): rolling-issue lookup now uses --search with sort:created-asc, so a long-running incident issue stays canonical even when a duplicate is filed later (`gh issue list` defaults to newest-first and exposes no --sort flag).
- Finding C (P1): wrap the Slack webhook POST in a 3-attempt retry loop with 5s/15s/45s backoff. Retries on 408/429/5xx + curl-level failures; honors Retry-After on 429; bails on terminal 4xx.
- Finding E (P1): strip backtick/<>|*_ from the attacker-controllable IOC `detail` field before wrapping it in a Slack code fence or inlining it in the GitHub issue body. The uploaded findings.json remains the canonical un-sanitized source for forensic review.
- Finding D (P1): restrict the uploaded artifact path to the structured outputs (findings.json, summary.md, repos.txt) instead of the entire output_dir. Anyone with actions:read on this repo can download artifacts, and the previous wildcard included raw clones / preserved evidence trees of private org repos.
- Finding H (P2): when the script exits rc != 0 without producing summary.md (precondition failure / pre-aggregation crash), emit a minimal placeholder so the rc != 0 -> rolling-issue contract holds and the failure surfaces in triage instead of being silently dropped.
- Finding G (P2): commit scripts/shai-hulud-ioc-sweep.sh.sha256 and verify it as the first action of the sweep step. The vendored script's canonical source lives upstream in allora-network/skills; the sidecar is the in-repo integrity gate. A PR that modifies the script body without refreshing the sidecar fails this step loudly instead of executing a tampered detector.

Workflow validated with `actionlint` and `python3 -c "import yaml; yaml.safe_load(...)"` after each edit. 346 lines (under the 350 cap).

Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>
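Finding E's sanitization can be approximated with `tr -d`. This is a sketch, not the workflow's code: `sanitize_detail` is a hypothetical name, and it removes exactly the characters the commit message lists (backtick, `<`, `>`, `|`, `*`, `_`).

```shell
# Sketch only: drop characters that could break out of a Slack code fence
# or inject Markdown into an issue body.
sanitize_detail() {
  printf '%s' "$1" | tr -d '`<>|*_'
}
```

The raw value stays available in findings.json, so the sanitized copy is only for display surfaces.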

…EVOP-560)

A single PR that modifies the daily Shai-Hulud sweep workflow, the vendored detector script, the SHA-256 integrity sidecar, or the IOC seed lists can silently disable detection if no human review is enforced. This adds an in-repo CODEOWNERS rule requiring `@allora-network/security` approval on those paths (with `@allora-network/devops` co-owning the workflow file for routine operational tweaks). CODEOWNERS itself is self-owned so a single PR cannot rewrite the rules + disable detection in lockstep.

Team slugs were verified via `gh api orgs/allora-network/teams/security` and `.../devops` on 2026-05-25. The complementary "Require review from Code Owners" branch-protection rule is an org-admin task and is documented as a follow-up in the plan doc; this commit only handles the in-repo half.

Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>

Two ce-code-review findings against the plan doc:

- Finding I (P2): document the rotation procedure for the third-party action SHA pins (actions/checkout, actions/upload-artifact). Names owner (@allora-network/devops, with security co-review via the workflow's CODEOWNERS rule), cadence (quarterly + on CVE), canonical source for the latest release SHA per action, and a 4-step rotation procedure. Notes `.github/dependabot.yml` as the automation follow-up.
- Finding F (P1, deferred): document missed-daily-run / cron-disabled observability as out-of-scope for this PR. The fix is materially additive (separate watchdog workflow or external healthcheck) and doesn't belong inline with the initial sweep ship.

Also adds the branch-protection "Require review from Code Owners" follow-up surfaced by Finding A: the in-repo CODEOWNERS rule was landed in the prior commit but the actual blocking gate is org-admin territory.

Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>

Co-authored-by: Cursor <cursoragent@cursor.com>

- P0 (script): Narrow persistence_workflow glob to exact known IOC filenames (shai-hulud.yml / shai-hulud.yaml / shai-hulud-workflow.yml / shai-hulud-workflow.yaml) so the legitimate defense workflow .github/workflows/shai-hulud-sweep.yml no longer self-detects as an IOC on every daily sweep, a guaranteed false page → alert fatigue.
- P1 (seed files): Add '# schema:v1' header to ioc-packages.txt and ioc-hashes.txt. Without the packages header the new schema-version assertion in the detector exits 2 at startup every run, leaving the sweep structurally inert.
- P2 (script): Add parallel '# schema:v1' assertion against HASHES_FILE; it mirrors the packages-file gate so a future reformat of the hashes seed list fails loud instead of silently zero-matching.
- P2 (script): Add cometbft to default GO_TRUSTED_HOSTS_RE so Cosmos/CometBFT same-path version pins (replace github.com/cometbft/cometbft => github.com/cometbft/cometbft <version>) no longer trip go_suspicious_replace once the sweep is unblocked from the schema:v1 gate above.
- Regenerate scripts/shai-hulud-ioc-sweep.sh.sha256 in lockstep with the detector edits so the workflow's integrity-gate passes.

Co-authored-by: Cursor <cursoragent@cursor.com>
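The schema-version gate described in this commit might be sketched as below. `assert_schema_v1` is a hypothetical name, and requiring the header to be exactly the first line is an assumption; the commit only says the header is asserted and that a failure exits 2.

```shell
# Sketch only: fail with status 2 unless the seed list starts with "# schema:v1".
assert_schema_v1() {
  header=$(head -n 1 "$1" 2>/dev/null)
  if [ "$header" != "# schema:v1" ]; then
    echo "schema header missing or unexpected in $1" >&2
    return 2
  fi
}
```

Running the same check against both the packages file and the hashes file keeps a reformatted seed list from silently matching nothing.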

…-560)

Without this gate the bare `if: steps.sweep.outputs.rc == '1'` Slack step pages on every IOC-grade run, so a standing unresolved IOC pages the channel daily and conditions responders to mute it: classic alert fatigue. Raised by cubic at PRRT_kwDOLZ5Xss6Ee5gN and independently by four ce-code-review reviewers (P1, anchor 100).

Implementation:

- New `ioc-dedup` step (rc=1 only) computes a stable IOC stamp as the sha256 of the sorted `{repo,rule,path,detail}` TSV of IOC-grade rows in findings.json (`ts` deliberately excluded so an identical IOC set produces an identical stamp across daily runs).
- Looks up the rolling issue's full comment history (cross-page sort) for the most recent IOC stamp and paged-at markers (hidden HTML markers).
- Decides `should_page`:
  * first IOC-grade run after clean (no prior stamp) → page
  * IOC set differs from previous stamp → page
  * same IOC set but >= 7d since last Slack page → page
  * otherwise → skip

  Fail-open when findings.json is missing/empty on rc=1: page so an unknown-state run surfaces visibly rather than dedup-silencing.
- Rolling-issue update step now embeds the stamp marker on every rc=1 comment and the paged-at marker only when Slack actually fires, so a deduped comment carries forward the older real paged-at timestamp and the weekly re-page window stays honest.
- Slack step gated on `should_page == 'true'`. A new `Slack page suppressed by IOC dedup` step emits a workflow notice for visibility, and the final-run-summary step surfaces the dedup decision too.
- Visible `- **Slack page:** yes|suppressed (reason: ...)` footer in the rolling-issue comment body makes the decision obvious to humans scanning the issue, alongside the hidden HTML markers used by the next run's dedup lookup.

Plan doc: the Slack-alert-path decision now spells out the dedup + weekly-repage policy and warns explicitly against regressing to a bare `rc == '1'` gate, so the next reviewer doesn't reintroduce the alert-fatigue regression. IOC_RULES_RE drift between workflow and script is called out as a coupling that must stay in sync.

Refs: DEVOP-560, PRRT_kwDOLZ5Xss6Ee5gN (cubic), ce-code-review anchor 100
Co-authored-by: Cursor <cursoragent@cursor.com>
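The dedup decision described above can be sketched as two small helpers. Both names are hypothetical. `ioc_stamp` assumes the IOC-grade rows have already been extracted from findings.json as TSV on stdin, and paging when no paged-at timestamp is known is an assumption in the fail-open spirit of the commit.

```shell
# Sketch only: order-independent stamp over finding rows (one TSV row per line).
ioc_stamp() {
  LC_ALL=C sort | sha256sum | awk '{print $1}'
}

# should_page CURRENT_STAMP PREV_STAMP LAST_PAGED_EPOCH NOW_EPOCH
# Prints "page <reason>" or "skip <reason>".
should_page() {
  current=$1
  prev=$2
  last_paged=$3
  now=$4
  week=604800  # 7 days in seconds
  if [ -z "$current" ]; then echo "page fail_open"; return; fi
  if [ -z "$prev" ]; then echo "page first_seen"; return; fi
  if [ "$current" != "$prev" ]; then echo "page changed"; return; fi
  # Assumption: an unknown last-page time is treated as due for a page.
  if [ -z "$last_paged" ] || [ $((now - last_paged)) -ge "$week" ]; then
    echo "page weekly_repage"
    return
  fi
  echo "skip duplicate"
}
```

Sorting before hashing is what makes two runs with the same findings in a different order produce the same stamp.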

- (P2) Append `|| true` to the ioc-dedup current_stamp `jq | sha256sum | awk` pipeline so a malformed findings.json (or mid-run mutation) routes through the documented fail-open guard at lines 230-244 instead of aborting the step under `set -euo pipefail` and silently fail-CLOSING the Slack page. Mirrors the `|| true` already present on the four sibling pipelines in the same step.
- (P2) Fix Slack curl http_code capture: replace `... || echo 000)` with `... || true)` followed by `: "${http_code:=000}"`. The prior form appended an extra '000' to curl's own '%{http_code}' output, producing the literal '000000' which fell through the `000|408|429|5*` transient-classification case to terminal=0 and disabled the curl-level retry path the loop exists for.
- (P3) Replace the two-branch `if [ "${SHOULD_PAGE:-true}" = "true" ]` in the Final run summary with an explicit three-way `case` (true / false / *) so the unknown-state branch emits an `::error::` rather than defaulting to a false "Slack paged" claim when the ioc-dedup step crashed before writing $GITHUB_OUTPUT. Resolves the three-way contradiction between the Slack gate (strict ==true), the suppression-notice gate (!=true), and this summary.

Co-authored-by: Cursor <cursoragent@cursor.com>
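To illustrate the http_code fix, here is a sketch of the classification and the defaulting step in isolation. `classify_http` and `normalize_code` are hypothetical helpers, and the `ok` branch for 2xx is an assumption; only the `000|408|429|5*` pattern and the `: "${http_code:=000}"` default come from the commit message.

```shell
# Sketch only: classify a status code captured from curl's %{http_code}.
classify_http() {
  case "$1" in
    2*) echo ok ;;
    000|408|429|5*) echo transient ;;  # retryable
    *) echo terminal ;;                # includes the malformed "000000"
  esac
}

# Default to 000 only when curl printed nothing, instead of appending
# a second "000" with "|| echo 000".
normalize_code() {
  http_code=$1
  : "${http_code:=000}"
  echo "$http_code"
}
```

With the old capture a connection failure produced `000000`, which matches none of the transient patterns, so the retry never ran.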

- (P2 #5) Extract a new `Find rolling issue` step (gated on `rc=='1' || rc=='2'`) that resolves the rolling-issue number ONCE per run via the canonical `gh issue list ... sort:created-asc` query and exposes it as `steps.find-rolling-issue.outputs.issue_num`. Replace the duplicated inline `gh issue list` calls in the ioc-dedup and rolling-issue-update steps with the shared output. Removes the drift-hazard `# same query as the update step below — keep in sync` coupling and closes the TOCTOU window where a human could close the rolling issue between the two independent lookups.
- (P1 #1) Filter the ioc-dedup comment scan to `github-actions[bot]` authorship. Previously the `gh api ... --jq '.[] | {body, created_at}'` projection accepted markers from ANY commenter, so anyone with `issues: write` (or anyone able to social-engineer a maintainer into pasting attacker-supplied marker text) could forge IOC stamp or paged-at markers into the rolling issue and silently suppress real Slack pages by poisoning the dedup chain. Only this workflow (running as GITHUB_TOKEN) emits canonical markers, and its comments are attributed to `github-actions[bot]`; restrict the source set accordingly. Defense-in-depth follow-up (binding markers to the emitting run_id and verifying via gh api) deferred.
- (P1 #2) Move paged-at marker emission to a dedicated post-Slack step (`Persist Slack-paged marker`) gated on `success() && rc=='1' && should_page=='true'` so a failed Slack delivery never writes a paged-at timestamp. The rolling-issue update step keeps writing the IOC stamp marker (which represents the dedup decision input, NOT the Slack-delivery outcome; that's correct gating). The dedup reader already scans the most-recent paged-at marker across ALL bot-authored comments, so splitting the markers across two comments composes correctly with no parser change.

  Previously the paged-at marker was committed BEFORE the Slack page ran, so a failed Slack send would still record a paged-at timestamp and silently corrupt the dedup chain for up to 7 days (next IOC-grade run would believe Slack had paged, suppress its own page, and the standing IOC would stop alerting until the weekly re-page window expired). The new step has a `gh issue list` fallback for the rare case where the update step created a fresh rolling issue this run (so find-rolling-issue's output was empty); fail-OPEN warning if no issue is resolvable at all so a missing paged-at marker just forces the next run to page conservatively.

Verification: actionlint clean; YAML parses (11 steps in canonical order: checkout → verify-tools → sweep → upload → find-rolling-issue → ioc-dedup → update-rolling-issue → slack-page → persist-paged-at → slack-suppressed-notice → final-summary).

Refs: DEVOP-560, ce-code-review run 20260526-101810-4793bf13 findings #1 (anchor 100, security+adversarial), #2 (anchor 100, correctness+adversarial+reliability), #5 (anchor 75, maintainability).
Co-authored-by: Cursor <cursoragent@cursor.com>

srt0422 added the shai-hulud (Shai-Hulud supply-chain defense work) and needs-human-review labels May 25, 2026

cubic-dev-ai Bot reviewed May 25, 2026


Comment thread .github/workflows/shai-hulud-sweep.yml Outdated

srt0422 and others added 4 commits May 25, 2026 00:50

srt0422 mentioned this pull request May 25, 2026

DEVOP-617: org-wide go.mod replace-directive audit #9

Open

4 tasks

srt0422 and others added 5 commits May 26, 2026 10:07

Address ce-code-review safe_auto findings (#8)

979fa7b

Co-authored-by: Cursor <cursoragent@cursor.com>

srt0422 wants to merge 10 commits into main from scott/devop-560-shai-hulud-sweep
