DEVOP-560: add org-wide daily Shai-Hulud IOC sweep workflow #8
Open
srt0422 wants to merge 10 commits into
Adds the first workflow under .github/workflows/ for the allora-network/.github repo: a scheduled daily sweep (04:07 UTC) plus workflow_dispatch that scans every repo in the org for Shai-Hulud indicators of compromise, maintains a rolling GitHub issue labelled `shai-hulud-sweep`, and pages Slack via the SLACK_SECURITY_WEBHOOK secret on incident-grade findings.
Detection logic lives in scripts/shai-hulud-ioc-sweep.sh, vendored verbatim from allora-network/skills@71aeefb (skills/shai-hulud-defense). The script is vendored rather than cloned at workflow time because that repo is private and the workflow's default GITHUB_TOKEN cannot read it; vendoring also keeps the daily sweep working through upstream rename/outage. See the script header for the refresh procedure.
Key design choices documented in docs/plans/2026-05-25-devop-560-shai-hulud-sweep.md:
- IOC inputs read from .github/security/ioc-packages.txt + ioc-hashes.txt (DEVOP-561, merged in PR #2). Script validates the `# schema:v1` header before running so a silent seed-list format change fails closed.
- Rolling issue: workflow finds an existing open issue with the label and appends a comment, else opens a new one. Humans drive close/reopen so triage state survives across daily runs.
- Slack page only fires on IOC-grade findings (script exit 1). Operational findings (exit 2: clone_failed / check_skipped / go_local_replace) update the issue but do not page after-hours.
- Permissions: `contents: read` + `issues: write` only.
- Pinned SHAs for actions/checkout (v4.2.2) and actions/upload-artifact (v4.4.3) match the convention in allora-network/ci-workflows-private.
- Prefers a `GH_ORG_READ_TOKEN` secret if present (private-repo + member enumeration); falls back to GITHUB_TOKEN. In the fallback path, member enumeration emits `check_skipped` operational findings so the partial coverage is visible in the rolling issue.
Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>
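The fail-closed schema gate mentioned in the design choices above could look roughly like the following. This is a minimal sketch, not the vendored script's code: the function name `assert_schema_v1` is hypothetical, the header is assumed to be the first line of the seed file, and the return value 2 simply mirrors the operational exit code described in this PR.

```shell
# Hypothetical helper: succeed only if the first line of the seed list is
# exactly "# schema:v1". Header position is an assumption, not confirmed here.
assert_schema_v1() {
  file=$1
  if [ ! -r "$file" ]; then
    echo "schema check: cannot read $file" >&2
    return 2
  fi
  # Read only the first line; IFS= and -r keep it byte-for-byte.
  # "|| true" tolerates a file whose only line lacks a trailing newline.
  IFS= read -r first_line < "$file" || true
  if [ "$first_line" != "# schema:v1" ]; then
    echo "schema check: $file is missing the '# schema:v1' header" >&2
    return 2
  fi
  return 0
}
```

Calling it once per seed file before any repo is cloned means a reformatted list stops the run instead of silently matching nothing.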
cubic analysis
1 issue found
Linked issue analysis
Linked issue: DEVOP-560: Create org-wide daily IOC sweep workflow in .github repo
| Status | Acceptance criteria | Notes |
|---|---|---|
| ✅ | .github/workflows/shai-hulud-sweep.yml exists in this repo | PR adds the workflow file at .github/workflows/shai-hulud-sweep.yml. |
| ✅ | Cron: daily at an off-peak local time, off-minute (e.g. '7 4 * * *') | Workflow schedule is set to '7 4 * * *' and includes workflow_dispatch for manual runs. |
| ✅ | Iterate all org repos via gh api orgs/allora-network/repos --paginate | The vendored script enumerates repos using gh api --paginate /orgs/$ORG/repos and the workflow runs that script. |
| ✅ | For each repo: shallow clone, run sweep checks, report findings | Script uses git clone --depth 1 to shallow-clone each repo, runs the collection/checks, and aggregates findings; workflow uploads artifacts and updates the rolling issue. |
| ✅ | Compare against IOC lists at .github/security/ioc-packages.txt and .github/security/ioc-hashes.txt | Workflow passes those two paths to the script; the script validates the schema header and uses both lists in its matching logic. |
|  | Search the GitHub API for public repos under org members named ^[Ss]hai-[Hh]ulud | Script implements org-scoped and per-member searches and records findings, but per-member enumeration can be limited without a provisioned GH_ORG_READ_TOKEN and will emit check_skipped; member-side search is rate-limited with sleeps. Implementation exists but effective coverage depends on token provisioning and rate limits. |
| ✅ | Maintain a single rolling GitHub Issue in .github repo; append new findings | Workflow finds an open issue with label shai-hulud-sweep (oldest first), appends a comment if present, or creates a labeled rolling issue otherwise. |
|  | If any new finding fires, post to a Slack incoming webhook (SLACK_SECURITY_WEBHOOK org secret) | Workflow posts to the Slack webhook, but only for IOC-grade runs (script exit code 1). Operational findings (exit 2) update the rolling issue but do not page Slack. The step also no-ops if SLACK_SECURITY_WEBHOOK is unset. |
| ✅ | Sweep checks — Lockfile entries matching ioc-packages.txt (name@version) | Script builds per-ecosystem needle lists and performs structured and substring matching against npm/pip/Go lockfiles. |
| ✅ | Sweep checks — bundle.js files anywhere with SHA-256 matching ioc-hashes.txt | Script collects .js/.cjs/.mjs files ≤ 2 MB and compares SHA-256 against the provided hashes list, emitting ioc_bundle_hash findings. |
| ✅ | Sweep checks — .github/workflows/shai-hulud*.yaml / shai-hulud-workflow.yml persistence detection | Script scopes workflow-file scanning to the repo-root .github/workflows directory and flags shai-hulud*.{yml,yaml} as persistence_workflow findings. |
| ✅ | Sweep checks — Postinstall patterns: node bundle.js, curl|sh, wget|sh, base64-decode chains in package.json scripts | Script inspects package.json scripts for install/postinstall/preinstall and matches a broad regex covering node …bundle.js, curl|sh, wget|sh, base64 -d/--decode/-D, eval $(…), npx … bundle patterns. |
| ❌ | Sweep checks — The webhook.site exfil URL substring | The acceptance criteria list webhook.site substring matching, but I cannot find a specific check for the webhook.site URL substring in the vendored script or workflow; other exfil detection (public-exfil repo name, suspicious curl targets) exists, but no explicit webhook.site pattern match is present. |
| ✅ | Workflow uses minimal permissions: contents: read, issues: write | Workflow permissions block sets contents: read and issues: write and no broader permissions are granted. |
| ❌ | PR merged (closure of DEVOP-560) | Acceptance requires merging the PR; the PR is open (this is the review), so the 'merged' criterion is not yet satisfied by the current state. |
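For the unmet `webhook.site` criterion in the table above, a check could be as small as a fixed-string recursive grep over each shallow clone. This is only a sketch: the function name `scan_webhook_site` is hypothetical, it prints matching paths instead of calling the script's own finding helper (whose signature is not shown on this page), and it assumes a grep that supports `--exclude-dir` (GNU and BSD grep do).

```shell
# Hypothetical check: list files under a clone that contain the literal
# substring "webhook.site".
#   -r recurse, -l print file names only, -F fixed string (no regex)
#   --exclude-dir=.git skips git internals (GNU/BSD grep extension)
scan_webhook_site() {
  clone_dir=$1
  # grep exits 1 when nothing matches; "|| true" keeps that from looking
  # like a failure to callers running under "set -e".
  grep -rlF --exclude-dir=.git 'webhook.site' "$clone_dir" 2>/dev/null || true
}
```

A plain substring match will also flag documentation or IOC lists that merely mention the domain, so results would likely need the same human triage as the other findings.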
Architecture diagram
sequenceDiagram
participant Sched as Cron Schedule (04:07 UTC)
participant Action as GitHub Actions Workflow
participant Script as shai-hulud-ioc-sweep.sh
participant GHAPI as GitHub API (REST/GraphQL)
participant Repos as Org Repos (clone targets)
participant Issue as Rolling Issue (.github repo)
participant Slack as Slack Security Webhook
participant Artifact as Uploaded Artifact
Note over Sched,Artifact: NEW: Daily org-wide IOC sweep
alt Scheduled trigger (cron '7 4 * * *')
Sched->>Action: Trigger workflow
else Manual trigger (workflow_dispatch)
Action->>Action: Manual run started
end
Action->>Action: concurrency serialization (no cancel-in-progress)
Action->>Action: Set GH_TOKEN (GH_ORG_READ_TOKEN || GITHUB_TOKEN)
Action->>Action: Checkout .github repo (IOC lists + script)
Action->>Script: Run with ORG, IOC package/hash files
Note over Script: Detection logic (vendored from allora-network/skills)
Script->>GHAPI: List all repos in org (public + private if token allows)
GHAPI-->>Script: Repo list
Script->>GHAPI: List org members (for exfil repo search)
alt GH_TOKEN has read:org
GHAPI-->>Script: Member list
else GH_TOKEN lacks read:org
Script->>Script: Emit check_skipped operational finding
end
loop Per repo in org
Script->>Repos: git clone (via gh auth credential helper)
alt Clone succeeds
Script->>Script: Scan for IOC patterns
alt IOC package match found
Script->>Script: finding() - ioc_package_match
end
alt IOC hash match found (.js/.cjs/.mjs <= 2MB)
Script->>Script: finding() - ioc_bundle_hash
end
alt Suspicious workflow files found
Script->>Script: finding() - persistence_workflow
end
alt Suspicious npm lifecycle scripts found
Script->>Script: finding() - suspicious_lifecycle_script
end
alt Go replace directives anomalies found
Script->>Script: finding() - go_suspicious_replace
end
alt Go unsafe CI env vars found
Script->>Script: finding() - go_unsafe_env
end
opt Finding detected (non-operational)
Script->>Script: Preserve clone as forensic evidence
Script->>Script: Append to dirty-repos list
end
else Clone fails
Script->>Script: finding() - clone_failed (operational)
end
end
loop Check public exfil repos (org scope)
Script->>GHAPI: Search repos matching ^[Ss]hai-[Hh]ulud under org
GHAPI-->>Script: Exfil repo list
alt Exfil repos found
Script->>Script: finding() - public_exfil_repo
end
end
loop Check public exfil repos (member scope)
alt GH_TOKEN supports member search
Script->>GHAPI: Search repos per member
GHAPI-->>Script: Exfil repo list per member
alt Exfil repos found
Script->>Script: finding() - public_exfil_repo_member
end
else Token lacks permission
Script->>Script: finding() - check_skipped (operational)
end
end
Script->>Script: Aggregate findings to findings.ndjson + summary.md
alt Exit code 0 (clean)
Script-->>Action: rc=0
else Exit code 1 (IOC findings)
Script-->>Action: rc=1
else Exit code 2 (operational only)
Script-->>Action: rc=2
end
Action->>Artifact: Upload sweep output + forensic evidence clones (30-day retention)
alt rc == 1 or rc == 2 (findings exist)
Action->>Issue: Find open issue with label "shai-hulud-sweep"
alt Existing issue found
Action->>Issue: Append comment with run summary
else No existing issue
Action->>Issue: Create new issue with label + summary
end
end
alt rc == 1 (IOC-grade findings only)
alt SLACK_SECURITY_WEBHOOK is set
Action->>Slack: POST run summary (capped at ~2.8 KB)
Slack-->>Action: 200 OK
else Webhook not set
Action->>Action: Log warning, skip (no failure)
end
end
Four mechanical fixes flagged as safe_auto by the ce-code-review headless pass on PR #8:
- Add `--connect-timeout 5 --max-time 15` to the Slack webhook curl so a hung incoming-webhook endpoint cannot stall the workflow up to the 60-minute job timeout.
- Gate the Page-Slack step with `always() && rc == '1'` so an IOC-grade finding still pages Slack when the preceding Update-rolling-issue step failed (the rolling issue is a redundant channel; Slack is the primary page).
- Suffix the upload-artifact name with `${{ github.run_attempt }}` so a re-run does not 409 on actions/upload-artifact@v4's unique-name rule.
- Hoist the `output_dir` GITHUB_OUTPUT write to immediately after `mkdir -p "$OUTPUT_DIR"` so the upload-artifact step's `always() && output_dir != ''` gate survives a mid-sweep crash (script abort would otherwise leave output_dir unset and skip evidence upload).
Re-validated with `actionlint`. No behavior change beyond the four hardening fixes; the rc-based control flow and rolling-issue / Slack semantics are unchanged.
Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>
… and silent failures (DEVOP-560)
Apply six ce-code-review findings to the daily Shai-Hulud IOC sweep:
- Finding B (P1): rolling-issue lookup now uses --search with sort:created-asc, so a long-running incident issue stays canonical even when a duplicate is filed later (`gh issue list` defaults to newest-first and exposes no --sort flag).
- Finding C (P1): wrap the Slack webhook POST in a 3-attempt retry loop with 5s/15s/45s backoff. Retries on 408/429/5xx + curl-level failures; honors Retry-After on 429; bails on terminal 4xx.
- Finding E (P1): strip backtick/<>|*_ from the attacker-controllable IOC `detail` field before wrapping it in a Slack code fence or inlining it in the GitHub issue body. The uploaded findings.json remains the canonical un-sanitized source for forensic review.
- Finding D (P1): restrict the uploaded artifact path to the structured outputs (findings.json, summary.md, repos.txt) instead of the entire output_dir. Anyone with actions:read on this repo can download artifacts, and the previous wildcard included raw clones / preserved evidence trees of private org repos.
- Finding H (P2): when the script exits rc != 0 without producing summary.md (precondition failure / pre-aggregation crash), emit a minimal placeholder so the rc != 0 -> rolling-issue contract holds and the failure surfaces in triage instead of being silently dropped.
- Finding G (P2): commit scripts/shai-hulud-ioc-sweep.sh.sha256 and verify it as the first action of the sweep step. The vendored script's canonical source lives upstream in allora-network/skills; the sidecar is the in-repo integrity gate. A PR that modifies the script body without refreshing the sidecar fails this step loudly instead of executing a tampered detector.
Workflow validated with `actionlint` and `python3 -c "import yaml; yaml.safe_load(...)"` after each edit. 346 lines (under the 350 cap).
Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>
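Findings E and G above lend themselves to short sketches. Both helpers below are hypothetical (names and structure are mine, not the workflow's): `sanitize_detail` deletes exactly the characters the commit message lists, and `verify_sidecar` assumes the `.sha256` file uses `sha256sum` output format (digest first, then the filename) and that GNU `sha256sum` is available on the runner.

```shell
# Hypothetical helper for Finding E: delete backtick, <, >, |, * and _ from
# stdin so attacker-controlled text cannot close a code fence or inject
# Slack/GitHub markup.
sanitize_detail() {
  tr -d '`<>|*_'
}

# Hypothetical helper for Finding G: return 0 only if the script's SHA-256
# equals the first field of the sidecar file (assumed sha256sum format).
verify_sidecar() {
  script=$1
  sidecar=$2
  expected=$(awk '{print $1; exit}' "$sidecar") || return 1
  actual=$(sha256sum "$script" | awk '{print $1}') || return 1
  # An empty expected digest (empty sidecar) must never count as a match.
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

Running the integrity check before anything else in the sweep step keeps a modified detector from executing at all, which matches the ordering the commit message describes.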
…EVOP-560)
A single PR that modifies the daily Shai-Hulud sweep workflow, the vendored detector script, the SHA-256 integrity sidecar, or the IOC seed lists can silently disable detection if no human review is enforced. This adds an in-repo CODEOWNERS rule requiring `@allora-network/security` approval on those paths (with `@allora-network/devops` co-owning the workflow file for routine operational tweaks). CODEOWNERS itself is self-owned so a single PR cannot rewrite the rules + disable detection in lockstep.
Team slugs were verified via `gh api orgs/allora-network/teams/security` and `.../devops` on 2026-05-25. The complementary "Require review from Code Owners" branch-protection rule is an org-admin task and is documented as a follow-up in the plan doc; this commit only handles the in-repo half.
Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>
Two ce-code-review findings against the plan doc:
- Finding I (P2): document the rotation procedure for the third-party action SHA pins (actions/checkout, actions/upload-artifact). Names owner (@allora-network/devops, with security co-review via the workflow's CODEOWNERS rule), cadence (quarterly + on CVE), canonical source for the latest release SHA per action, and a 4-step rotation procedure. Notes `.github/dependabot.yml` as the automation follow-up.
- Finding F (P1, deferred): document missed-daily-run / cron-disabled observability as out-of-scope for this PR. The fix is materially additive (separate watchdog workflow or external healthcheck) and doesn't belong inline with the initial sweep ship.
Also adds the branch-protection "Require review from Code Owners" follow-up surfaced by Finding A: the in-repo CODEOWNERS rule was landed in the prior commit but the actual blocking gate is org-admin territory.
Linear: https://linear.app/alloralabs/issue/DEVOP-560
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- P0 (script): Narrow persistence_workflow glob to exact known IOC filenames (shai-hulud.yml / shai-hulud.yaml / shai-hulud-workflow.yml / shai-hulud-workflow.yaml) so the legitimate defense workflow .github/workflows/shai-hulud-sweep.yml no longer self-detects as an IOC on every daily sweep: guaranteed false page → alert fatigue.
- P1 (seed files): Add '# schema:v1' header to ioc-packages.txt and ioc-hashes.txt. Without the packages header the new schema-version assertion in the detector exits 2 at startup every run, leaving the sweep structurally inert.
- P2 (script): Add parallel '# schema:v1' assertion against HASHES_FILE; mirrors the packages-file gate so a future reformat of the hashes seed list fails loud instead of silently zero-matching.
- P2 (script): Add cometbft to default GO_TRUSTED_HOSTS_RE so Cosmos/CometBFT same-path version pins (replace github.com/cometbft/cometbft => github.com/cometbft/cometbft <version>) no longer trip go_suspicious_replace once the sweep is unblocked from the schema:v1 gate above.
- Regenerate scripts/shai-hulud-ioc-sweep.sh.sha256 in lockstep with the detector edits so the workflow's integrity-gate passes.
Co-authored-by: Cursor <cursoragent@cursor.com>
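The P0 change above swaps a glob for an exact-name allowlist. A minimal sketch of that idea, assuming paths are given relative to the clone root; the function name `is_persistence_workflow` is hypothetical and the vendored script may structure this differently:

```shell
# Hypothetical helper: return 0 only for the four exact IOC workflow
# filenames named in the commit message, directly under the repo-root
# .github/workflows/ directory.
is_persistence_workflow() {
  case "$1" in
    .github/workflows/*) name=${1#.github/workflows/} ;;
    *) return 1 ;;   # not under the repo-root workflows directory
  esac
  case "$name" in
    shai-hulud.yml|shai-hulud.yaml|shai-hulud-workflow.yml|shai-hulud-workflow.yaml)
      return 0 ;;
    *)
      return 1 ;;    # includes shai-hulud-sweep.yml and nested paths
  esac
}
```

With an exact list, `.github/workflows/shai-hulud-sweep.yml` no longer matches, at the cost of missing any future variant filename until the list is updated.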
…-560)
Without this gate the bare `if: steps.sweep.outputs.rc == '1'` Slack step
pages on every IOC-grade run, so a standing unresolved IOC pages the
channel daily and conditions responders to mute it — classic alert
fatigue. Raised by cubic at PRRT_kwDOLZ5Xss6Ee5gN and independently by
four ce-code-review reviewers (P1, anchor 100).
Implementation:
- New `ioc-dedup` step (rc=1 only) computes a stable IOC stamp as the
sha256 of the sorted `{repo,rule,path,detail}` TSV of IOC-grade rows
in findings.json (`ts` deliberately excluded so an identical IOC set
produces an identical stamp across daily runs).
- Looks up the rolling issue's full comment history (cross-page sort)
for the most recent `<!-- shai-hulud-ioc-stamp: ... -->` and
`<!-- shai-hulud-paged-at: ... -->` markers.
- Decides `should_page`:
* first IOC-grade run after clean (no prior stamp) → page
* IOC set differs from previous stamp → page
* same IOC set but >= 7d since last Slack page → page
* otherwise → skip
Fail-open when findings.json is missing/empty on rc=1: page so an
unknown-state run surfaces visibly rather than dedup-silencing.
- Rolling-issue update step now embeds the stamp marker on every rc=1
comment and the paged-at marker only when Slack actually fires, so a
deduped comment carries forward the older real paged-at timestamp and
the weekly re-page window stays honest.
- Slack step gated on `should_page == 'true'`. A new `Slack page
suppressed by IOC dedup` step emits a workflow notice for visibility,
and the final-run-summary step surfaces the dedup decision too.
- Visible `- **Slack page:** yes|suppressed (reason: ...)` footer in the
rolling-issue comment body makes the decision obvious to humans
scanning the issue, alongside the hidden HTML markers used by the
next run's dedup lookup.
Plan doc: the Slack-alert-path decision now spells out the dedup +
weekly-repage policy and warns explicitly against regressing to a bare
`rc == '1'` gate, so the next reviewer doesn't reintroduce the
alert-fatigue regression. IOC_RULES_RE drift between workflow and
script is called out as a coupling that must stay in sync.
Refs: DEVOP-560, PRRT_kwDOLZ5Xss6Ee5gN (cubic), ce-code-review anchor 100
Co-authored-by: Cursor <cursoragent@cursor.com>
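The dedup policy in the commit message above can be sketched in two small functions. Assumptions are mine: the IOC-grade rows have already been extracted from findings.json as `repo<TAB>rule<TAB>path<TAB>detail` lines (the jq step is omitted), timestamps are passed as epoch seconds, and the names `ioc_stamp` and `should_page` are hypothetical. Treating a missing paged-at value as "page" follows the fail-open behaviour described for this workflow.

```shell
# Stamp: sha256 of the sorted TSV rows read from stdin. Sorting makes the
# stamp independent of row order; LC_ALL=C pins the collation.
ioc_stamp() {
  LC_ALL=C sort | sha256sum | awk '{print $1}'
}

# Decide whether to page. Arguments:
#   $1 current stamp ("" if it could not be computed)
#   $2 previous stamp ("" if none was found)
#   $3 epoch seconds of the last real Slack page ("" if none)
#   $4 current epoch seconds
# Prints "true" or "false".
should_page() {
  cur=$1
  prev=$2
  paged_at=$3
  now=$4
  week=$((7 * 24 * 60 * 60))
  if [ -z "$cur" ] || [ -z "$prev" ] || [ -z "$paged_at" ]; then
    echo true     # unknown state or first IOC-grade run: fail open
  elif [ "$cur" != "$prev" ]; then
    echo true     # IOC set changed since the previous stamp
  elif [ $((now - paged_at)) -ge "$week" ]; then
    echo true     # same IOC set, but 7 days or more since the last page
  else
    echo false    # same IOC set, paged recently: suppress
  fi
}
```

Leaving `ts` out of the hashed rows is what keeps the stamp stable across daily runs; any per-run field in the input would make every run look like a new IOC set.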
- (P2) Append `|| true` to the ioc-dedup current_stamp `jq | sha256sum
| awk` pipeline so a malformed findings.json (or mid-run mutation)
routes through the documented fail-open guard at lines 230-244 instead
of aborting the step under `set -euo pipefail` and silently
fail-CLOSING the Slack page. Mirrors the `|| true` already present
on the four sibling pipelines in the same step.
- (P2) Fix Slack curl http_code capture: replace
`... || echo 000)` with `... || true)` followed by
`: "${http_code:=000}"`. The prior form appended an extra '000'
to curl's own '%{http_code}' output, producing the literal '000000'
which fell through the `000|408|429|5*` transient-classification
case to terminal=0 and disabled the curl-level retry path the loop
exists for.
- (P3) Replace the two-branch `if [ "${SHOULD_PAGE:-true}" = "true" ]`
in the Final run summary with an explicit three-way `case`
(true / false / *) so the unknown-state branch emits an
`::error::` rather than defaulting to a false "Slack paged" claim
when the ioc-dedup step crashed before writing $GITHUB_OUTPUT.
Resolves the three-way contradiction between the Slack gate
(strict ==true), the suppression-notice gate (!=true), and this
summary.
Co-authored-by: Cursor <cursoragent@cursor.com>
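The `http_code` fix above is easier to see in isolation. The sketch below reuses the `000|408|429|5*` pattern quoted in the commit message; the function name `classify_http_code` and the `2*` success branch are assumptions for illustration, not the workflow's actual code.

```shell
# Classify a captured HTTP status the way the retry loop is described:
# prints "ok", "transient" (retry), or "terminal" (give up).
classify_http_code() {
  http_code=$1
  : "${http_code:=000}"   # empty capture (curl printed nothing) becomes 000
  case "$http_code" in
    2*) echo ok ;;                          # assumed success range
    000|408|429|5*) echo transient ;;       # pattern from the commit message
    *) echo terminal ;;
  esac
}
```

Feeding it the literal `000000` that the old `|| echo 000` form produced shows the bug: the value matches none of the transient patterns and falls through to `terminal`, so the retry never happens.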
- (P2 #5) Extract a new `Find rolling issue` step (gated on `rc=='1' || rc=='2'`) that resolves the rolling-issue number ONCE per run via the canonical `gh issue list ... sort:created-asc` query and exposes it as `steps.find-rolling-issue.outputs.issue_num`. Replace the duplicated inline `gh issue list` calls in the ioc-dedup and rolling-issue-update steps with the shared output. Removes the drift-hazard `# same query as the update step below — keep in sync` coupling and closes the TOCTOU window where a human could close the rolling issue between the two independent lookups.
- (P1 #1) Filter the ioc-dedup comment scan to `github-actions[bot]` authorship. Previously the `gh api ... --jq '.[] | {body, created_at}'` projection accepted markers from ANY commenter, so anyone with `issues: write` (or anyone able to social-engineer a maintainer into pasting attacker-supplied marker text) could forge `<!-- shai-hulud-ioc-stamp: <sha256> -->` or `<!-- shai-hulud-paged-at: <iso8601> -->` into the rolling issue and silently suppress real Slack pages by poisoning the dedup chain. Only this workflow (running as GITHUB_TOKEN) emits canonical markers, and its comments are attributed to `github-actions[bot]`; restrict the source set accordingly. Defense-in-depth follow-up (binding markers to the emitting run_id and verifying via gh api) deferred.
- (P1 #2) Move paged-at marker emission to a dedicated post-Slack step (`Persist Slack-paged marker`) gated on `success() && rc=='1' && should_page=='true'` so a failed Slack delivery never writes a paged-at timestamp. The rolling-issue update step keeps writing the IOC stamp marker (which represents the dedup decision input, NOT the Slack-delivery outcome; that's correct gating). The dedup reader already scans the most-recent paged-at marker across ALL bot-authored comments, so splitting the markers across two comments composes correctly with no parser change. Previously the paged-at marker was committed BEFORE the Slack page ran, so a failed Slack send would still record a paged-at timestamp and silently corrupt the dedup chain for up to 7 days (next IOC-grade run would believe Slack had paged, suppress its own page, and the standing IOC would stop alerting until the weekly re-page window expired). The new step has a `gh issue list` fallback for the rare case where the update step created a fresh rolling issue this run (so find-rolling-issue's output was empty); fail-OPEN warning if no issue is resolvable at all so a missing paged-at marker just forces the next run to page conservatively.
Verification: actionlint clean; YAML parses (11 steps in canonical order: checkout → verify-tools → sweep → upload → find-rolling-issue → ioc-dedup → update-rolling-issue → slack-page → persist-paged-at → slack-suppressed-notice → final-summary).
Refs: DEVOP-560, ce-code-review run 20260526-101810-4793bf13 findings #1 (anchor 100, security+adversarial), #2 (anchor 100, correctness+adversarial+reliability), #5 (anchor 75, maintainability).
Co-authored-by: Cursor <cursoragent@cursor.com>
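Reading the hidden stamp marker back out of the comment history can be sketched with plain `sed`. This assumes the comment bodies have already been filtered to `github-actions[bot]` authorship upstream and arrive on stdin oldest first; the function name `last_stamp_marker` is hypothetical, and requiring exactly 64 lowercase hex characters is my own tightening, not something this page confirms.

```shell
# Print the value of the last "<!-- shai-hulud-ioc-stamp: ... -->" marker
# found on stdin, or nothing if no well-formed marker is present.
# Only a 64-character lowercase hex value (a sha256 digest) is accepted.
last_stamp_marker() {
  sed -n 's/.*<!-- shai-hulud-ioc-stamp: \([0-9a-f]\{64\}\) -->.*/\1/p' | tail -n 1
}
```

Strict marker parsing is only a second line of defence here; the authorship filter described in the commit message is what actually stops a forged marker from a human commenter.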
Summary
Adds the first workflow under `.github/workflows/` in this repo: a scheduled daily sweep for Shai-Hulud indicators of compromise across every repo in the `allora-network` org, plus a rolling GitHub Issue and Slack page on findings.
Closes DEVOP-560.
What ships
- `.github/workflows/shai-hulud-sweep.yml`: `schedule: '7 4 * * *'` (04:07 UTC daily, off-peak + off-minute) plus `workflow_dispatch` for manual runs. Permissions limited to `contents: read` + `issues: write`. Pinned SHAs for `actions/checkout@v4.2.2` and `actions/upload-artifact@v4.4.3`, matching the convention in `allora-network/ci-workflows-private`.
- `scripts/shai-hulud-ioc-sweep.sh`: canonical detection logic, vendored verbatim from `allora-network/skills@71aeefb` (`skills/shai-hulud-defense/scripts/`). See file header for refresh procedure / pinned commit.
- `docs/plans/2026-05-25-devop-560-shai-hulud-sweep.md`: design notes for why we vendor the script, how the rolling issue is maintained, when Slack fires, and the `GH_ORG_READ_TOKEN` follow-up.
Detection coverage (per the vendored script)
- Lockfile entries matching `.github/security/ioc-packages.txt`.
- Any `.js`/`.cjs`/`.mjs` file ≤ 2 MB whose SHA-256 matches `.github/security/ioc-hashes.txt` (filename-agnostic: a `bundle.js` rename doesn't bypass).
- Persistence workflows matching `*/.github/workflows/shai-hulud*.{yml,yaml}` at repo root.
- `install`/`postinstall`/`preinstall` lifecycle scripts matching `node …bundle.js`, `curl|sh`, `wget|sh`, `base64 -d|--decode|-D`, `eval $(…)`, or `npx … bundle`.
- Go `replace` directives: untrusted-host RHS, absolute-path RHS, top-level-path mismatch (Scenario C in-org redirect), and local replacements (`./` / `../`) flagged for human review.
- Go unsafe CI env vars (`GOSUMDB=off`, `GONOSUMCHECK`, `GOINSECURE`, `GOFLAGS=*-insecure`), direct and indirect (`vars`/`secrets`/`env`/`inputs`).
- Public repos matching `^[Ss]hai-[Hh]ulud` under `org:allora-network` AND under each org member (rate-limited).
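The filename-agnostic hash check in the coverage list can be sketched as follows. This is an illustration rather than the vendored script's implementation: the function name `scan_bundle_hashes` is hypothetical, "2 MB" is read as 2 MiB (2097152 bytes), the hashes file is assumed to hold one lowercase hex digest per line after its header, and GNU `sha256sum` is assumed to be installed.

```shell
# Hypothetical sketch: print every .js/.cjs/.mjs file of at most 2 MiB under
# $1 whose SHA-256 appears as a whole line in the hashes file $2.
scan_bundle_hashes() {
  clone_dir=$1
  hashes_file=$2
  # Skip .git; "-size -2097153c" means strictly fewer than 2097153 bytes.
  find "$clone_dir" -path '*/.git' -prune -o -type f \
    \( -name '*.js' -o -name '*.cjs' -o -name '*.mjs' \) \
    -size -2097153c -print |
  while IFS= read -r f; do
    digest=$(sha256sum "$f" | awk '{print $1}')
    # -F fixed string, -x whole-line match, -q quiet: the "# schema:v1"
    # header line can never equal a digest, so it is ignored naturally.
    if grep -Fxq "$digest" "$hashes_file"; then
      printf '%s\n' "$f"
    fi
  done
}
```

Because the match is on content rather than name, renaming `bundle.js` does not evade it, although any byte-level change to the payload would.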
Outputs
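| Script exit code | Meaning | Action |
|---|---|---|
| `0` | Clean | No issue update, no Slack page |
| `2` | Operational findings only | Update the rolling issue (label `shai-hulud-sweep`); no Slack page |
| `1` | IOC-grade findings | Update the rolling issue and page Slack via `${{ secrets.SLACK_SECURITY_WEBHOOK }}` |
Forensic evidence (clones of repos that produced IOC findings) is uploaded as a workflow artifact for 30 days so humans can inspect the matched file without re-cloning point-in-time evidence.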
The workflow never auto-closes the rolling issue; humans drive close/reopen so
triage state survives across daily runs.
Secrets used
- `SLACK_SECURITY_WEBHOOK`: org secret; payload only delivered on IOC-grade findings. No-ops gracefully if unset (warning, not failure).
- `GH_ORG_READ_TOKEN`: optional org secret. When present, preferred over the default `GITHUB_TOKEN` for org-wide enumeration so private repos and the org-members exfil search are covered. When absent, member enumeration emits `check_skipped` operational findings: visible partial coverage, never a silent false-clean.
Verification
- `actionlint .github/workflows/shai-hulud-sweep.yml`: clean (no findings).
- `python3 -c 'yaml.safe_load(...)'`: parses.
- Manual `workflow_dispatch` recommended after merge to verify `gh issue` and Slack paths end-to-end against the live org.
Followups (intentionally out-of-scope for this PR)
- Provision `GH_ORG_READ_TOKEN` (fine-grained PAT or GitHub App token with `read:org` + `repo:read`) once org-admin signs off; the workflow already prefers it when present.
- Tune the trusted-hosts regex (`GO_TRUSTED_HOSTS_RE` in the script) to keep the `go_suspicious_replace` false-positive rate low.
Made with Cursor
Summary by cubic
Adds a daily org‑wide Shai‑Hulud IOC sweep that scans all `allora-network` repos, updates a rolling issue, and pages Slack on incident‑grade findings with alert‑dedup and a weekly re‑page. Meets DEVOP-560 requirements: scheduled workflow at `.github/workflows/shai-hulud-sweep.yml`, org repo iteration, IOC list checks, member exfil search, rolling issue updates, and Slack notifications with minimal perms.
New Features
- Workflow `.github/workflows/shai-hulud-sweep.yml` (cron `7 4 * * *` + `workflow_dispatch`; minimal perms; serialized concurrency). Pins `actions/checkout@v4.2.2` and `actions/upload-artifact@v4.4.3`.
- Vendored `scripts/shai-hulud-ioc-sweep.sh` with SHA‑256 sidecar verification; reads `.github/security/ioc-packages.txt` and `.github/security/ioc-hashes.txt` (`# schema:v1`). Detection covers lockfiles (incl. structured `package-lock.json`), sub‑2MB JS SHA‑256 hashes, exact Shai‑Hulud persistence workflow filenames, suspicious npm lifecycle scripts, Go replace/path‑mismatch/unsafe‑env, and org/member public exfil search.
- Rolling issue labelled `shai-hulud-sweep`; Slack via `SLACK_SECURITY_WEBHOOK` on IOC with dedup gating; prefers `GH_ORG_READ_TOKEN`, else falls back to `GITHUB_TOKEN` and emits `check_skipped`; uploads only `findings.json`, `summary.md`, `repos.txt` for 30 days.
- Added `.github/CODEOWNERS` to require `@allora-network/security` (and `@allora-network/devops` for the workflow). Added plan doc with action SHA‑pin rotation and follow‑ups.
Bug Fixes
- Dedup markers read only from `github-actions[bot]`; write the `paged-at` marker only after a successful Slack send; fail‑open if the stamp can’t be computed; 3‑attempt retry with backoff and `Retry‑After` honoring; fixed HTTP code capture.
- Single rolling-issue lookup (`--search sort:created-asc`) reused by dedup, updates, and paged‑marker steps; IOC comments include visible page‑decision plus hidden stamp markers; Slack still fires on IOC even if the issue update fails.
- Artifact name suffixed with `${{ github.run_attempt }}`; placeholder summary on pre‑aggregation failure; final run summary surfaces the Slack‑dedup tri‑state explicitly.
- Add `# schema:v1` headers and assertions; extend `GO_TRUSTED_HOSTS_RE` to include `cometbft`; refresh the `.sha256` sidecar.
Written for commit c10d0dc.