Skip to content

ci(test): lint compiled bash bodies with shellcheck#496

Merged
jamesadevine merged 2 commits intomainfrom
lint-bash-steps
May 10, 2026
Merged

ci(test): lint compiled bash bodies with shellcheck#496
jamesadevine merged 2 commits intomainfrom
lint-bash-steps

Conversation

@jamesadevine
Copy link
Copy Markdown
Collaborator

Replaces #492 (the heavy-handed set -eo pipefail sweep) with a shellcheck-based integration test plus targeted bash fixes for the silent-failure patterns the lint actually surfaces. Closes the dissatisfaction expressed there: noise without enforcement that drifted on the next bash step.

Why this is different from #492

#492 (sweep) this PR (lint)
Mechanism 27 inline set -eo pipefail lines spread across templates and Rust generators one shellcheck integration test, no global prelude
Enforcement none — relies on contributors remembering to add the line cargo test --test bash_lint_tests blocks regressions; CI sets ENFORCE_BASH_LINT=1
Drift inevitable: 4 sites already missed impossible: the lint walks every compiled fixture
Effect on real silent-failures did not touch any 2>/dev/null || true cleanup; in two spots actually masked failures more than the original (grep ... | tail -1 || true) catches cd without ||, masked-return assignments, tilde-in-quotes, unquoted variables
Template noise every step gets boilerplate none — templates stay clean

Real bugs surfaced and fixed

These were latent and would have stayed latent under the #492 approach:

  • Copilot logs were never collected. Engine::Copilot::log_dir() returned "~/.copilot/logs". Tilde does not expand inside the double-quoted [ -d "..." ] test, so the directory check always failed and the 2>/dev/null || true masked it. Fixed to $HOME/.copilot/logs.
  • Ensure .npmrc exists and Ensure nuget.config exists emit invalid YAML. Both generators used Rust \<newline> line continuations in format!, which strip leading whitespace. The emitted body was flush-left against - bash: |, breaking YAML. Fixed by switching to raw string literals. (The fixture runtime-coverage-agent.md would not have parsed before this fix.)
  • ado-aw compiler download had no set -o pipefail. grep "ado-aw-linux-x64" checksums.txt | sha256sum -c - silently passes when grep returns no match because sha256sum -c - returns 0 on empty input. The unverified binary would install successfully. Same pattern in the trigger-filters script download. AWF download was already protected (PR fix(compile): fail pipeline step on AWF download errors #439); ado-aw compiler download was missed.
  • Lean install used curl ... | sh without pipefail — silent on curl failure.

Plus minor hygiene: cd "$DOWNLOAD_DIR" || exit 1, exit "$AGENT_EXIT_CODE", quoted {{ working_directory }} substitutions, ${RESULT_LINE##*THREAT_DETECTION_RESULT:} instead of echo | sed.

What's NOT changed

  • The two pre-existing targeted set -eo pipefail instances on AWF download / docker pull (PR fix(compile): fail pipeline step on AWF download errors #439) and on the tee-piped agent + threat-analysis runs are preserved. Those were correct.
  • No BashStep Rust helper, no template injection mechanism — out of scope.
  • No global set -eo pipefail sweep; targeted only where masked-pipeline exit codes are real.

How the lint works

  1. Compiles seven fixtures (including a new runtime-coverage-agent.md that exercises Lean, Node-with-feed-url, .NET-with-feed-url, and cache-memory).
  2. For each *.lock.yml, walks the YAML and collects every bash: body that is the value of a sequence-element mapping (so an arbitrary bash key inside an env: block is not treated as a step).
  3. Pipes each body to shellcheck --shell=bash --format=json --exclude=SC1090,SC1091 -.
  4. Asserts every entry in REQUIRED_STEP_DISPLAY_NAMES was produced — catches fixture/generator drift.
  5. Fails the test with a structured per-fixture/per-step report on any finding.

Excludes are limited to SC1090 / SC1091 (dynamic source paths) with a documented justification. Project-specific suppressions go inline as # shellcheck disable=SCxxxx comments — used in exactly one place for the wait-N-times for i in $(seq …) loop pattern, where i is intentionally unused.

Skip vs. enforce

  • Locally: the test prints a notice and returns early when shellcheck is not on PATH. Devs without it installed are unblocked.
  • CI: .github/workflows/rust-tests.yml installs shellcheck and sets ENFORCE_BASH_LINT=1 so a missing shellcheck becomes a hard failure rather than a silent skip. This was a blocking concern from the design review — without it, a runner image change could have silently disabled the lint.

Validation

  • All 1434 tests pass (1332 unit + 2 bash_lint + …).
  • cargo clippy --tests clean for the new test.
  • Verified the lint catches regressions: re-introduced the tilde bug, confirmed the test fails with a structured report, restored the fix.
  • Verified ENFORCE_BASH_LINT=1 runs the lint without skip.

Documentation

  • AGENTS.md Testing section: brief mention of the lint and the install command.
  • docs/extending.md: new "Bash steps in pipeline templates" section explaining the lint, the exclude rationale, the inline-disable pattern for intentional cases, and explicit guidance not to sprinkle set -eo pipefail to silence the lint.

Adds tests/bash_lint_tests.rs, an integration test that compiles a
representative set of fixtures and runs shellcheck against every
literal bash: body in the generated YAML. The lint catches the actual
silent-failure patterns ADO's "fail on last command" default lets
through (SC2164 cd-without-||, SC2155 masked-return, SC2086/2046
unquoted variables, SC2154 unset refs, SC2088 tilde-in-quotes).

This replaces the previously proposed approach of sprinkling
`set -eo pipefail` across every bash step (PR #492). That approach
added boilerplate to ~27 sites without enforcement, drifted as new
steps were added, and in two spots actually masked errors more than
the original code (`grep ... | tail -1 || true`).

Real bugs surfaced and fixed by the new lint:

* `src/engine.rs` — `Engine::Copilot::log_dir()` returned
  `~/.copilot/logs`. Tilde does not expand inside the double-quoted
  `[ -d "..." ]` test that consumes this value, so the directory check
  always failed and Copilot logs were silently never collected to the
  pipeline artifact. Replaced with `$HOME/.copilot/logs`.
* `src/runtimes/node/mod.rs` and `src/runtimes/dotnet/mod.rs` — the
  ensure-`.npmrc` and ensure-`nuget.config` step generators used Rust
  `\<newline>` line continuations in their format strings, which strip
  leading whitespace. The emitted YAML had body lines flush-left
  against `- bash: |`, producing invalid YAML. Replaced with raw
  string literals so indentation is preserved.
* Multiple `cd "$DOWNLOAD_DIR"` in `base.yml` / `1es-base.yml` had no
  `|| exit` guard. Added.
* `exit $AGENT_EXIT_CODE` (multiple sites) — quoted.
* `mkdir -p {{ working_directory }}/safe_outputs` and the matching
  `cp -a ...` — quoted the substitution.
* `JSON_CONTENT=$(echo "$RESULT_LINE" | sed 's/.*PFX://')` rewritten
  to `${RESULT_LINE##*PFX:}` (avoids forking sed and removes a
  shellcheck SC2001 finding).

Targeted `set -eo pipefail` additions (only where masked-pipeline
exit codes matter):

* `base.yml` / `1es-base.yml` ado-aw download steps (3 stages × 2
  templates): `grep "ado-aw-linux-x64" checksums.txt | sha256sum -c -`
  silently passes when grep matches nothing because sha256sum returns
  0 on empty stdin. Without pipefail, the unverified binary would
  install successfully.
* `src/compile/extensions/trigger_filters.rs` script-download step:
  same `grep | sha256sum` pattern.
* `src/runtimes/lean/mod.rs` install step: `curl ... | sh` would
  silently install nothing on curl failure.

The two pre-existing `set -eo pipefail` instances on the AWF download
+ docker pull steps (introduced in PR #439) and on the `tee`-piped
agent / threat-analysis runs are preserved — those were correct.

Skip vs. enforce:
* Locally, the test prints a notice and returns early when shellcheck
  is missing.
* CI installs shellcheck and sets `ENFORCE_BASH_LINT=1` so a missing
  shellcheck becomes a hard failure rather than a silent skip.

A new `tests/fixtures/runtime-coverage-agent.md` exercises the Lean,
Node-with-feed-url, and .NET-with-feed-url runtimes plus the
cache-memory tool, ensuring every code-generated bash step is reached.
The lint enforces a `REQUIRED_STEP_DISPLAY_NAMES` coverage list to
catch fixture/generator drift.

Documented in AGENTS.md and docs/extending.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Solid approach with real bugs fixed — one documentation inconsistency worth fixing before merge.

Findings

🐛 Bugs / Logic Issues

  • docs/extending.md:99 — The "exclude list" description claims SC1090, SC1091, SC2034, SC2016 are globally excluded, but SHELLCHECK_EXCLUDE in tests/bash_lint_tests.rs only contains "SC1090,SC1091". SC2034 is suppressed inline (correctly, per the stated design), and SC2016 doesn't appear in the code at all. A contributor reading extending.md would believe they can rely on SC2034/SC2016 being globally suppressed when they can't. Should read something like:

    The global exclude list (SC1090, SC1091) is documented in tests/bash_lint_tests.rs. SC2034 suppressions for intentional loop-counter variables use per-line # shellcheck disable directives.

✅ What Looks Good

  • Real security fix, not cosmetic: grep ... | sha256sum -c - silently passed on empty input (grep returning no match → sha256sum reading /dev/null-equivalent). Adding set -eo pipefail before those pipelines means the integrity check now actually fails if the binary isn't in the checksum file. This is the most important change in the PR.
  • $HOME vs ~ fix in engine.rs: Correct diagnosis. Tilde doesn't expand inside [ -d "..." ] double-quoting — log collection was silently broken, and the || true swallowed it.
  • Raw string literals in generate_ensure_npmrc / generate_ensure_nuget_config: The backslash-continuation format! approach was stripping leading whitespace from lines, producing flush-left YAML that didn't parse correctly. The switch to r#"..."# is the right fix and the heredoc indentation arithmetic (4-space YAML block indent → EOF at column 0 after YAML strips the block prefix) is correct.
  • YAML walker in the lint test: The in_sequence_element guard correctly avoids treating bash: keys nested inside env: blocks as step bodies. The coverage-enforcement via REQUIRED_STEP_DISPLAY_NAMES is a good way to catch generator drift between fixtures and templates.
  • Skip-vs-enforce pattern: Local devs unblocked without shellcheck; CI enforces via ENFORCE_BASH_LINT=1. Clean design.

Generated by Rust PR Reviewer for issue #496 · ● 412.9K ·

Two changes the CI surfaced after PR #496 landed locally:

1. **shellcheck 0.9.0 (Ubuntu's pinned) flags SC2002 ("Useless cat")
   on `cat file | sed ...` patterns that 0.11.0 does not.** Fixed by
   rewriting the two offending sites in the MCPG start step:
   * `MCPG_CONFIG=$(cat … | sed | sed | sed)` →
     `MCPG_CONFIG=$(sed -e … -e … -e … file)`. Semantically equivalent
     because the three substitutions are over independent placeholder
     patterns.
   * `cat … | python3 -m json.tool` → `python3 -m json.tool < …`.

   Avoids forking `cat` for nothing and is stable across shellcheck
   versions.

2. **Add a `runtime-coverage-1es-agent.md` fixture and assert that
   every known compile target is exercised by at least one fixture.**
   Previously only `1es-test-agent.md` compiled to the 1ES target, and
   it had no `runtimes:` or `tools.cache-memory`. The code-generated
   bash bodies from those extensions (Lean install, `.npmrc`,
   `nuget.config`, cache-memory restore/init) were being linted only
   on the standalone target. Today their bodies are byte-identical
   across targets, but a future target-specific divergence would slip
   past the lint without a 1ES variant.

   `compile_fixture()` now parses `Generated <target> pipeline:` from
   stdout, accumulates targets seen, and the test asserts every entry
   in `REQUIRED_TARGETS = ["standalone", "1es"]` is covered. Sanity-
   checked that removing the 1ES fixtures causes the test to fail with
   `no fixture compiles to the following target(s): ["1es"]`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good overall — solid engineering with real bugs fixed and a well-designed enforcement mechanism. One documentation inaccuracy and a handful of minor observations.

Findings

🐛 Bugs / Logic Issues

  • docs/extending.md (line 99): The sentence "The exclude list (SC1090, SC1091, SC2034, SC2016) is documented in tests/bash_lint_tests.rs" is incorrect. SHELLCHECK_EXCLUDE in the test is only "SC1090,SC1091". SC2034 is suppressed via inline # shellcheck disable=SC2034 comments (the right approach, as noted in the same doc), and SC2016 doesn't appear in any suppress list anywhere in this PR. Contributors reading the doc will be misled into thinking both codes are globally excluded.

    Suggested fix: update the sentence to something like: "The global exclude list (SC1090, SC1091) is documented in tests/bash_lint_tests.rs; per-line suppressions (e.g. SC2034 on the wait-loop i) are inline comments in the bash body."

⚠️ Suggestions

  • src/data/base.yml + 1es-base.yml — AWF download steps: The AWF download pipeline grep "awf-linux-x64" checksums.txt | sha256sum -c - is the same pipe-without-pipefail pattern described in the PR's "Real bugs surfaced" section. The PR asserts AWF download was already protected by PR fix(compile): fail pipeline step on AWF download errors #439, but the diff shows only cd "$DOWNLOAD_DIR" || exit 1 being added — no set -eo pipefail is visible for the AWF download steps. If PR fix(compile): fail pipeline step on AWF download errors #439 already added it outside the diff window, this is fine; worth a sanity check.

  • tests/bash_lint_tests.rs compile_fixture (lines 82–84): dest.to_str().unwrap() panics on non-UTF-8 temp paths. Fine in practice (temp dirs are always UTF-8), but unwrap_or_else(|_| panic!("non-UTF-8 path")) would give a cleaner failure message — minor nit for a test file.

✅ What Looks Good

  • Engine::log_dir() tilde fix is a real, silent, long-standing bug. $HOME/.copilot/logs vs ~/.copilot/logs inside [ -d "..." ] is exactly the kind of thing that only surfaces under inspection. Good catch.
  • Switching format!("...\n\...") to r#"..."# raw strings in generate_ensure_npmrc / generate_ensure_nuget_config is the right fix. The old \ line-continuation form really did strip leading whitespace and the resulting YAML was malformed. The YAML indentation in the raw strings is correct (4-space block-scalar strip produces valid bash and valid heredoc markers).
  • collect() traversal logic correctly uses in_sequence_element to avoid treating bash: keys inside env: blocks as step bodies. The reset to false on every mapping descent and reset to true on every sequence descent is sound.
  • ENFORCE_BASH_LINT / graceful skip split is a good design — developers without shellcheck aren't blocked, CI is never silently green.
  • Coverage assertion via REQUIRED_STEP_DISPLAY_NAMES is a clever way to prevent the fixture list from drifting away from the generators it's supposed to exercise.

Generated by Rust PR Reviewer for issue #496 · ● 665.2K ·

@jamesadevine jamesadevine merged commit 55e546f into main May 10, 2026
8 checks passed
@jamesadevine jamesadevine deleted the lint-bash-steps branch May 10, 2026 20:36
jamesadevine added a commit that referenced this pull request May 10, 2026
…498)

* ci(test): lint compiled bash bodies with shellcheck

Adds tests/bash_lint_tests.rs, an integration test that compiles a
representative set of fixtures and runs shellcheck against every
literal bash: body in the generated YAML. The lint catches the actual
silent-failure patterns ADO's "fail on last command" default lets
through (SC2164 cd-without-||, SC2155 masked-return, SC2086/2046
unquoted variables, SC2154 unset refs, SC2088 tilde-in-quotes).

This replaces the previously proposed approach of sprinkling
`set -eo pipefail` across every bash step (PR #492). That approach
added boilerplate to ~27 sites without enforcement, drifted as new
steps were added, and in two spots actually masked errors more than
the original code (`grep ... | tail -1 || true`).

Real bugs surfaced and fixed by the new lint:

* `src/engine.rs` — `Engine::Copilot::log_dir()` returned
  `~/.copilot/logs`. Tilde does not expand inside the double-quoted
  `[ -d "..." ]` test that consumes this value, so the directory check
  always failed and Copilot logs were silently never collected to the
  pipeline artifact. Replaced with `$HOME/.copilot/logs`.
* `src/runtimes/node/mod.rs` and `src/runtimes/dotnet/mod.rs` — the
  ensure-`.npmrc` and ensure-`nuget.config` step generators used Rust
  `\<newline>` line continuations in their format strings, which strip
  leading whitespace. The emitted YAML had body lines flush-left
  against `- bash: |`, producing invalid YAML. Replaced with raw
  string literals so indentation is preserved.
* Multiple `cd "$DOWNLOAD_DIR"` in `base.yml` / `1es-base.yml` had no
  `|| exit` guard. Added.
* `exit $AGENT_EXIT_CODE` (multiple sites) — quoted.
* `mkdir -p {{ working_directory }}/safe_outputs` and the matching
  `cp -a ...` — quoted the substitution.
* `JSON_CONTENT=$(echo "$RESULT_LINE" | sed 's/.*PFX://')` rewritten
  to `${RESULT_LINE##*PFX:}` (avoids forking sed and removes a
  shellcheck SC2001 finding).

Targeted `set -eo pipefail` additions (only where masked-pipeline
exit codes matter):

* `base.yml` / `1es-base.yml` ado-aw download steps (3 stages × 2
  templates): `grep "ado-aw-linux-x64" checksums.txt | sha256sum -c -`
  silently passes when grep matches nothing because sha256sum returns
  0 on empty stdin. Without pipefail, the unverified binary would
  install successfully.
* `src/compile/extensions/trigger_filters.rs` script-download step:
  same `grep | sha256sum` pattern.
* `src/runtimes/lean/mod.rs` install step: `curl ... | sh` would
  silently install nothing on curl failure.

The two pre-existing `set -eo pipefail` instances on the AWF download
+ docker pull steps (introduced in PR #439) and on the `tee`-piped
agent / threat-analysis runs are preserved — those were correct.

Skip vs. enforce:
* Locally, the test prints a notice and returns early when shellcheck
  is missing.
* CI installs shellcheck and sets `ENFORCE_BASH_LINT=1` so a missing
  shellcheck becomes a hard failure rather than a silent skip.

A new `tests/fixtures/runtime-coverage-agent.md` exercises the Lean,
Node-with-feed-url, and .NET-with-feed-url runtimes plus the
cache-memory tool, ensuring every code-generated bash step is reached.
The lint enforces a `REQUIRED_STEP_DISPLAY_NAMES` coverage list to
catch fixture/generator drift.

Documented in AGENTS.md and docs/extending.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* ci(workflows): add daily bash-step hygiene auditor agentic workflow

Adds `.github/workflows/bash-lint-auditor.md`, a daily agentic workflow
that complements the PR-gate lint added in PR #496. The PR gate gives
fast feedback on every PR; this workflow runs once a day and lands
small, mechanical improvements that the gate can't:

* When a finding does slip onto main (e.g. via merge conflict), the
  auditor fixes it the next morning instead of waiting for the next
  contributor PR.
* Audits stale `# shellcheck disable=` directives — removes ones that
  no longer fire (i.e. the underlying code has been cleaned up but
  the suppression was forgotten).
* Audits whether the lint's exclude list could be tightened.
* Verifies fixture coverage of every bash-step generator and proposes
  fixture additions when a new generator appears.

When the auditor finds something actionable, it opens a focused PR
(one concern per PR) with the structured "what was found / how it was
fixed / verification" body. When the lint is green and no proactive
improvement is feasible, it exits cleanly with `noop`.

Configuration notes:

* `schedule: daily around 09:00` — fuzzy schedule scattering across
  the hour, matching the convention of other daily workflows in this
  repo (e.g. `cyclomatic-complexity-reducer.md`).
* `allowed-files` restricts the auditor to bash-generator code paths
  plus the tests/fixtures it depends on. `protected-files:
  fallback-to-issue` ensures that if it tries to edit anything else,
  the change falls back to an issue rather than a PR.
* `cache-memory: true` persists state across runs so the auditor
  doesn't loop on the same suggestion if a maintainer rejects it.
* `bash: ["*"]` + `network.allowed: [defaults, rust]` gives the
  agent what it needs to install shellcheck (via apt with a static-
  binary fallback) and run cargo against the rust ecosystem.

Compiled with `gh aw compile bash-lint-auditor`; the matching
`.lock.yml` is included along with new SHAs in
`.github/aw/actions-lock.json` (cache, checkout, download-artifact
registered for the first time by this workflow's setup steps).

Stacked on top of branch `lint-bash-steps` (PR #496) because the
auditor relies on `tests/bash_lint_tests.rs` and
`tests/fixtures/runtime-coverage-agent.md`, which are introduced
there.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant