Source: PR #3344 Fro Bot review (two non-blocking concerns)
Scope: .github/workflows/survey-repo.yaml
Two related follow-ups left out of PR #3344 to keep the diagnostic surface bounded.
1. Apply the stderr-surfacing pattern to the recheck step
The 🔒 Recheck visibility step (mid-workflow, after the survey agent completes) still uses gh api graphql ... 2>/dev/null. If that step fails after the upstream resolve-step issue is fixed, we'll be back in the same diagnostic blackout we just escaped.
Fix: mirror the resolve step's pattern — capture stderr to a tempfile, dump on failure between --- gh stderr --- markers. Keep marker text identical so any log-parsing tooling can anchor on both occurrences.
2. Robust tempfile cleanup via trap
The current cleanup in the resolve step relies on inline rm -f calls at each exit path. If a future edit adds another early exit (e.g., a new validation check), the tempfile leaks for the life of the runner. Trivial on an ephemeral VM but a real footgun.
Fix: add trap 'rm -f "$gh_stderr"' EXIT immediately after gh_stderr=$(mktemp), drop the inline rm -f calls. Apply to both the resolve and recheck steps once #1 lands.
Acceptance
- Both
gh api graphql invocations in survey-repo.yaml surface stderr on failure between identical --- gh stderr --- / --- end gh stderr --- markers.
- Both invocations use
trap ... EXIT for tempfile cleanup.
- actionlint clean.
Why deferred
PR #3344 was diagnostic-only with a single concrete goal: surface the real error blocking the bfra-me survey dispatches. Expanding scope to mirror the pattern across all gh-api call sites before the root cause is in hand would have drifted from that goal. Now appropriate as a polish follow-up.
Source: PR #3344 Fro Bot review (two non-blocking concerns)
Scope:
.github/workflows/survey-repo.yamlTwo related follow-ups left out of PR #3344 to keep the diagnostic surface bounded.
1. Apply the stderr-surfacing pattern to the recheck step
The
🔒 Recheck visibilitystep (mid-workflow, after the survey agent completes) still usesgh api graphql ... 2>/dev/null. If that step fails after the upstream resolve-step issue is fixed, we'll be back in the same diagnostic blackout we just escaped.Fix: mirror the resolve step's pattern — capture stderr to a tempfile, dump on failure between
--- gh stderr ---markers. Keep marker text identical so any log-parsing tooling can anchor on both occurrences.2. Robust tempfile cleanup via
trapThe current cleanup in the resolve step relies on inline
rm -fcalls at each exit path. If a future edit adds another early exit (e.g., a new validation check), the tempfile leaks for the life of the runner. Trivial on an ephemeral VM but a real footgun.Fix: add
trap 'rm -f "$gh_stderr"' EXITimmediately aftergh_stderr=$(mktemp), drop the inlinerm -fcalls. Apply to both the resolve and recheck steps once #1 lands.Acceptance
gh api graphqlinvocations insurvey-repo.yamlsurface stderr on failure between identical--- gh stderr ---/--- end gh stderr ---markers.trap ... EXITfor tempfile cleanup.Why deferred
PR #3344 was diagnostic-only with a single concrete goal: surface the real error blocking the bfra-me survey dispatches. Expanding scope to mirror the pattern across all gh-api call sites before the root cause is in hand would have drifted from that goal. Now appropriate as a polish follow-up.