ci+fix: opt into --workspace --lib + detached-HEAD branch fallback (PMAT-159) #934
Closed
noahgift wants to merge 8 commits into
PMAT-155 investigation (paiml/infra#70) found the sovereign-ci.yml reusable workflow runs `cargo nextest run --lib` at the repo root, which only tests the root package. For aprender, the root lib.rs is a stub (added in #19), so ci/test runs 0 tests: all 60 workspace-crate libs are silent.

paiml/.github#29 switches the reusable workflow's primary invocation to `cargo nextest run --workspace --lib $TEST_ARGS`. When that merges, aprender's ci/test will try to compile workspace members that don't build in the sovereign-ci container:

- aprender-gpu (cuBLAS not present)
- aprender-cuda-edge (CUDA toolchain not present)
- aprender-compute (SIGSEGV at test-harness exit; workspace-test handles this crate specially with a grep-based pass check)

This commit pre-stages `test_args` on the sovereign-ci call so ci/test excludes those three crates, matching the workspace-test job's current exclusions. Safe to land BEFORE paiml/.github#29: while the reusable workflow still uses --lib (root only), the --exclude flags are no-ops on a 0-test run.

Refs PMAT-155, PMAT-159, paiml/.github#29, paiml/infra#70.
3635cbf to
ffd7ed5
Compare
paiml/.github#29 merged an opt-in test_workspace input that switches the reusable workflow's test invocation from `cargo nextest run --lib` (root only) to `cargo nextest run --workspace --lib`. Without this, aprender's workspace-member lib tests (the interesting suite) never run in ci/test.

Pair it with the existing exclusions so aprender-gpu/cuda-edge/compute are skipped; they don't build in the sovereign-ci container (cuBLAS / CUDA / SIGSEGV at exit). The workspace-test job below still covers them on GPU-ready hosts.

Refs paiml/infra#33
…d HEAD
Found by the PMAT-159 canary: with the reusable workflow now exercising
`--workspace --lib`, aprender-orchestrate's test_get_git_status_current_repo
ran in the sovereign-ci container and failed because `actions/checkout`
leaves the workspace on a detached HEAD. `git branch --show-current`
emits an empty string in that state, so `status.branch` was "".
`unwrap_or_else("unknown")` never fired because `Ok("")` still maps to
`Some("")`, not None.
Fall back to `git rev-parse --short HEAD` formatted as `HEAD@<sha>` when
the branch lookup yields empty. Keeps the "unknown" sentinel for the
no-git-dir case. Existing test assertions (`!branch.is_empty()`,
`branch != "unknown"`) now hold in both interactive and CI contexts.
Verified locally with and without detached HEAD.
Refs paiml/infra#33
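The fallback described in this commit message can be sketched as follows. This is a minimal illustration, not the code in `local_workspace.rs`: the names `git_stdout`, `resolve_branch`, and `current_branch` are hypothetical, and the decision logic is split out as a pure function so it can be checked without a git checkout.

```rust
use std::process::Command;

/// Run `git` with the given arguments in the current directory and return
/// trimmed stdout, or `None` if git could not be spawned or exited non-zero.
fn git_stdout(args: &[&str]) -> Option<String> {
    let out = Command::new("git").args(args).output().ok()?;
    if !out.status.success() {
        return None;
    }
    Some(String::from_utf8_lossy(&out.stdout).trim().to_string())
}

/// Treat an empty (or whitespace-only) string the same as a missing value.
fn non_empty(s: Option<&str>) -> Option<&str> {
    s.map(str::trim).filter(|s| !s.is_empty())
}

/// Pure decision logic (hypothetical helper):
/// branch name, else `HEAD@<sha>`, else the "unknown" sentinel.
fn resolve_branch(show_current: Option<&str>, short_sha: Option<&str>) -> String {
    match (non_empty(show_current), non_empty(short_sha)) {
        (Some(branch), _) => branch.to_string(),
        // Detached HEAD: the branch lookup is empty but a commit exists.
        (None, Some(sha)) => format!("HEAD@{sha}"),
        // Neither lookup produced anything, e.g. no git directory.
        (None, None) => "unknown".to_string(),
    }
}

fn current_branch() -> String {
    let branch = git_stdout(&["branch", "--show-current"]);
    let sha = git_stdout(&["rev-parse", "--short", "HEAD"]);
    resolve_branch(branch.as_deref(), sha.as_deref())
}

fn main() {
    println!("{}", current_branch());
}
```

Keeping `resolve_branch` free of process spawning means the detached-HEAD case can be asserted directly, independent of how CI checked out the repository.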
3 tasks
noahgift
added a commit
to paiml/.github
that referenced
this pull request
Apr 20, 2026
)

* ci(sovereign): cargo test --workspace --lib (PMAT-159)

  F11 falsifier blind-spot from PMAT-155 investigation (paiml/infra#70): cargo nextest run --lib at repo root only tests the root package, leaving workspace-member libs silent.

  Per-pilot impact (pre-fix):
    copia    - 227 tests (valid, single-crate repo)
    bashrs   - 5 tests (root only; specs/runtime/oracle/wasm silent)
    aprender - 0 tests (root lib.rs is a stub; all 60 workspace crates silent)

  Fix: primary invocation now cargo nextest run --workspace --lib TEST_ARGS. Coverage updated to match. The -p REPO_NAME fallback is retained for harness quirks. Callers that need to skip workspace members (e.g. aprender's GPU crates) pass test_args: --exclude X --exclude Y.

  Blast radius: every repo's ci / test and ci / coverage will start running workspace-member lib tests that were previously silent. May surface real bugs. Recommended canary: merge, watch copia (no workspace effect), bashrs (4 new members surface), aprender (requires test_args exclusions first).

  Refs paiml/infra#70, PMAT-155, PMAT-159.

* ci(sovereign): gate --workspace --lib behind opt-in test_workspace input

  The initial PMAT-159 change force-switched every caller to `--workspace --lib`. That breaks any repo whose workspace members don't build in the sovereign-ci container (e.g. aprender-gpu needs cuBLAS, aprender-cuda-edge needs CUDA).

  Switch to an opt-in `test_workspace` input (default false → current behavior). Callers that want workspace-wide coverage pair it with `test_args` exclusions:

    with:
      test_workspace: true
      test_args: "--exclude aprender-gpu --exclude aprender-cuda-edge"

  Refs paiml/infra#33

* ci(sovereign): bump test/lint/coverage timeout 30→60 min (PMAT-159)

  First aprender#934 canary of `test_workspace: true` timed out at exactly 30:28 on the test job: the container has to compile 60+ crates cold before testing 25k+ lib tests. Default --lib callers finish in under 5 min, so the extra headroom is invisible to them; it only binds for large workspaces.

  Applied to all three 30-min jobs (test, lint, coverage) for consistency; lint also benefits if clippy ever hits a large workspace.

  Refs paiml/infra#33 · paiml/aprender#934
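To make the opt-in gate concrete, here is a small Rust model of how the test command line changes with `test_workspace`. It is only a sketch: the real selection happens inside the reusable workflow, the function name `nextest_argv` is hypothetical, and appending `test_args` in both modes is an assumption based on the pre-staging note earlier in this PR.

```rust
/// Hypothetical model of the reusable workflow's test command selection.
/// `test_workspace` mirrors the opt-in input (default false); `test_args`
/// is the caller-supplied string of extra flags.
fn nextest_argv(test_workspace: bool, test_args: &str) -> Vec<String> {
    let mut argv: Vec<String> = ["cargo", "nextest", "run"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    if test_workspace {
        // Opt-in: test every workspace member, not just the root package.
        argv.push("--workspace".to_string());
    }
    argv.push("--lib".to_string());
    // Assumption: extra args are appended regardless of the mode.
    argv.extend(test_args.split_whitespace().map(str::to_string));
    argv
}

fn main() {
    let excludes = "--exclude aprender-gpu --exclude aprender-cuda-edge";
    // Default callers keep the root-only invocation.
    println!("{}", nextest_argv(false, "").join(" "));
    // Opted-in callers get workspace-wide tests minus the excluded crates.
    println!("{}", nextest_argv(true, excludes).join(" "));
}
```

With the default input the model yields `cargo nextest run --lib`; with the opt-in and the exclusions above it yields `cargo nextest run --workspace --lib --exclude aprender-gpu --exclude aprender-cuda-edge`.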
…4-19 (Refs #934)

Flip enable_sccache: false → true. The "missing wrapper" disablement from 2026-04-19 is stale: paiml/infra#66 shipped the exec-script shim in sovereign-ci:stable the same day, and the fleet default is now true (PMAT-061).

Diagnostic: run 24685501433 (test_workspace: true, sccache: false) hit the 60-min ceiling on both ci/test and ci/coverage. Cold compile of 60+ APR-MONO crates × 3 concurrent jobs × jobserver oversubscription × llvm-cov instrumentation on coverage exceeds 60 min without a warm sccache.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…der (Refs #934)

PMAT-159's test_workspace opt-in duplicates what workspace-test already does (same --workspace --lib + GPU excludes). Running both on the same intel host triples cargo jobserver pressure (cargo #12912) and costs ~30 min of redundant compile per PR.

F11 falsifier gets a per-repo override in infra so aprender's measurement points at `workspace-test` instead of `ci / test`. ci/test here stays as a root-stub gate presence (0 tests, green) so org ruleset contexts remain populated; no behavior regression.

Keeps:
- enable_sccache: true (independent PMAT-061 fix, fleet default)
- aprender-orchestrate detached-HEAD fallback (real bug from actions/checkout)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Contributor
Author
Triaged stale (last touched 2026-04). CI fix from 2026-04: sccache+per-PR target dir replaced its concerns; ci.yml diverged significantly. Closing as superseded.
auto-merge was automatically disabled
May 12, 2026 15:59
Pull request was closed
Summary

PMAT-159 (paiml/infra#33): the reusable sovereign-ci workflow historically ran `cargo nextest run --lib` (root package only). For aprender, where `lib.rs` is a stub, that meant ci/test ran 0 tests while F11 measured compile/cache-fetch latency and called it signal.

paiml/.github#29 added an opt-in `test_workspace: true` input. This PR opts aprender into that flag so ci/test actually exercises workspace-member lib tests (the interesting ~25k-test suite).

Changes

- `.github/workflows/ci.yml`: set `test_workspace: true` + pair with existing `test_args` exclusions (aprender-gpu / cuda-edge / compute don't build in the sovereign-ci container).
- `crates/aprender-orchestrate/src/oracle/local_workspace.rs`: `get_git_status` now falls back to `HEAD@<short-sha>` when `git branch --show-current` returns empty. Surfaced by the first canary run: `actions/checkout` leaves the container workspace on detached HEAD, so `!branch.is_empty()` (and downstream consumers) needed a non-empty fallback.

Five-whys (branch fallback)

1. Why did `test_get_git_status_current_repo` fail? `assertion failed: !status.branch.is_empty()`.
2. Why was `status.branch` empty? `git branch --show-current` emits "" on detached HEAD.
3. Why was HEAD detached? `actions/checkout` default checkout style.
4. Why didn't `unwrap_or_else("unknown")` fire? `Ok("")` maps to `Some("")`, not `None`; the `.unwrap_or_else` only triggers on I/O errors.
5. Why was this not caught earlier? ci/test previously ran `--lib` (root), not `--workspace --lib`.

Test plan

- `cargo test -p aprender-orchestrate --lib oracle::local_workspace` passes
- Verified locally under `git checkout --detach HEAD`
- […] (`TEST_SCOPE: --workspace --lib`)

Refs paiml/infra#33 · paiml/.github#29
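The `unwrap_or_else` step of the five-whys above can be reproduced without git. The snippet below is an illustrative sketch only: `branch_before_fix` is a hypothetical stand-in for the pre-fix shape, and the `io::Result<String>` argument simulates the command output rather than spawning a process.

```rust
use std::io;

/// Hypothetical stand-in for the pre-fix logic: fall back to "unknown"
/// only when the lookup produced no value at all.
fn branch_before_fix(output: io::Result<String>) -> String {
    // `.ok()` turns Ok("") into Some(""), so the closure below is skipped.
    output.ok().unwrap_or_else(|| "unknown".to_string())
}

fn main() {
    // Detached HEAD: the command succeeds but prints nothing.
    let detached: io::Result<String> = Ok(String::new());
    assert_eq!(branch_before_fix(detached), "");

    // I/O failure: only here does the fallback closure run.
    let failed: io::Result<String> =
        Err(io::Error::new(io::ErrorKind::NotFound, "git missing"));
    assert_eq!(branch_before_fix(failed), "unknown");

    // Filtering out empty strings turns Some("") into None, which is the
    // point where a non-empty fallback can be substituted.
    let filtered = Some(String::new()).filter(|b| !b.is_empty());
    assert_eq!(filtered, None);
}
```

The empty-string case is what tripped `!status.branch.is_empty()` in CI; per the change list above, the fix substitutes `HEAD@<short-sha>` at that point rather than reusing the "unknown" sentinel.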