ci(release): swap NPM_TOKEN for npm OIDC trusted publisher #2990
elrrrrrrr wants to merge 1 commit
The three release workflows now exchange the GitHub Actions OIDC
id-token (granted via `id-token: write`) for a short-lived npm
publish token, replacing the long-lived `NPM_TOKEN` shared secret.
`actions/setup-node`'s `registry-url` writes a token-less `.npmrc`
template (`_authToken=${NODE_AUTH_TOKEN}` expands to empty once the
env var is gone), and the npm CLI from 11.5.1+ picks up the
`ACTIONS_ID_TOKEN_REQUEST_*` env vars and performs the
trusted-publisher exchange. The existing `--provenance` Sigstore
attestation rides on the same OIDC primitive and is preserved on
every publish call.
`pm-release.yml` bumps Node 18.x → 20.x because npm 11.x's
`engines.node` is `^20.17.0 || >=22.9.0` and would refuse install on
Node 18. Every publish job adds an `npm i -g npm@latest` step
because the Node-bundled npm 10.x predates the trusted-publisher
flow.
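
A publish job could guard against an outdated CLI before publishing. The following is a minimal sketch, not part of this diff: `version_ge` is a hypothetical helper, the candidate versions in the loop are illustrative, and only the 11.5.1 floor comes from the text above. In a real job the first argument would be `"$(npm --version)"`.

```shell
#!/bin/bash
# version_ge HAVE WANT: succeeds when dotted version HAVE >= WANT.
# Hypothetical helper; compares major.minor.patch only, no prerelease suffixes.
version_ge() {
  local IFS=.
  local -a have want
  have=($1)   # split "11.5.1" into (11 5 1) on the dots
  want=($2)
  local i h w
  for i in 0 1 2; do
    h=${have[i]:-0}
    w=${want[i]:-0}
    if (( 10#$h > 10#$w )); then return 0; fi
    if (( 10#$h < 10#$w )); then return 1; fi
  done
  return 0    # all three components equal
}

# Illustrative versions only; 11.5.1 is the floor named in the commit message.
for candidate in 10.8.2 11.5.0 11.5.1 11.6.0; do
  if version_ge "$candidate" 11.5.1; then
    echo "npm $candidate: new enough for the trusted-publisher exchange"
  else
    echo "npm $candidate: too old, upgrade before publishing"
  fi
done
```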
`vendor/scripts/npm-binary.sh` and `vendor/scripts/npm-utoo.sh` gain
`set -euo pipefail`; without it, a failed `npm publish` was masked
by the trailing `cat package.json` / `rm -rf "$WORK_DIR"` and the
workflow step exited zero. That mode would hide the most likely
first-cut failure of this migration — a mis-registered
trusted-publisher entry on npmjs.com — behind a green CI run.
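
The masking behaviour can be reproduced without npm. In this sketch `false` stands in for a failing `npm publish` and `true` for the trailing `cat` / `rm`; the function names are hypothetical.

```shell
#!/bin/bash
# Old shape: no strict mode, so the script's status is that of its last command.
without_strict() {
  bash -c 'false; true'
}

# New shape: strict mode aborts at the first failing command.
with_strict() {
  bash -c 'set -euo pipefail; false; true'
}

rc=0; without_strict || rc=$?
echo "without strict mode: exit $rc"   # the failure is masked

rc=0; with_strict || rc=$?
echo "with strict mode: exit $rc"      # the failure propagates
```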
Out-of-band one-time setup, not part of this diff: on npmjs.com,
register `utooland/utoo` + each of the three workflow filenames
(`pack-release.yml`, `pm-release.yml`, `utooweb-release.yml`) as
trusted publishers under the `@utoo` org so the scoped packages and
the per-platform `@utoo/utoo-<os>-<cpu>` binaries from the pm matrix
inherit the registration, plus a separate entry for the unscoped
`utoo` entry package. Once the first OIDC publish lands green, the
`NPM_TOKEN` repo secret can be deleted from Settings → Secrets and
variables → Actions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code Review
This pull request updates npm-binary.sh and npm-utoo.sh to include set -euo pipefail, ensuring that failures during the npm publish process are not masked. The reviewer recommends using an EXIT trap to handle the cleanup of temporary directories, as the addition of set -e causes the scripts to terminate immediately on error, potentially leaving orphaned files.
@@ -1,4 +1,6 @@
 #!/bin/bash
+# Without this, a failed npm publish is masked by the trailing cat/rm.
+set -euo pipefail
While set -e correctly prevents the script from masking failures, it also causes the script to exit immediately upon error, skipping the manual cleanup at the end. This leaves the temporary directory $WORK_DIR on the filesystem. Using an EXIT trap is a more robust pattern for ensuring cleanup in shell scripts, especially when using set -e.
Suggested change:
-set -euo pipefail
+set -euo pipefail
+trap 'rm -rf "${WORK_DIR:-}"' EXIT
@@ -1,4 +1,6 @@
 #!/bin/bash
+# Without this, a failed npm publish is masked by the trailing cat/rm.
+set -euo pipefail
The addition of set -e means that any failure (e.g., during npm publish) will terminate the script before it reaches the cleanup step. To ensure the temporary working directory is always removed, consider using an EXIT trap instead of relying on a manual rm at the end of the script. Note that ${WORK_DIR:-} is used to safely handle the variable if the script exits before it is defined.
Suggested change:
-set -euo pipefail
+set -euo pipefail
+trap 'rm -rf "${WORK_DIR:-}"' EXIT
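
The reviewer's pattern can be checked in isolation. This is a minimal sketch under stated assumptions: `run_with_trap` is a hypothetical wrapper, `false` stands in for a failing `npm publish`, and the inner script is not the actual content of either vendor script.

```shell
#!/bin/bash
# Runs a strict-mode child script that creates a temp dir, fails, and relies
# on the EXIT trap for cleanup. Prints the temp dir path on stdout.
run_with_trap() {
  bash -s <<'EOF'
set -euo pipefail
trap 'rm -rf "${WORK_DIR:-}"' EXIT   # fires on success and on failure
WORK_DIR="$(mktemp -d)"
echo "$WORK_DIR"
false                                # stand-in for a failing npm publish
echo "not reached"
EOF
}

status=0
dir="$(run_with_trap)" || status=$?
echo "child exit status: $status"
if [ -e "$dir" ]; then
  echo "temp dir left behind: $dir"
else
  echo "temp dir was cleaned up by the trap"
fi
```

The child still exits non-zero, so the workflow step goes red, while the trap removes the working directory that the manual `rm` at the end of the script would no longer reach.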
Summary
- The three release workflows (`pack-release.yml`, `pm-release.yml`, `utooweb-release.yml`) move from the long-lived `NPM_TOKEN` GitHub Actions secret to npm's Trusted Publishers OIDC flow. Each publish job grants `permissions: id-token: write`, lets `actions/setup-node`'s `registry-url` write the token-less `~/.npmrc` template (the `_authToken=${NODE_AUTH_TOKEN}` line expands to empty when the env var is gone, which is the signal npm CLI ≥ 11.5.1 uses to fall through to the trusted-publisher exchange against `https://registry.npmjs.org/-/npm/v1/oidc/token` using the GHA-injected `ACTIONS_ID_TOKEN_REQUEST_*` env vars), and drops the `NODE_AUTH_TOKEN` / `NPM_TOKEN` env entries plus `pack-release` / `utooweb-release`'s legacy `echo "//registry.npmjs.org/:_authToken=$NPM_TOKEN" >> ~/.npmrc` line. The existing Sigstore `--provenance` attestation rides the same OIDC primitive (it's what motivated the `id-token: write` permission that `pm-release.yml` already carried) and is now made explicit on the publish calls in the workflow YAML for `pack` and `utooweb`. Because the Node-bundled npm is 10.x and the trusted-publisher CLI handshake didn't ship until npm 11.5.1 (Aug 2025), every publish job inserts an `npm install --global npm@latest` step before invoking publish.
- `pm-release.yml`'s two `actions/setup-node@v3` invocations bump `node-version` from `"18.x"` to `"20.x"`. npm 11.x's `engines.node` field is `^20.17.0 || >=22.9.0` and refuses installation on Node 18, so the npm-CLI upgrade can't happen without the runner Node bump. The `actions/setup-node@v3` and `actions/checkout@v3` major versions are deliberately left at v3: those are at the runner-side Node-16-EOL deprecation-warning state, which is independent of the OIDC switch and belongs in a separate "ci: bump action majors across the release workflows" PR. The other two workflows (`pack-release.yml`, `utooweb-release.yml`) were already on Node 20 with `actions/setup-node@v4`, so no Node-version change is needed there.
- `vendor/scripts/npm-binary.sh` and `vendor/scripts/npm-utoo.sh` (the helpers that `pm-release.yml` invokes: the first once per matrix leg for the `@utoo/utoo-<os>-<cpu>` platform-binary packages, the second once for the unscoped `utoo` entry package) gain `set -euo pipefail` right under the shebang. Without it, a failing `npm publish` was being masked: the script's trailing `cat package.json` and `rm -rf "$WORK_DIR"` return zero, so the script's overall exit code was always the final `rm`'s zero regardless of whether the publish succeeded or 401'd, and the calling workflow step looked green. That is the exact shape of the first failure mode of the OIDC migration: a misregistered trusted-publisher entry on the npm side would 401 the exchange and the publish would no-op silently. The script hardening therefore lands in the same PR as the auth swap, so that the first OIDC release goes red instead of green-no-op on a misconfiguration. This is the script-side correctness companion to the YAML auth swap.

Test plan
End-to-end verification of the OIDC flow requires a release-typed GitHub event because each publish job has `if: github.event_name == 'release'` and a tag-prefix guard. The cheap form is to cut a prerelease GitHub release whose version label routes the publish to a non-`latest` npm dist-tag, so the canary never disturbs the `latest` channel. The three release workflows are independent (different tag prefixes, different package sets), so each needs its own probe. None of the existing per-PR CI matrices (`pack-ci`, `pm-ci`, `pm-e2e-bench`, the format/lint workflows, etc.) exercise these `release.yml` files on a non-release push, so the regular PR CI is silent on the OIDC plumbing; it only validates the rest of the repo, which this PR leaves unchanged.

- Register the trusted publishers at the `@utoo` npm-organization level (Org → Settings → "Trusted Publishers" → Add publisher → GitHub Actions; fields: Organization=`utooland`, Repository=`utoo`, Workflow filename=one of `pack-release.yml` / `pm-release.yml` / `utooweb-release.yml` per the workflow that publishes the corresponding subset of scoped packages, Environment=blank, since none of the three workflows' publish jobs in this diff sets an `environment:` key). Org-level registration covers all scoped `@utoo/*` packages in one shot: `@utoo/pack`, `@utoo/pack-shared`, `@utoo/pack-cli`, `@utoo/web`, and the five-platform set `@utoo/utoo-<os>-<cpu>` that the pm matrix publishes via `vendor/scripts/npm-binary.sh`. The unscoped `utoo` entry package lives outside the `@utoo` scope, so it needs a separate per-package Trusted Publisher entry on its own npmjs.com package-settings page, with the same field shape and the workflow filename set to `pm-release.yml` (the `publish-main` job that invokes `vendor/scripts/npm-utoo.sh`).
- `utoopack-release.yml`: cut a prerelease GitHub release tagged `utoopack-vX.Y.Z-oidcprobe.0`, where X.Y.Z is the current published `@utoo/pack-shared` version (so the in-workflow `npm version` step's bumped value doesn't collide with an existing version on the registry). Expect the workflow run to be green, the new prerelease versions of `@utoo/pack`, `@utoo/pack-shared`, and `@utoo/pack-cli` to appear on each package's npmjs.com page within a minute or two of the workflow finishing, and each package page to show the green "Provenance" attestation badge from Sigstore (the `--provenance` flag the publish commands now carry). The workflow derives the npm dist-tag from the chunk between the first `-` and the next `.` in the version string (`utoopack-release.yml`'s tag-derivation block), so the publish lands on the `oidcprobe` dist-tag and `latest` is untouched; clean it up afterwards with `npm dist-tag rm @utoo/pack oidcprobe` etc.
- `utooweb-release.yml` with the parallel shape: a prerelease tag `utooweb-vX.Y.Z-oidcprobe.0` against the current `@utoo/web` version. Expect a green workflow, the `@utoo/web` prerelease on npmjs.com with Provenance, dist-tag `oidcprobe`, and the same `npm dist-tag rm` cleanup afterwards.
- `utoopm-release.yml` with a prerelease tag `utoo-vX.Y.Z-oidcprobe.0`. Mark the GitHub release as a prerelease in the GitHub web UI (`Set as a pre-release` checkbox on the release-creation page). The workflow's `publish-main` job branches on the boolean `github.event.release.prerelease`: `true` routes the publish to the `beta` dist-tag, `false` routes it to `latest`. So the probe goes to `beta` and the stable `latest` channel is undisturbed. The matrix `build-and-publish-binary` job runs once per `(os, cpu)` matrix entry and publishes the corresponding `@utoo/utoo-<os>-<cpu>` platform-binary package via `npm-binary.sh`. After the workflow run goes green, check that every platform's package page on npmjs.com (`@utoo/utoo-darwin-arm64`, `@utoo/utoo-darwin-x64`, `@utoo/utoo-linux-x64`, `@utoo/utoo-linux-arm64-gnu`, `@utoo/utoo-win32-x64-msvc`; exact name set per the matrix in `pm-release.yml:13-43`) shows the new beta version. Then `npm view utoo@beta` shows the new entry-package version, and a fresh `npm install -g utoo@beta` on a clean box runs the entry package's postinstall (the `postinstall.utoo.sh.template`-rendered script from `vendor/scripts/npm-utoo.sh`), which fetches the right platform's matrix-published sibling binary into the `bin/utoo` / `bin/ut` symlink targets. End state on the canary box: `which utoo`, `which ut`, `utoo --version`, and `ut --version` all work and report the canary version, and the binary's `file` output matches the runtime's host triple. The `set -euo pipefail` line added in `vendor/scripts/npm-binary.sh` in this PR is specifically the safety net for the matrix-leg case: without it, a missing trusted-publisher entry on any individual `@utoo/utoo-<os>-<cpu>` registration on the npm side would have been a silent green run on the corresponding matrix leg, with that one platform's package absent from the registry but no CI signal saying so. The user would discover it later via a postinstall failure on the host of the missing platform binary during a real `npm i -g utoo` somewhere.
- After the first stable OIDC release is verified, delete the `NPM_TOKEN` repository secret at the repo's Settings → Secrets and variables → Actions, and on the npm-account side, revoke the corresponding "Publish"-scope automation/granular access token that originally generated the value of `NPM_TOKEN`. The diff in this PR no longer references the secret in any of the workflow files (all `NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}` and `NPM_TOKEN: ${{ secrets.NPM_TOKEN }}` lines are gone, and the manual `_authToken=$NPM_TOKEN` echoes in `pack-release` and `utooweb-release` are gone), so the secret is dead state once the first stable OIDC release is verified; keeping a stale long-lived publish credential around defeats the security posture that is the whole point of this PR. Order matters: the dead-state secret is the rollback escape hatch until the stable release is green; before then, a `git revert` of the merge commit lands the workflows back on the token-auth code path that the secret is still wired for. After the stable release goes green, the rollback path is "manually re-add the secret if needed" instead, which is acceptable post-canary-burn-in.

🤖 Generated with Claude Code
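
The dist-tag rule described in the test plan (the chunk between the first `-` and the next `.` of the version string) can be sketched with shell parameter expansion. This is an illustration of the rule as stated, not the workflow's actual tag-derivation block; `dist_tag_for` is a hypothetical name, the version numbers are placeholders, and the `latest` fallback for versions without a prerelease suffix is an assumption.

```shell
#!/bin/bash
# dist_tag_for VERSION: hypothetical helper that maps a version string to an
# npm dist-tag following the rule described in the test plan.
dist_tag_for() {
  local version="$1"
  local pre
  case "$version" in
    *-*)
      pre="${version#*-}"   # drop everything through the first '-'
      echo "${pre%%.*}"     # keep what precedes the next '.'
      ;;
    *)
      echo "latest"         # assumption: stable versions go to latest
      ;;
  esac
}

# Placeholder versions for illustration.
dist_tag_for "1.2.3-oidcprobe.0"   # prints "oidcprobe"
dist_tag_for "1.2.3"               # prints "latest"
```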