
ci(release): swap NPM_TOKEN for npm OIDC trusted publisher#2990

Open
elrrrrrrr wants to merge 1 commit into next from ci/npm-publish-oidc


@elrrrrrrr

Summary

  • The three release workflows (pack-release.yml, pm-release.yml, utooweb-release.yml) move from the long-lived NPM_TOKEN GitHub Actions secret to npm's Trusted Publishers OIDC flow. Each publish job grants permissions: id-token: write and lets actions/setup-node's registry-url write the token-less ~/.npmrc template. With the env var gone, the _authToken=${NODE_AUTH_TOKEN} line expands to empty, which is the signal npm CLI ≥ 11.5.1 uses to fall through to the trusted-publisher exchange against https://registry.npmjs.org/-/npm/v1/oidc/token using the GHA-injected ACTIONS_ID_TOKEN_REQUEST_* env vars. The jobs drop the NODE_AUTH_TOKEN / NPM_TOKEN env entries, and pack-release.yml and utooweb-release.yml also drop the legacy echo "//registry.npmjs.org/:_authToken=$NPM_TOKEN" >> ~/.npmrc line. The existing Sigstore --provenance attestation rides the same OIDC primitive (it is what motivated the id-token: write permission that pm-release.yml already carried) and is now explicit on the publish calls in the workflow YAML for pack and utooweb. Because the Node-bundled npm is 10.x and the trusted-publisher CLI handshake didn't ship until npm 11.5.1 (Aug 2025), every publish job inserts an npm install --global npm@latest step before invoking publish.
  • pm-release.yml's two actions/setup-node@v3 invocations bump node-version from "18.x" to "20.x". npm 11.x's engines.node field is ^20.17.0 || >=22.9.0 and refuses installation on Node 18, so the npm CLI upgrade can't happen without the runner Node bump. The actions/setup-node@v3 and actions/checkout@v3 major versions are deliberately left at v3: they are only at the stage of the runner-side Node 16 EOL deprecation warning, which is independent of the OIDC switch and belongs in a separate "ci: bump action majors across the release workflows" PR. The other two workflows (pack-release.yml, utooweb-release.yml) were already on Node 20 with actions/setup-node@v4, so no Node-version change is needed there.
  • vendor/scripts/npm-binary.sh and vendor/scripts/npm-utoo.sh gain set -euo pipefail right under the shebang. These are the helpers that pm-release.yml invokes: the first once per matrix leg for the @utoo/utoo-<os>-<cpu> platform-binary packages, the second once for the unscoped utoo entry package. Without the strict-mode line, a failing npm publish was masked: the script's trailing cat package.json and rm -rf "$WORK_DIR" return zero, so the script's exit code was always the final rm's zero whether the publish succeeded or 401'd, and the calling workflow step looked green. That is exactly the shape of the most likely first failure of the OIDC migration: a misregistered trusted-publisher entry on the npm side would 401 the exchange and the publish would silently no-op. The script hardening therefore lands in the same PR as the auth swap, so that the first OIDC release goes red on a misconfiguration instead of green with nothing published.
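
The masking described in the last bullet can be reproduced without touching npm. The following is a minimal sketch, not the actual helper scripts: `fake_publish` is a hypothetical stand-in for a failing `npm publish`, and `run_script` is an illustrative helper that runs each script body in a fresh bash process so that `set -e` behaves as it would in the real script.

```shell
#!/bin/bash
# Run a script body in a fresh bash process and print its exit status.
# A separate process is used on purpose: `set -e` is ignored inside a
# function that is called on the left side of `||`.
run_script() {
  local status=0
  bash -c "$1" || status=$?
  echo "$status"
}

# Old shape: failing publish followed by a trailing command that succeeds.
# fake_publish is hypothetical; `true` stands in for the trailing cat / rm -rf.
old_shape='fake_publish() { return 1; }
fake_publish
true'

# New shape: the same body with the strict-mode line on top.
new_shape="set -euo pipefail
$old_shape"

echo "old shape exit status: $(run_script "$old_shape")"   # 0, failure masked
echo "new shape exit status: $(run_script "$new_shape")"   # 1, failure surfaced
```

The exit status of the old shape is that of its last command, which is why the workflow step stayed green; with strict mode the process stops at the failing command and reports its status.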

Test plan

End-to-end verification of the OIDC flow requires a release-typed GitHub event because each publish job has if: github.event_name == 'release' and a tag-prefix guard. The cheap form is to cut a prerelease GitHub release whose version label routes the publish to a non-latest npm dist-tag, so the canary never disturbs the latest channel. The three release workflows are independent (different tag prefixes, different package sets), so each needs its own probe. None of the existing per-PR CI matrices (pack-ci, pm-ci, pm-e2e-bench, the format/lint workflows, etc.) exercise these release.yml files on a non-release push, so the regular PR CI is silent on the OIDC plumbing — it only validates the unchanged-by-this-PR rest of the repo.

  • Register the trusted-publisher entries on npmjs.com before this merges. The cheapest form is one registration at the @utoo npm-organization level (Org → Settings → "Trusted Publishers" → Add publisher → GitHub Actions; fields: Organization=utooland, Repository=utoo, Workflow filename=one of pack-release.yml / pm-release.yml / utooweb-release.yml, matching the workflow that publishes the corresponding subset of scoped packages, Environment=blank, since none of the three workflows' publish jobs in this diff sets an environment: key). Org-level registration covers all scoped @utoo/* packages in one shot: @utoo/pack, @utoo/pack-shared, @utoo/pack-cli, @utoo/web, and the five-platform set @utoo/utoo-<os>-<cpu> that the pm matrix publishes via vendor/scripts/npm-binary.sh. The unscoped utoo entry package lives outside the @utoo scope, so it needs a separate per-package Trusted Publisher entry on its own npmjs.com package-settings page, with the same field shape and the workflow filename set to pm-release.yml (the publish-main job that invokes vendor/scripts/npm-utoo.sh).
  • Probe pack-release.yml: cut a prerelease GitHub release tagged utoopack-vX.Y.Z-oidcprobe.0 where X.Y.Z is the current published @utoo/pack-shared version (so the in-workflow npm version step's bumped value doesn't collide with an existing version on the registry). Expect the workflow run to be green, the new prerelease versions of @utoo/pack, @utoo/pack-shared, and @utoo/pack-cli to appear on each package's npmjs.com page within a minute or two of the workflow finishing, and each package page to show the green "Provenance" attestation badge from Sigstore (the --provenance flag the publish commands now carry). The workflow derives the npm dist-tag from the chunk between the first - and the next . in the version string (pack-release.yml's tag-derivation block), so the publish lands on the oidcprobe dist-tag and latest is untouched; clean it up afterwards with npm dist-tag rm @utoo/pack oidcprobe, and likewise for the other two packages.
  • Probe utooweb-release.yml with the parallel shape: a prerelease tag utooweb-vX.Y.Z-oidcprobe.0 against the current @utoo/web version. Expect green workflow, the @utoo/web prerelease on npmjs.com with Provenance, dist-tag oidcprobe, same npm dist-tag rm cleanup afterwards.
  • Probe pm-release.yml with a prerelease tag utoo-vX.Y.Z-oidcprobe.0. Mark the GitHub release as a prerelease in the GitHub web UI (the "Set as a pre-release" checkbox on the release-creation page): the workflow's publish-main job branches on the boolean github.event.release.prerelease, where true routes the publish to the beta dist-tag and false routes it to latest. The probe therefore goes to beta and the stable latest channel is undisturbed. The matrix build-and-publish-binary job runs once per (os, cpu) matrix entry and publishes the corresponding @utoo/utoo-<os>-<cpu> platform-binary package via npm-binary.sh. After the workflow run goes green, check that every platform's package page on npmjs.com (@utoo/utoo-darwin-arm64, @utoo/utoo-darwin-x64, @utoo/utoo-linux-x64, @utoo/utoo-linux-arm64-gnu, @utoo/utoo-win32-x64-msvc; exact name set per the matrix in pm-release.yml:13-43) shows the new beta version. Then confirm that npm view utoo@beta shows the new entry-package version, and that a fresh npm install -g utoo@beta on a clean box runs the entry package's postinstall (the postinstall.utoo.sh.template-rendered script from vendor/scripts/npm-utoo.sh), which fetches the right platform's matrix-published sibling binary into the bin/utoo/bin/ut symlink targets. End state on the canary box: which utoo, which ut, utoo --version, and ut --version all work and report the canary version, and the binary's file output matches the host triple. The set -euo pipefail line added to vendor/scripts/npm-binary.sh in this PR is specifically the safety net for the matrix-leg case. Without it, a missing trusted-publisher registration for any individual @utoo/utoo-<os>-<cpu> package on the npm side would have produced a silent green run on the corresponding matrix leg, with that platform's package absent from the registry and no CI signal saying so; a user would discover it later through a postinstall failure on that platform during a real npm i -g utoo.
  • After the first stable (non-prerelease) release publishes green via OIDC, delete the NPM_TOKEN repository secret at the repo's Settings → Secrets and variables → Actions, and on the npm-account side, revoke the corresponding "Publish"-scope automation/granular access token that originally generated the value of NPM_TOKEN. The diff in this PR no longer references the secret in any of the workflow files (all NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }} and NPM_TOKEN: ${{ secrets.NPM_TOKEN }} lines are gone, as are the manual _authToken=$NPM_TOKEN echoes in pack-release and utooweb-release), so the secret is dead state once the first stable OIDC release is verified. Keeping a stale long-lived publish credential around would defeat the security posture this PR exists to achieve. Order matters: the secret is the rollback escape hatch until the stable release is green, because until then a git revert of the merge commit lands the workflows back on the token-auth code path that the secret is still wired for. After the stable release goes green, the rollback path becomes "manually re-add the secret if needed", which is acceptable after the canary burn-in.
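
The two dist-tag routing rules the probes rely on can be sketched in a few lines of bash. This is an illustration of the rules as described above, not the workflows' actual tag-derivation code: the function names `derive_dist_tag` and `pm_dist_tag` are hypothetical, the input is assumed to be the bare version with the tag prefix already stripped, and falling back to `latest` for a version without a `-` is an assumption.

```shell
#!/bin/bash
# pack / utooweb rule: the dist-tag is the chunk between the first '-'
# and the next '.' in the version string.
derive_dist_tag() {
  local version="$1"
  case "$version" in
    *-*) ;;                          # has a prerelease part, keep going
    *) echo "latest"; return ;;      # assumption: stable versions go to latest
  esac
  local rest="${version#*-}"         # drop everything through the first '-'
  echo "${rest%%.*}"                 # keep everything before the next '.'
}

# pm rule: branch on the release's prerelease boolean.
pm_dist_tag() {
  if [ "$1" = "true" ]; then echo "beta"; else echo "latest"; fi
}

derive_dist_tag "1.2.3-oidcprobe.0"   # prints "oidcprobe"
pm_dist_tag "true"                    # prints "beta"
```

With these rules, an `-oidcprobe.0` suffix keeps the pack and utooweb probes off `latest`, while the pm probe depends on the pre-release checkbox rather than on the version string.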

🤖 Generated with Claude Code

The three release workflows now exchange the GitHub Actions OIDC
id-token (granted via `id-token: write`) for a short-lived npm
publish token, replacing the long-lived `NPM_TOKEN` shared secret.
`actions/setup-node`'s `registry-url` writes a token-less `.npmrc`
template (`_authToken=${NODE_AUTH_TOKEN}` expands to empty once the
env var is gone), and the npm CLI from 11.5.1+ picks up the
`ACTIONS_ID_TOKEN_REQUEST_*` env vars and performs the
trusted-publisher exchange. The existing `--provenance` Sigstore
attestation rides on the same OIDC primitive and is preserved on
every publish call.

`pm-release.yml` bumps Node 18.x → 20.x because npm 11.x's
`engines.node` is `^20.17.0 || >=22.9.0` and would refuse install on
Node 18. Every publish job adds an `npm i -g npm@latest` step
because the Node-bundled npm 10.x predates the trusted-publisher
flow.

`vendor/scripts/npm-binary.sh` and `vendor/scripts/npm-utoo.sh` gain
`set -euo pipefail`; without it, a failed `npm publish` was masked
by the trailing `cat package.json` / `rm -rf "$WORK_DIR"` and the
workflow step exited zero. That mode would hide the most likely
first-cut failure of this migration — a mis-registered
trusted-publisher entry on npmjs.com — behind a green CI run.

Out-of-band one-time setup, not part of this diff: on npmjs.com,
register `utooland/utoo` + each of the three workflow filenames
(`pack-release.yml`, `pm-release.yml`, `utooweb-release.yml`) as
trusted publishers under the `@utoo` org so the scoped packages and
the per-platform `@utoo/utoo-<os>-<cpu>` binaries from the pm matrix
inherit the registration, plus a separate entry for the unscoped
`utoo` entry package. Once the first OIDC publish lands green, the
`NPM_TOKEN` repo secret can be deleted from Settings → Secrets and
variables → Actions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates npm-binary.sh and npm-utoo.sh to include set -euo pipefail, ensuring that failures during the npm publish process are not masked. The reviewer recommends using an EXIT trap to handle the cleanup of temporary directories, as the addition of set -e causes the scripts to terminate immediately on error, potentially leaving orphaned files.

@@ -1,4 +1,6 @@
#!/bin/bash
# Without this, a failed npm publish is masked by the trailing cat/rm.
set -euo pipefail

medium

While set -e correctly prevents the script from masking failures, it also causes the script to exit immediately upon error, skipping the manual cleanup at the end. This leaves the temporary directory $WORK_DIR on the filesystem. Using an EXIT trap is a more robust pattern for ensuring cleanup in shell scripts, especially when using set -e.

Suggested change
-set -euo pipefail
+set -euo pipefail
+trap 'rm -rf "${WORK_DIR:-}"' EXIT

@@ -1,4 +1,6 @@
#!/bin/bash
# Without this, a failed npm publish is masked by the trailing cat/rm.
set -euo pipefail

medium

The addition of set -e means that any failure (e.g., during npm publish) will terminate the script before it reaches the cleanup step. To ensure the temporary working directory is always removed, consider using an EXIT trap instead of relying on a manual rm at the end of the script. Note that ${WORK_DIR:-} is used to safely handle the variable if the script exits before it is defined.

Suggested change
-set -euo pipefail
+set -euo pipefail
+trap 'rm -rf "${WORK_DIR:-}"' EXIT
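
The effect of the suggested trap can be checked in isolation. The following is a rough sketch under stated assumptions, not the real helper scripts: `false` stands in for a failing `npm publish`, and `leaves_workdir_behind` is a hypothetical helper that runs a script body in a fresh bash process and reports whether the temporary directory it created survived.

```shell
#!/bin/bash
# Run a failing script body in a fresh bash process; print "yes" if the
# temp directory it created still exists afterwards, "no" otherwise.
leaves_workdir_behind() {
  local marker dir
  marker="$(mktemp)"                 # the child records its WORK_DIR path here
  bash -c "$1" _ "$marker" || true   # the body is expected to fail
  dir="$(cat "$marker")"
  rm -f "$marker"
  if [ -d "$dir" ]; then
    rm -rf "$dir"                    # tidy up after the demonstration
    echo "yes"
  else
    echo "no"
  fi
}

# Strict mode with manual cleanup at the end: the rm is never reached.
without_trap=$(cat <<'EOF'
set -euo pipefail
WORK_DIR="$(mktemp -d)"
echo "$WORK_DIR" > "$1"
false
rm -rf "$WORK_DIR"
EOF
)

# Strict mode with the suggested EXIT trap: cleanup runs on any exit.
with_trap=$(cat <<'EOF'
set -euo pipefail
trap 'rm -rf "${WORK_DIR:-}"' EXIT
WORK_DIR="$(mktemp -d)"
echo "$WORK_DIR" > "$1"
false
EOF
)

leaves_workdir_behind "$without_trap"   # prints "yes"
leaves_workdir_behind "$with_trap"      # prints "no"
```

The `${WORK_DIR:-}` default matters because of `set -u`: if the script exits before `WORK_DIR` is assigned, the trap expands it to an empty string instead of failing on an unset variable.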
