feat: implement release-branch workflow #1076
Replaces the cut-from-main release flow with long-lived release/vX.Y branches carrying rcs and finals. main carries X.Y.0.devN for the next minor. Patches cherry-pick onto the existing release branch. Adds four workflow_dispatch workflows: cut-release-branch, publish-release (was cd.yml), cherry-pick-to-release, publish-dev-from-main. Adds bump_version.py with five PEP 440 transition modes plus unit tests. Prerelease publishing to PyPI is gated on PUBLISH_PRERELEASES (default false). Auth migrates from the mellea-auto-release GitHub App to GITHUB_TOKEN with inline permissions blocks. See RELEASE.md for the full operator-facing flow. Assisted-by: Claude Code Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
|
Stepping back to the original problem we had on the last release, where we had to basically pause new PR merges for a few days while the release was prepared, I think the main thing we want from a release branch is to let contributors keep merging to main while a release is being stabilized. With that in mind, I want to push back a bit on the scope of this PR. I think most of the machinery here is solving an adjacent problem (formal PEP 440 rc/dev/patch lifecycle) rather than the merges-during-stabilization problem we ran into last time. I think we could actually solve this with a simpler flow:
I'd suggest splitting this in two: one PR for the release branch and a separate issue/PR for the PEP 440 flow. I think that would allow for more discussion but also let us close on the merge-during-release problem quicker. |
| ) | ||
|
|
||
| if mode == "rc": | ||
| if current.pre is None or current.pre[0] != "rc": |
Suggestion: when patch > 0, both mode=rc and mode=patch-rc produce the same output (X.Y.Zrc(N+1)). mode=rc quietly succeeds on patch rcs by accident. RELEASE.md's operator table only documents patch-rc for the patch cycle, so someone running rc by muscle memory gets the right answer with no signal they've used the wrong mode.
Consider explicitly rejecting mode=rc when patch != 0:
if patch != 0:
    raise ValueError(
        f"mode=rc is for minor rcs (X.Y.0); got {current}. "
        "Use mode=patch-rc to iterate a patch rc."
    )
The overlap is intentional, I'll let Claude explain:
The two modes have different jobs:
mode=rc is the rc iterator: it requires an existing rc and bumps the rc number. It works the same way regardless of whether it's a minor rc (0.6.0rc1 → 0.6.0rc2) or a patch rc (0.6.1rc1 → 0.6.1rc2).
mode=patch-rc is the transition mode for going from a final to the first rc of a patch cycle (0.6.0 → 0.6.1rc0). It can also iterate patch rcs after that, but its primary purpose is the transition.
The error messages reflect that split: mode=rc rejects non-rcs ("If this is a final, use mode=patch-rc to start a patch cycle"), and mode=patch-rc rejects minor rcs ("Use mode=rc to iterate minor rcs"). An operator running the wrong mode for the wrong context still gets a hard error — the only soft case is "running rc against a patch rc that already exists," which is also rc's job.
Rejecting rc when patch != 0 would break the design: there'd be no single mode that just iterates an rc, and operators would have to mentally branch on whether they're in a minor or patch cycle every time they bump. I'd rather keep rc as the universal iterator.
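To make the split between the two modes concrete, here is a minimal sketch of the behaviour described in this reply. It is not the code in bump_version.py: the `Version` tuple, `parse`, `render`, and `bump` names are hypothetical, and the parser only understands `X.Y.Z` and `X.Y.ZrcN` strings.

```python
import re
from typing import NamedTuple, Optional


class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    rc: Optional[int]  # None means a final release


# Hypothetical, deliberately narrow parser: finals and rcs only.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?$")


def parse(text: str) -> Version:
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    major, minor, patch, rc = match.groups()
    return Version(int(major), int(minor), int(patch), None if rc is None else int(rc))


def render(version: Version) -> str:
    base = f"{version.major}.{version.minor}.{version.patch}"
    return base if version.rc is None else f"{base}rc{version.rc}"


def bump(text: str, mode: str) -> str:
    current = parse(text)
    if mode == "rc":
        # Universal rc iterator: any existing rc, minor or patch.
        if current.rc is None:
            raise ValueError(
                f"mode=rc needs an existing rc; got {text}. "
                "If this is a final, use mode=patch-rc to start a patch cycle."
            )
        return render(current._replace(rc=current.rc + 1))
    if mode == "patch-rc":
        if current.rc is None:
            # Transition: final -> first rc of the next patch.
            return render(current._replace(patch=current.patch + 1, rc=0))
        if current.patch == 0:
            raise ValueError(f"got minor rc {text}. Use mode=rc to iterate minor rcs.")
        # Already in a patch cycle: iterate the patch rc.
        return render(current._replace(rc=current.rc + 1))
    raise ValueError(f"unknown mode: {mode}")
```

Under these assumptions, `bump("0.6.1rc1", "rc")` and `bump("0.6.1rc1", "patch-rc")` both return `0.6.1rc2`, which is the overlap the review comment points at, while a final passed to `rc` or a minor rc passed to `patch-rc` raises.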
| # Pull the generated notes back locally to update the changelog. | ||
| REL_NOTES=$(mktemp) | ||
| gh release view "${TARGET_TAG_NAME}" --json body -q ".body" >> "${REL_NOTES}" |
We should have a discussion of what we want the release notes to encompass. If we cut release tags for rc1, devN, etc... I think these autogenerated release notes will be problematic.
GitHub auto-builds these notes using PRs merged since the last release. I do not know if it accounts for standard semver things. If not, we will either have to do this ourselves or change our release tagging process documented here.
This will be affected by the choices outlined in my refactor ideas below, but if we keep pre-release tags then I would just need to update the changelog workflow to compare to the last final release rather than the most recent one, a trivial change
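As a rough illustration of "compare to the last final release", the sketch below picks the newest final tag older than the one being published and skips rc and dev tags. The `vX.Y.Z` tag naming and the `previous_final_tag` helper are assumptions for illustration, not something taken from the PR.

```python
import re
from typing import Optional, Sequence, Tuple

# Assumption: final tags look like vX.Y.Z; anything with an rc/dev suffix is a prerelease.
_FINAL_TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def final_key(tag: str) -> Optional[Tuple[int, ...]]:
    """Return a sortable (major, minor, patch) key for final tags, else None."""
    match = _FINAL_TAG_RE.match(tag)
    return tuple(int(part) for part in match.groups()) if match else None


def previous_final_tag(tags: Sequence[str], target: str) -> Optional[str]:
    """Newest final tag strictly older than `target`, ignoring prerelease tags."""
    target_key = final_key(target)
    if target_key is None:
        raise ValueError(f"{target!r} is not a final tag")
    best: Optional[Tuple[Tuple[int, ...], str]] = None
    for tag in tags:
        key = final_key(tag)
        if key is None or key >= target_key:
            continue  # prerelease, the target itself, or something newer
        if best is None or key > best[0]:
            best = (key, tag)
    return None if best is None else best[1]
```

The result could then be passed to the notes generator as the comparison base; `gh release create` has a `--notes-start-tag` option that looks suited to this, though whether it fits the workflow here is untested.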
| options: | ||
| - rc | ||
| - final | ||
| - patch-rc | ||
| - patch-final | ||
| - none |
Why would we want to tag an rc0 (or similar) release, or really anything other than the final release for a v0.X.Y?
These are more of a future proofing choice for supporting pre-releases on pypi when/if we decide to add those by enabling PUBLISH_PRERELEASES=true
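For reference, the gate this refers to can be pictured as the small predicate below. It is only a sketch: the function names are hypothetical and the suffix check assumes versions shaped like `X.Y.ZrcN` and `X.Y.Z.devN`, as described in the PR summary.

```python
import re

# Assumption: prereleases are exactly the versions ending in rcN or .devN.
_PRERELEASE_SUFFIX_RE = re.compile(r"(rc\d+|\.dev\d+)$")


def is_prerelease(version: str) -> bool:
    return _PRERELEASE_SUFFIX_RE.search(version) is not None


def should_publish_to_pypi(version: str, publish_prereleases: str = "false") -> bool:
    """Finals always publish; rc/dev versions publish only when the flag is 'true'.

    `publish_prereleases` stands in for the PUBLISH_PRERELEASES repo variable,
    which arrives as a string and defaults to false.
    """
    if not is_prerelease(version):
        return True
    return publish_prereleases.strip().lower() == "true"
```

With the default flag value, `0.6.0rc0` and `0.7.0.dev3` would be tagged but not uploaded, while `0.6.0` would be uploaded either way.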
|
I addressed @planetf1's technical review with a new commit and a response. As for @psschwei and @jakelorocco, your reviews are the type of feedback I wanted to drive the discussion at sync on Monday; in fact, I had meant to open this as a draft to make it clear this was just a proposal. In writing this I focused on making it full-featured so we could remove undesired features afterwards rather than needing to rework new functionality in, so I'm open to removing things to streamline it. As such, I'll have Claude detail some refactoring ideas for discussion based on your comments and my opinions. We can then dig into these ideas both here and on Monday's call:

Refactor ideas for Monday

Four points to discuss. Items 1 and 2 are alternatives: pick at most one. Item 3 is orthogonal to both. Item 4 is the "split into multiple PRs" question. My current lean on each is in italics.

1. Drop speculative prerelease tagging. Currently we create git tags for every rc and dev (
2. Drop prereleases entirely. A more aggressive trim that supersedes #1: stabilization happens in-place on the release branch with no rc cycle, no

3. Drop cherry-pick, switch to PRs targeting the release branch. This is Paul's proposal and is orthogonal to #1 and #2. The current PR's flow is "every change lands on main first; maintainers run a cherry-pick workflow to port selected commits onto the release branch." Paul's alternative: contributors open stabilization PRs directly against

What gets removed if we switch:

Tradeoffs:
I'd lean toward keeping cherry-pick. "Main is the source of truth" matches what most contributors already do, and the cherry-pick machinery is written and validated. The simplification Paul gets is real but not large.

4. Split into two PRs. @psschwei suggested splitting release-branch + cherry-pick into PR1 and the PEP 440 lifecycle into PR2. I'd push back on this. If we decide to keep prerelease versioning, it should land integrated, not split; and if we decide to remove it (per #2 above), there's nothing to split. Splitting creates an awkward intermediate state where the project has half a release model. For illustration, the cleanest seam would be:
The seam isn't clean: both PRs edit

The smaller-review-surface benefit Paul wants from a split, we can also get by trimming features inside this PR after Monday's discussion, without the integration churn. |
The context we provide the models is going to matter here. For example, if you give it this for a prompt, it will give you a response that is much more in favor of splitting and just merging the release-branch part. |
But this goes against the whole point of having a release branch. A release branch's job is to freeze a code shape so you can stabilize it without main's churn leaking in. Cherry-pick inverts that: it makes main the place where fixes are authored, which means fixes get authored against main's current shape, which means main's churn does leak in. |
I did say it was Claude outlining my opinions, but perhaps I should have been clearer.
Honestly, I don't disagree with you, but the current cherry-pick model was what we outlined and decided on in the design call last week, which is why Claude was so insistent on it: it saw it as a design requirement, not a choice made during implementation. I am fully OK with dropping the whole cherry-pick code and just using PRs. It's the easiest update of the items above. |
planetf1
left a comment
Five WARNINGs from a deeper pass focused on retry / idempotency and CI plumbing. None are architectural; all are small fixes. Suggestion blocks attached for each.
| gh release create "${TARGET_TAG_NAME}" \ | ||
| --target "${RELEASE_BRANCH}" \ | ||
| --generate-notes |
RELEASE.md describes bump_type: none as the retry path when a previous run committed the bump but failed before publishing. That works for prereleases, but not for finals. gh release create returns 422 if the release already exists, set -e kicks in, and we never reach the pypi.yml and docs-publish.yml dispatches below.
Guarding the call should be enough:
- gh release create "${TARGET_TAG_NAME}" \
-   --target "${RELEASE_BRANCH}" \
-   --generate-notes
+ if ! gh release view "${TARGET_TAG_NAME}" >/dev/null 2>&1; then
+   gh release create "${TARGET_TAG_NAME}" \
+     --target "${RELEASE_BRANCH}" \
+     --generate-notes
+ fi
The same thinking applies to the changelog sync PR below (git push origin "${SYNC_BRANCH}" on line 107, then the gh pr create after it). Both fail on a second run.
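The guard is an instance of a general check-then-create pattern for retry-safe steps. A minimal sketch follows; it uses Python only to keep the logic testable. `ensure_release` and the injectable `runner` are hypothetical names, and the only real commands involved are the `gh release view` and `gh release create` calls already used in release.sh.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and report its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def ensure_release(tag: str, branch: str, runner: Runner = run) -> bool:
    """Create the release only if it does not exist yet.

    Returns True when a release was created and False when it already existed,
    so a retry can continue to the later dispatch steps either way.
    """
    if runner(["gh", "release", "view", tag]) == 0:
        return False  # already published by an earlier, partially failed run
    code = runner(["gh", "release", "create", tag, "--target", branch, "--generate-notes"])
    if code != 0:
        raise RuntimeError(f"gh release create failed with exit code {code}")
    return True
```

The same shape would apply to the sync branch push and the `gh pr create` call: probe for the existing artifact first, then create it only when the probe fails.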
| To resolve locally: | ||
| 1. Clone the repo (if you are not already local) and check out ${RELEASE_BRANCH}. | ||
| 2. Re-run this script with the same SHAs to reach the same state. | ||
| 3. Resolve the conflicted files, then: | ||
| git add <resolved-files> | ||
| git cherry-pick --continue | ||
| 4. Push to origin (requires push access / bypass rights): | ||
| git push origin ${RELEASE_BRANCH} |
Step 2 ("Re-run this script with the same SHAs") doesn't actually work. When git cherry-pick hits a conflict it leaves the working tree dirty (UU <file>), so the re-run hits the clean-tree check at line 38 and exits straight away.
Reordered with an explicit abort step first:
- To resolve locally:
- 1. Clone the repo (if you are not already local) and check out ${RELEASE_BRANCH}.
- 2. Re-run this script with the same SHAs to reach the same state.
- 3. Resolve the conflicted files, then:
-      git add <resolved-files>
-      git cherry-pick --continue
- 4. Push to origin (requires push access / bypass rights):
-      git push origin ${RELEASE_BRANCH}
+ To resolve locally:
+ 1. Clone the repo (if you are not already local) and check out ${RELEASE_BRANCH}.
+ 2. Abort the in-progress cherry-pick:
+      git cherry-pick --abort
+ 3. Re-run this script with the same SHAs to reach the same conflict.
+ 4. Resolve the conflicted files, then:
+      git add <resolved-files>
+      git cherry-pick --continue
+ 5. Push to origin (requires push access / bypass rights):
+      git push origin ${RELEASE_BRANCH}
An alternative is to detect CHERRY_PICK_HEAD at the top of the script and skip the clean-tree check in that case, but the playbook fix is simpler.
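For comparison, the detection alternative might look roughly like the sketch below. It assumes a plain checkout where `.git` is a directory (linked worktrees and submodules keep this file elsewhere), and both function names are hypothetical.

```python
from pathlib import Path


def cherry_pick_in_progress(repo_root: Path) -> bool:
    """Git writes CHERRY_PICK_HEAD while a cherry-pick is stopped on a conflict."""
    # Assumption: standard layout with .git as a directory under the repo root.
    return (repo_root / ".git" / "CHERRY_PICK_HEAD").is_file()


def may_proceed(repo_root: Path, porcelain_status: str) -> bool:
    """Clean-tree check that tolerates an interrupted cherry-pick.

    `porcelain_status` is the output of `git status --porcelain`; an empty
    string means the working tree is clean.
    """
    if cherry_pick_in_progress(repo_root):
        return True  # dirty tree is expected: conflicts are waiting to be resolved
    return porcelain_status.strip() == ""
```

This keeps the script re-runnable mid-conflict, at the cost of one more state the script has to reason about, which is the extra complexity the playbook fix avoids.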
| if: >- | ||
| steps.latest_check.conclusion == 'skipped' || | ||
| steps.latest_check.outputs.is_latest_final == 'true' |
If latest_check fails (transient gh API hiccup, rate limit, malformed paginated response), the step exits failure, is_latest_final is never written, and this if: falls through to false so the deploy step skips. The job ends red so it's not strictly silent, but the effect is that a flaky API call blocks the production docs deploy entirely.
Failing open is safer for finals — if we can't confirm "this isn't the latest", deploying matches the common case:
- if: >-
-   steps.latest_check.conclusion == 'skipped' ||
-   steps.latest_check.outputs.is_latest_final == 'true'
+ if: >-
+   steps.latest_check.conclusion == 'skipped' ||
+   steps.latest_check.outputs.is_latest_final != 'false'
Plus continue-on-error: true on the latest_check step itself (~ line 345) so its failure doesn't poison the rest of the job. The != 'false' then covers both the success case ('true') and the failure case (empty string).
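The difference between the two conditions is easiest to see as a truth table. The sketch below models both `if:` expressions as plain Python predicates (the function names are made up for illustration) and assumes that an unset step output compares as an empty string, as the comment above describes.

```python
def deploy_original(conclusion: str, is_latest_final: str) -> bool:
    """Current condition: deploy only on an explicit 'true'."""
    return conclusion == "skipped" or is_latest_final == "true"


def deploy_fail_open(conclusion: str, is_latest_final: str) -> bool:
    """Suggested condition: deploy unless the check explicitly said 'false'."""
    return conclusion == "skipped" or is_latest_final != "false"


# (conclusion, is_latest_final output) for each case discussed in the comment.
CASES = {
    "check skipped": ("skipped", ""),
    "check says latest": ("success", "true"),
    "check says not latest": ("success", "false"),
    # With continue-on-error: true, a failed step reports conclusion 'success'
    # and never writes the output, so it reads as an empty string.
    "check failed": ("success", ""),
}

if __name__ == "__main__":
    for name, (conclusion, output) in CASES.items():
        print(name, deploy_original(conclusion, output), deploy_fail_open(conclusion, output))
```

The two predicates agree on every row except "check failed", where only the fail-open form deploys.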
| git tag "${TARGET_TAG_NAME}" | ||
| git push origin "${TARGET_TAG_NAME}" | ||
| gh workflow run pypi.yml --ref "${TARGET_TAG_NAME}" |
Same idempotency gap as the finals path above, but for prereleases under bump_type: none. The existing-tag guard in bump_version.py:196 only runs when there's an actual bump — publish-release.yml:126 skips it for bump_type=none. So on retry, git tag here errors out because the tag is already there, set -e kicks in, and pypi.yml never gets dispatched.
Mirror the guard:
- git tag "${TARGET_TAG_NAME}"
- git push origin "${TARGET_TAG_NAME}"
- gh workflow run pypi.yml --ref "${TARGET_TAG_NAME}"
+ if ! git rev-parse "${TARGET_TAG_NAME}" >/dev/null 2>&1; then
+   git tag "${TARGET_TAG_NAME}"
+   git push origin "${TARGET_TAG_NAME}"
+ fi
+ gh workflow run pypi.yml --ref "${TARGET_TAG_NAME}"
| --title "docs: sync changelog for ${TARGET_TAG_NAME}" \ | ||
| --body "Automated changelog sync from \`${RELEASE_BRANCH}\` after publishing [${TARGET_TAG_NAME}](${RELEASE_URL}). | ||
|
|
||
| This PR brings the release-branch CHANGELOG entry back to main so the project root CHANGELOG remains the canonical history across all branches." |
gh pr create here authenticates with GH_TOKEN, and per GitHub's anti-loop rule, a PR opened by GITHUB_TOKEN doesn't fire pull_request: events on other workflows. CI on the sync PR never starts, so if main requires status checks the PR can't merge until someone manually dispatches CI or pushes an empty commit.
cherry-pick-to-release.yml already works around the same limitation by dispatching ci.yml explicitly. Same pattern fits here — append:
- This PR brings the release-branch CHANGELOG entry back to main so the project root CHANGELOG remains the canonical history across all branches."
+ This PR brings the release-branch CHANGELOG entry back to main so the project root CHANGELOG remains the canonical history across all branches."
+ # GITHUB_TOKEN-authored pull_request events don't trigger workflows; dispatch CI explicitly.
+ gh workflow run ci.yml --ref "${SYNC_BRANCH}"
planetf1
left a comment
Following up on the inline WARNINGs review with one more SUGGESTION on workflow concurrency. Escalating to REQUEST_CHANGES because W1, W6, and W7 either break or weaken documented behaviour (retry path for finals, prerelease retry, and CI on the sync PR) — would prefer to see those addressed (or explicitly deferred to a follow-up issue) before merge. Architecture and direction look good; this is about the rough edges around recovery and CI plumbing.
| contents: write # push main + new release branch | ||
| pull-requests: write # release.sh may open changelog sync PR if rc0 is published (flag on) | ||
| actions: write # release.sh dispatches pypi.yml after tagging rc0 | ||
| runs-on: ubuntu-latest |
publish-release.yml:107 and publish-dev-from-main.yml both declare concurrency: release. The two write-path workflows that don't (this one and cherry-pick-to-release.yml) can run alongside any of the others.
The scenario that actually bites: a cherry-pick lands in the middle of a publish-release run on the same release branch. The bump commit gets pushed, then a cherry-pick lands, then git tag in release.sh points at the cherry-picked commit instead of the bump (or fails non-fast-forward and aborts halfway).
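With the default cancel-in-progress: false, a new run waits for the in-progress run instead of cancelling it, so adding the group doesn't interrupt running work; it just serializes the four release-track workflows. One caveat: GitHub keeps only one pending run per group, so a newer queued run replaces an older pending one. The change: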
- runs-on: ubuntu-latest
+ runs-on: ubuntu-latest
+ concurrency: release
| permissions: | ||
| contents: write # push cherry-picks to release branch | ||
| actions: write # dispatch ci.yml after push | ||
| runs-on: ubuntu-latest |
Pair with the suggestion on cut-release-branch.yml. Adding the same concurrency: release group on this job means a cherry-pick can't push to a release branch while publish-release or publish-dev-from-main is mid-run on it, which is the case that risks tagging the wrong commit.
- runs-on: ubuntu-latest
+ runs-on: ubuntu-latest
+ concurrency: release
Misc PR
Type of PR
Description
Replaces the cut-from-main release flow with a release-branch model. Every minor release gets a long-lived release/vX.Y branch carrying rcs and the final; main carries X.Y.0.devN for the next minor. Patches cherry-pick onto the existing release branch and go through their own rc cycle. See RELEASE.md for the full operator-facing documentation.
Adds four workflow_dispatch workflows (cut-release-branch, publish-release (renamed from cd.yml), cherry-pick-to-release, publish-dev-from-main), a bump_version.py helper with five PEP 440 transition modes, and a PUBLISH_PRERELEASES repo variable that gates PyPI uploads for rc/dev versions. Auth migrates from the mellea-auto-release GitHub App to GITHUB_TOKEN with inline permissions: blocks.
Admin actions required after merge are listed at the bottom of RELEASE.md.
Testing
End-to-end dry-run validated on ajbozarth/mellea fork: cut-release, cherry-pick, publish-dev-from-main, rc, final, and the explicit downstream-workflow dispatches (pypi.yml, docs-publish.yml, ci.yml) that work around GitHub's anti-loop rule for GITHUB_TOKEN-authored events.
Attribution