feat: implement release-branch workflow by ajbozarth · Pull Request #1076 · generative-computing/mellea

ajbozarth · 2026-05-13T22:37:23Z

Misc PR

Type of PR

Bug Fix
New Feature
Documentation
Other

Description

Link to Issue: Fixes support an actual release branch #1005

Replaces the cut-from-main release flow with a release-branch model. Every minor release gets a long-lived release/vX.Y branch carrying rcs and the final; main carries X.Y.0.devN for the next minor. Patches cherry-pick onto the existing release branch and go through their own rc cycle. See RELEASE.md for the full operator-facing documentation.
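To make the version scheme concrete, here is a minimal sketch of how the versions named above order under PEP 440 (dev releases sort before release candidates, which sort before the final). It is illustrative only and is not taken from bump_version.py: the `sort_key` helper and its regex are hypothetical and handle only the X.Y.Z, X.Y.ZrcN, and X.Y.Z.devN shapes used in this PR.

```python
import re

# Hypothetical helper for illustration; covers only X.Y.Z, X.Y.ZrcN, X.Y.Z.devN.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:rc(\d+)|\.dev(\d+))?$")


def sort_key(version: str) -> tuple:
    """Return a tuple that sorts versions in PEP 440 order for these shapes."""
    m = _VERSION_RE.match(version)
    if m is None:
        raise ValueError(f"unsupported version shape: {version}")
    major, minor, patch = int(m.group(1)), int(m.group(2)), int(m.group(3))
    if m.group(5) is not None:
        phase, n = 0, int(m.group(5))  # .devN sorts before any rc of X.Y.Z
    elif m.group(4) is not None:
        phase, n = 1, int(m.group(4))  # rcN sorts before the final
    else:
        phase, n = 2, 0                # the final itself
    return (major, minor, patch, phase, n)


versions = [
    "0.6.0", "0.7.0.dev0", "0.6.0rc1", "0.6.1rc0",
    "0.6.0.dev3", "0.6.1", "0.6.0rc0",
]
ordered = sorted(versions, key=sort_key)
print(ordered)
```

Read left to right, the sorted list follows one minor through its life: dev builds from main, rcs and the final on the release branch, then a patch rc and patch final, while main has already moved to the next minor's dev series.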

Adds four workflow_dispatch workflows (cut-release-branch, publish-release (renamed from cd.yml), cherry-pick-to-release, publish-dev-from-main), a bump_version.py helper with five PEP 440 transition modes, and a PUBLISH_PRERELEASES repo variable that gates PyPI uploads for rc/dev versions. Auth migrates from the mellea-auto-release GitHub App to GITHUB_TOKEN with inline permissions: blocks.

Admin actions required after merge are listed at the bottom of RELEASE.md.

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code was added
Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

End-to-end dry-run validated on ajbozarth/mellea fork: cut-release, cherry-pick, publish-dev-from-main, rc, final, and the explicit downstream-workflow dispatches (pypi.yml, docs-publish.yml, ci.yml) that work around GitHub's anti-loop rule for GITHUB_TOKEN-authored events.

Attribution

AI coding assistants used

Replaces the cut-from-main release flow with long-lived release/vX.Y branches carrying rcs and finals. main carries X.Y.0.devN for the next minor. Patches cherry-pick onto the existing release branch.

Adds four workflow_dispatch workflows: cut-release-branch, publish-release (was cd.yml), cherry-pick-to-release, publish-dev-from-main. Adds bump_version.py with five PEP 440 transition modes plus unit tests. Prerelease publishing to PyPI is gated on PUBLISH_PRERELEASES (default false). Auth migrates from the mellea-auto-release GitHub App to GITHUB_TOKEN with inline permissions blocks.

See RELEASE.md for the full operator-facing flow.

Assisted-by: Claude Code
Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>

psschwei · 2026-05-14T02:02:34Z

Stepping back to the original problem we had on the last release, where we had to basically pause new PR merges for a few days while the release was prepared, I think the main thing we want from a release branch is to let contributors keep merging to main while a release is being stabilized.

With that in mind, I want to push back a bit on the scope of this PR. I think most of the machinery here is solving an adjacent problem (formal PEP 440 rc/dev/patch lifecycle) rather than the merges-during-stabilization problem we ran into last time.

I think we could actually solve this with a simpler flow:

When a release is ready to stabilize, create release/vX.Y from main using GitHub's UI (Branches → New branch from main)
Stabilization fixes land on release/vX.Y via normal PRs targeting that branch, while regular development keeps happening on main
When ready to publish, dispatch a single "Publish release" workflow against the release branch, with the target version typed in as an input (e.g. 0.6.0, or 0.6.1 for a later patch). The workflow handles the version bump, tag, GitHub Release, PyPI upload, changelog, and changelog-sync PR back to main.

I'd suggest splitting this in two: one PR for the release branch and a separate issue/PR for the PEP 440 flow. I think that would allow for more discussion but also let us close on the merge-during-release problem quicker.

planetf1 · 2026-05-14T11:50:29Z

+        )
+
+    if mode == "rc":
+        if current.pre is None or current.pre[0] != "rc":


Suggestion: when patch > 0, both mode=rc and mode=patch-rc produce the same output (X.Y.Zrc(N+1)). mode=rc quietly succeeds on patch rcs by accident. RELEASE.md's operator table only documents patch-rc for the patch cycle, so someone running rc by muscle memory gets the right answer with no signal they've used the wrong mode.

Consider explicitly rejecting mode=rc when patch != 0:

    if patch != 0:
        raise ValueError(
            f"mode=rc is for minor rcs (X.Y.0); got {current}. "
            "Use mode=patch-rc to iterate a patch rc."
        )

The overlap is intentional; I'll let Claude explain:

The two modes have different jobs:

mode=rc is the rc iterator — requires an existing rc and bumps the rc number. Works the same way regardless of whether it's a minor rc (0.6.0rc1 → 0.6.0rc2) or a patch rc (0.6.1rc1 → 0.6.1rc2).

mode=patch-rc is the transition mode for going from a final to the first rc of a patch cycle (0.6.0 → 0.6.1rc0). It can also iterate patch rcs after that, but its primary purpose is the transition.

The error messages reflect that split: mode=rc rejects non-rcs ("If this is a final, use mode=patch-rc to start a patch cycle"), and mode=patch-rc rejects minor rcs ("Use mode=rc to iterate minor rcs"). An operator running the wrong mode for the wrong context still gets a hard error — the only soft case is "running rc against a patch rc that already exists," which is also rc's job.

Rejecting rc when patch != 0 would break the design: there'd be no single mode that just iterates an rc, and operators would have to mentally branch on whether they're in a minor or patch cycle every time they bump. I'd rather keep rc as the universal iterator.
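The split between the two modes can be sketched as follows. This is a simplified illustration of the behaviour described in this reply, not the actual bump_version.py: the `bump` function, its regex, and the exact error wording are assumptions, and only the rc and patch-rc modes are modelled.

```python
import re

# Hypothetical parser: accepts only X.Y.Z and X.Y.ZrcN for this sketch.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?$")


def bump(current: str, mode: str) -> str:
    """Return the next version string for mode 'rc' or 'patch-rc'."""
    m = _VERSION_RE.match(current)
    if m is None:
        raise ValueError(f"unsupported version for this sketch: {current}")
    major, minor, patch = int(m.group(1)), int(m.group(2)), int(m.group(3))
    rc = int(m.group(4)) if m.group(4) is not None else None

    if mode == "rc":
        # Universal iterator: needs an existing rc, minor or patch alike.
        if rc is None:
            raise ValueError(
                f"mode=rc requires an existing rc; got {current}. "
                "If this is a final, use mode=patch-rc to start a patch cycle."
            )
        return f"{major}.{minor}.{patch}rc{rc + 1}"

    if mode == "patch-rc":
        if rc is None:
            # Transition: final -> first rc of the next patch.
            return f"{major}.{minor}.{patch + 1}rc0"
        if patch == 0:
            raise ValueError(
                f"mode=patch-rc got a minor rc ({current}). "
                "Use mode=rc to iterate minor rcs."
            )
        # Already inside a patch cycle: iterate the patch rc.
        return f"{major}.{minor}.{patch}rc{rc + 1}"

    raise ValueError(f"unknown mode for this sketch: {mode}")


print(bump("0.6.0rc1", "rc"))     # minor rc iteration
print(bump("0.6.1rc1", "rc"))     # patch rc iteration, same mode
print(bump("0.6.0", "patch-rc"))  # final -> first patch rc
```

The only input where both modes succeed with the same result is an existing patch rc, which is the overlap discussed above.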

jakelorocco · 2026-05-14T16:38:40Z

+# Pull the generated notes back locally to update the changelog.
 REL_NOTES=$(mktemp)
 gh release view "${TARGET_TAG_NAME}" --json body -q ".body" >> "${REL_NOTES}"


We should have a discussion of what we want the release notes to encompass. If we cut release tags for rc1, devN, etc... I think these autogenerated release notes will be problematic.

GitHub auto-builds these notes using PRs merged since the last release. I do not know if it accounts for standard semver things. If not, we will either have to do this ourselves or change our release tagging process documented here.

This will be affected by the choices outlined in my refactor ideas below. If we keep prerelease tags, I would just need to update the changelog workflow to compare against the last final release rather than the most recent one, which is a trivial change.
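As a rough sketch of that "last final" lookup: given the repository's tag names, pick the highest final tag below the one being released and ignore rc and dev tags. The `last_final_before` helper is hypothetical, and it assumes tags look like vX.Y.Z, vX.Y.ZrcN, or vX.Y.Z.devN; choosing the highest lower final (rather than, say, the most recently created one) is also an assumption.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches final tags only, e.g. v0.6.0; rc and dev tags do not match.
_FINAL_TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def _final_key(tag: str) -> Optional[Tuple[int, ...]]:
    """Return (major, minor, patch) for a final tag, else None."""
    m = _FINAL_TAG_RE.match(tag)
    return tuple(int(part) for part in m.groups()) if m else None


def last_final_before(tags: Iterable[str], target: str) -> Optional[str]:
    """Return the highest final tag strictly below `target`, if any."""
    target_key = _final_key(target)
    if target_key is None:
        raise ValueError(f"target must be a final tag like v0.6.0; got {target}")
    best = None
    for tag in tags:
        key = _final_key(tag)
        if key is None or key >= target_key:
            continue  # skip prereleases and anything not older than target
        if best is None or key > best[0]:
            best = (key, tag)
    return best[1] if best else None


tags = ["v0.5.0", "v0.5.1", "v0.6.0.dev3", "v0.6.0rc0", "v0.6.0rc1", "v0.6.0"]
print(last_final_before(tags, "v0.6.0"))
```

With the sample tags, the notes for v0.6.0 would be diffed against v0.5.1, so PRs that first shipped in v0.6.0rc0 or v0.6.0rc1 still show up in the final's notes.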

jakelorocco · 2026-05-14T16:42:05Z

+        options:
+          - rc
+          - final
+          - patch-rc
+          - patch-final
+          - none


Why would we want to tag an rc0, etc. release (really anything but the final release for a v0.X.Y)?

These are more of a future-proofing choice for supporting prereleases on PyPI if and when we decide to add those by enabling PUBLISH_PRERELEASES=true.
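The gate itself amounts to a small decision, sketched below. This is an assumption-laden illustration rather than the logic in pypi.yml: the helper names are hypothetical, and treating the repo variable as the case-insensitive string "true" is a guess at how the flag is compared.

```python
import re

# rcN or .devN at the end of the version marks a prerelease in this scheme.
_PRERELEASE_RE = re.compile(r"(rc\d+|\.dev\d+)$")


def is_prerelease(version: str) -> bool:
    """True for versions such as 0.6.0rc1 or 0.7.0.dev3."""
    return _PRERELEASE_RE.search(version) is not None


def should_upload_to_pypi(version: str, publish_prereleases: str = "false") -> bool:
    """Finals always upload; prereleases upload only when the flag is on."""
    if not is_prerelease(version):
        return True
    # Assumption: the variable arrives as a string and "true" enables it.
    return publish_prereleases.strip().lower() == "true"


for v in ("0.6.0", "0.6.0rc1", "0.7.0.dev3"):
    print(v, should_upload_to_pypi(v), should_upload_to_pypi(v, "true"))
```

With the default, only 0.6.0 would be uploaded; the rc and dev versions are built and tagged but stay off PyPI until the flag is flipped.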

ajbozarth · 2026-05-14T19:35:33Z

I addressed @planetf1's technical review with a new commit and a response. As for @psschwei and @jakelorocco, your reviews are the type of feedback that I wanted to drive the discussion at sync on Monday; in fact, I had meant to open this as a draft to make it clear this was just a proposal. In writing this I focused on making it full-featured so we could remove undesired features afterwards rather than needing to rework new functionality in, so I'm open to removing things to streamline it.

As such, I'll have Claude detail out some refactoring ideas for discussion based on your comments and my opinions; we can then dig into these ideas both here and on Monday's call:

Refactor ideas for Monday

Four points to discuss. Items 1 and 2 are alternatives — pick at most one. Item 3 is orthogonal to both. Item 4 is the "split into multiple PRs" question. My current lean on each is in italics.

1. Drop speculative prerelease tagging. Currently we create git tags for every rc and dev (v0.6.0rc1, v0.6.0.dev3) even though PUBLISH_PRERELEASES defaults to false, so the tags exist but no PyPI upload happens unless an admin flips the flag.

Drop the tagging. No prerelease tags created until we actually decide to publish prereleases. If we later flip PUBLISH_PRERELEASES, we'd add tagging back at that point. Jake's auto-notes concern dissolves because the only tag that exists is the final. This is what I'd lean toward.
Keep the future-proofing tags. Already what the code does. Add --notes-start-tag <last-final> to gh release create for the final so auto-notes diff against the previous final, not the previous rc. ~5 lines of shell. Preserves the "every prerelease version has a corresponding git tag" invariant for when we eventually flip the flag.

2. Drop prereleases entirely. A more aggressive trim that supersedes #1: stabilization happens in-place on the release branch with no rc cycle, no .devN from main, no PyPI uploads of prereleases ever. Final ships when ready, version bumps once at the end. Since prereleases don't publish by default today, the day-one diff to current behavior is small — what we'd lose is the option to flip the flag later and start publishing prereleases. Users who want pre-stable code would install from a git ref. I'd argue against going this far — the prerelease infrastructure is built and validated; keeping it (with #1's tagging trim) is cheap.

3. Drop cherry-pick, switch to PRs targeting the release branch. This is Paul's proposal and is orthogonal to #1 and #2. The current PR's flow is "every change lands on main first; maintainers run a cherry-pick workflow to port selected commits onto the release branch." Paul's alternative: contributors open stabilization PRs directly against release/vX.Y.

What gets removed if we switch: cherry_pick_to_release.sh, cherry-pick-to-release.yml, the merge-order topological sort logic, and the operator playbook for resolving cherry-pick conflicts.

Tradeoffs:

| | Cherry-pick (current) | PRs to release branch (Paul's) |
| --- | --- | --- |
| Source of truth | main is canonical; release branch is a curated subset | both branches accept changes; drift possible |
| Contributor burden | nothing new: open PRs against main as usual | must know which branch to target; maintainers may need to redirect |
| Maintainer burden | identify SHAs + dispatch cherry-pick workflow | review release-branch PRs; manually port to main if needed |
| Fix-on-both case | one PR to main, one cherry-pick dispatch | two PRs, or one + manual port |
| Release-branch-only fix | requires temporary land-on-main or script bypass | natural: just PR against the release branch |
| New machinery | ~150 lines of shell + a workflow | none |

I'd lean toward keeping cherry-pick. "Main is the source of truth" matches what most contributors already do, and the cherry-pick machinery is written and validated. The simplification Paul gets is real but not large.

4. Split into two PRs. @psschwei suggested splitting release-branch + cherry-pick into PR1 and the PEP 440 lifecycle into PR2. I'd push back on this. If we decide to keep prerelease versioning, it should land integrated, not split — and if we decide to remove it (per #2 above), there's nothing to split. Splitting creates an awkward intermediate state where the project has half a release model.

For illustration, the cleanest seam would be:

PR1 — "ship final releases from a release branch": cut-release-branch.yml (creates the branch, bumps directly to next final), cherry_pick_to_release.sh + workflow, release.sh final-only path, pypi.yml on v* tag push only, bump_version.py with only final/patch-final modes, basic RELEASE.md.
PR2 — "prerelease lifecycle": rc/patch-rc/dev modes added to bump_version.py, publish-dev-from-main.yml + .devN on main, cut-release-branch.yml reworked to bump main to next dev and branch to rc0, prerelease path added back to release.sh, PUBLISH_PRERELEASES gate added to pypi.yml, RELEASE.md extended.

The seam isn't clean — both PRs edit bump_version.py, cut-release-branch.yml, release.sh, pypi.yml, and RELEASE.md. PR2 would partially undo and reshape what PR1 establishes. The end-to-end dry-run I did would need to be redone for PR1's narrower scope and again for PR2's reintegration.

We can also get the smaller-review-surface benefit Paul wants from a split by trimming features inside this PR after Monday's discussion, without the integration churn.

psschwei · 2026-05-15T02:05:56Z

I'll have Claude detail out some refactoring ideas

The context we provide the models is going to matter here. For example, if you give it this for a prompt

I want to look at PR 1076 and evaluate this comment left on the PR:
https://github.com/generative-computing/mellea/pull/1076#issuecomment-4446766810 
ignore all other comments, focus on just the code + PR body + this comment

it will give you a response that is much more in favor of splitting and just merging the release-branch part.

psschwei · 2026-05-15T02:10:49Z

Drop cherry-pick, switch to PRs targeting the release branch. ... I'd lean toward keeping cherry-pick. "Main is the source of truth" matches what most contributors already do, and the cherry-pick machinery is written and validated.

But this goes against the whole point of having a release branch. A release branch's job is to freeze a code shape so you can stabilize it without main's churn leaking in. Cherry-pick inverts that: it makes main the place where fixes are authored, which means fixes get authored against main's current shape, which means main's churn does leak in.

ajbozarth · 2026-05-15T14:51:50Z

The context we provide the models is going to matter here.

I did say it was Claude outlining my opinions, but perhaps I should have been clearer.

But this goes against the whole point of having a release branch. A release branch's job is to freeze a code shape so you can stabilize it without main's churn leaking in. Cherry-pick inverts that: it makes main the place where fixes are authored, which means fixes get authored against main's current shape, which means main's churn does leak in.

Honestly, I don't disagree with you, but the current cherry-pick model was what we outlined and decided on in the design call last week. That is why Claude was so insistent on it: it saw it as a design requirement, not a choice made during implementation.

I am fully OK with dropping the whole cherry-pick code and just using PRs. It's the easiest update of the items above.

planetf1

Five WARNINGs from a deeper pass focused on retry / idempotency and CI plumbing. None are architectural; all are small fixes. Suggestion blocks attached for each.

planetf1 · 2026-05-18T09:34:22Z

+gh release create "${TARGET_TAG_NAME}" \
+    --target "${RELEASE_BRANCH}" \
+    --generate-notes


RELEASE.md describes bump_type: none as the retry path when a previous run committed the bump but failed before publishing. That works for prereleases, but not for finals. gh release create returns 422 if the release already exists, set -e kicks in, and we never reach the pypi.yml and docs-publish.yml dispatches below.

Guarding the call should be enough:

Suggested change

-gh release create "${TARGET_TAG_NAME}" \
-    --target "${RELEASE_BRANCH}" \
-    --generate-notes
+if ! gh release view "${TARGET_TAG_NAME}" >/dev/null 2>&1; then
+    gh release create "${TARGET_TAG_NAME}" \
+        --target "${RELEASE_BRANCH}" \
+        --generate-notes
+fi

The same thinking applies to the changelog sync PR below (git push origin "${SYNC_BRANCH}" on line 107, then the gh pr create after it). Both fail on a second run.
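The check-then-create pattern behind this guard can be sketched in Python for clarity. It is only an illustration of the idea, not a proposed replacement for release.sh: `ensure_release` and the injectable `run` callable are hypothetical, and the only `gh` invocations used are the ones already shown in this thread.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command, discard its output, and return the exit code."""
    result = subprocess.run(
        list(cmd), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode


def ensure_release(tag: str, branch: str, run: Runner = _run) -> bool:
    """Create the GitHub Release for `tag` unless it already exists.

    Returns True if a release was created, False if one was already there.
    """
    if run(["gh", "release", "view", tag]) == 0:
        return False  # an earlier run got this far; nothing to redo
    code = run(
        ["gh", "release", "create", tag, "--target", branch, "--generate-notes"]
    )
    if code != 0:
        raise RuntimeError(f"gh release create failed with exit code {code}")
    return True
```

Because the function returns instead of failing when the release exists, a retry can fall through to the later dispatch steps. The same shape would apply to the sync-branch push and the sync PR: check for the existing branch or PR first, then create only if it is missing.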

planetf1 · 2026-05-18T09:34:23Z

+To resolve locally:
+  1. Clone the repo (if you are not already local) and check out ${RELEASE_BRANCH}.
+  2. Re-run this script with the same SHAs to reach the same state.
+  3. Resolve the conflicted files, then:
+       git add <resolved-files>
+       git cherry-pick --continue
+  4. Push to origin (requires push access / bypass rights):
+       git push origin ${RELEASE_BRANCH}


Step 2 ("Re-run this script with the same SHAs") doesn't actually work. When git cherry-pick hits a conflict it leaves the working tree dirty (UU <file>), so the re-run hits the clean-tree check at line 38 and exits straight away.

Reordered with an explicit abort step first:

Suggested change

-To resolve locally:
-  1. Clone the repo (if you are not already local) and check out ${RELEASE_BRANCH}.
-  2. Re-run this script with the same SHAs to reach the same state.
-  3. Resolve the conflicted files, then:
-       git add <resolved-files>
-       git cherry-pick --continue
-  4. Push to origin (requires push access / bypass rights):
-       git push origin ${RELEASE_BRANCH}
+To resolve locally:
+  1. Clone the repo (if you are not already local) and check out ${RELEASE_BRANCH}.
+  2. Abort the in-progress cherry-pick:
+       git cherry-pick --abort
+  3. Re-run this script with the same SHAs to reach the same conflict.
+  4. Resolve the conflicted files, then:
+       git add <resolved-files>
+       git cherry-pick --continue
+  5. Push to origin (requires push access / bypass rights):
+       git push origin ${RELEASE_BRANCH}

Alternative is to detect CHERRY_PICK_HEAD at the top of the script and skip the clean-tree check in that case, but the playbook fix is simpler.

planetf1 · 2026-05-18T09:34:23Z

+        if: >-
+          steps.latest_check.conclusion == 'skipped' ||
+          steps.latest_check.outputs.is_latest_final == 'true'


If latest_check fails (transient gh API hiccup, rate limit, malformed paginated response), the step exits failure, is_latest_final is never written, and this if: falls through to false so the deploy step skips. The job ends red so it's not strictly silent, but the effect is that a flaky API call blocks the production docs deploy entirely.

Failing open is safer for finals — if we can't confirm "this isn't the latest", deploying matches the common case:

Suggested change

-        if: >-
-          steps.latest_check.conclusion == 'skipped' ||
-          steps.latest_check.outputs.is_latest_final == 'true'
+        if: >-
+          steps.latest_check.conclusion == 'skipped' ||
+          steps.latest_check.outputs.is_latest_final != 'false'

Plus continue-on-error: true on the latest_check step itself (~ line 345) so its failure doesn't poison the rest of the job. The != 'false' then covers both the success case ('true') and the failure case (empty string).
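The difference between the two conditions is easiest to see as a truth table. The sketch below models only the `if:` expression in Python, with hypothetical function names; it assumes `continue-on-error: true` is set on latest_check, in which case a failed step reports conclusion 'success' and leaves the output as an empty string.

```python
def deploy_before(conclusion: str, is_latest_final: str) -> bool:
    """Original condition: deploy only on an explicit 'true'."""
    return conclusion == "skipped" or is_latest_final == "true"


def deploy_after(conclusion: str, is_latest_final: str) -> bool:
    """Suggested condition: deploy unless the check explicitly said 'false'."""
    return conclusion == "skipped" or is_latest_final != "false"


cases = [
    ("skipped", ""),       # check did not run
    ("success", "true"),   # check ran: this is the latest final
    ("success", "false"),  # check ran: an older final, do not deploy
    ("success", ""),       # check failed under continue-on-error, no output
]
for conclusion, output in cases:
    print(
        f"{conclusion!r:10} {output!r:8} "
        f"before={deploy_before(conclusion, output)} "
        f"after={deploy_after(conclusion, output)}"
    )
```

Only the last row changes: the original condition skips the deploy when the check produced no output, while the suggested one fails open and deploys.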

planetf1 · 2026-05-18T09:34:23Z

+    git tag "${TARGET_TAG_NAME}"
+    git push origin "${TARGET_TAG_NAME}"
+    gh workflow run pypi.yml --ref "${TARGET_TAG_NAME}"


Same idempotency gap as the finals path above, but for prereleases under bump_type: none. The existing-tag guard in bump_version.py:196 only runs when there's an actual bump — publish-release.yml:126 skips it for bump_type=none. So on retry, git tag here errors out because the tag is already there, set -e kicks in, and pypi.yml never gets dispatched.

Mirror the guard:

Suggested change

-    git tag "${TARGET_TAG_NAME}"
-    git push origin "${TARGET_TAG_NAME}"
-    gh workflow run pypi.yml --ref "${TARGET_TAG_NAME}"
+    if ! git rev-parse "${TARGET_TAG_NAME}" >/dev/null 2>&1; then
+        git tag "${TARGET_TAG_NAME}"
+        git push origin "${TARGET_TAG_NAME}"
+    fi
+    gh workflow run pypi.yml --ref "${TARGET_TAG_NAME}"

planetf1 · 2026-05-18T09:34:23Z

+    --title "docs: sync changelog for ${TARGET_TAG_NAME}" \
+    --body "Automated changelog sync from \`${RELEASE_BRANCH}\` after publishing [${TARGET_TAG_NAME}](${RELEASE_URL}).
+
+This PR brings the release-branch CHANGELOG entry back to main so the project root CHANGELOG remains the canonical history across all branches."


gh pr create here authenticates with GH_TOKEN, and per GitHub's anti-loop rule, a PR opened by GITHUB_TOKEN doesn't fire pull_request: events on other workflows. CI on the sync PR never starts, so if main requires status checks the PR can't merge until someone manually dispatches CI or pushes an empty commit.

cherry-pick-to-release.yml already works around the same limitation by dispatching ci.yml explicitly. Same pattern fits here — append:

Suggested change

-This PR brings the release-branch CHANGELOG entry back to main so the project root CHANGELOG remains the canonical history across all branches."
+This PR brings the release-branch CHANGELOG entry back to main so the project root CHANGELOG remains the canonical history across all branches."
+# GITHUB_TOKEN-authored pull_request events don't trigger workflows; dispatch CI explicitly.
+gh workflow run ci.yml --ref "${SYNC_BRANCH}"

planetf1

Following up on the inline WARNINGs review with one more SUGGESTION on workflow concurrency. Escalating to REQUEST_CHANGES because W1, W6, and W7 either break or weaken documented behaviour (retry path for finals, prerelease retry, and CI on the sync PR) — would prefer to see those addressed (or explicitly deferred to a follow-up issue) before merge. Architecture and direction look good; this is about the rough edges around recovery and CI plumbing.

planetf1 · 2026-05-18T09:38:13Z

+      contents: write        # push main + new release branch
+      pull-requests: write   # release.sh may open changelog sync PR if rc0 is published (flag on)
+      actions: write         # release.sh dispatches pypi.yml after tagging rc0
+    runs-on: ubuntu-latest


publish-release.yml:107 and publish-dev-from-main.yml both declare concurrency: release. The two write-path workflows that don't (this one and cherry-pick-to-release.yml) can run alongside any of the others.

The scenario that actually bites: a cherry-pick lands in the middle of a publish-release run on the same release branch. The bump commit gets pushed, then a cherry-pick lands, then git tag in release.sh points at the cherry-picked commit instead of the bump (or fails non-fast-forward and aborts halfway).

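By default, concurrency: makes a new run wait for the in-progress one rather than cancelling it (only one run per group can be pending at a time, so a newer pending run replaces an older pending one), so adding the group serializes the four release-track workflows: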

Suggested change

-    runs-on: ubuntu-latest
+    runs-on: ubuntu-latest
+    concurrency: release

planetf1 · 2026-05-18T09:38:13Z

+    permissions:
+      contents: write    # push cherry-picks to release branch
+      actions: write     # dispatch ci.yml after push
+    runs-on: ubuntu-latest


Pair with the suggestion on cut-release-branch.yml. Adding the same concurrency: release group on this job means a cherry-pick can't push to a release branch while publish-release or publish-dev-from-main is mid-run on it, which is the case that risks tagging the wrong commit.

Suggested change

-    runs-on: ubuntu-latest
+    runs-on: ubuntu-latest
+    concurrency: release

ajbozarth requested a review from a team as a code owner May 13, 2026 22:37

ajbozarth requested review from nrfulton and planetf1 May 13, 2026 22:37

github-actions Bot added the enhancement New feature or request label May 13, 2026

ajbozarth requested review from jakelorocco, psschwei and serjikibm May 13, 2026 22:38

ajbozarth self-assigned this May 13, 2026