Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
86 changes: 73 additions & 13 deletions .github/workflows/rust.yml
Original file line number Diff line number Diff line change
Expand Up @@ -210,29 +210,89 @@ jobs:
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"

- name: Recover missing GitHub release for current version
id: recover
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
CURRENT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' rust/Cargo.toml)
TAG_EXISTS=false
RELEASE_EXISTS=false

if git rev-parse "rust-v$CURRENT_VERSION" >/dev/null 2>&1; then
TAG_EXISTS=true
fi

if gh api "repos/${{ github.repository }}/releases/tags/rust-v$CURRENT_VERSION" --silent 2>/dev/null; then
RELEASE_EXISTS=true
fi

if [ "$TAG_EXISTS" = "true" ] && [ "$RELEASE_EXISTS" = "false" ]; then
echo "Recovering: tag rust-v$CURRENT_VERSION exists but GitHub release is missing"
node scripts/create-github-release.mjs \
--release-version "$CURRENT_VERSION" \
--repository "${{ github.repository }}" \
--prefix "rust-"
node scripts/format-github-release.mjs \
--release-version "$CURRENT_VERSION" \
--repository "${{ github.repository }}" \
--commit-sha "$(git rev-list -1 rust-v$CURRENT_VERSION)" \
--prefix "rust-"
echo "recovered=true" >> $GITHUB_OUTPUT
else
echo "No recovery needed (tag=$TAG_EXISTS, release=$RELEASE_EXISTS)"
echo "recovered=false" >> $GITHUB_OUTPUT
fi

- name: Determine bump type from changelog fragments
id: bump_type
run: node scripts/rust-get-bump-type.mjs

- name: Check if version already released or no fragments
id: check
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
# Check if there are changelog fragments
if [ "${{ steps.bump_type.outputs.has_fragments }}" != "true" ]; then
# No fragments - check if current version tag exists
CURRENT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' rust/Cargo.toml)
if git rev-parse "rust-v$CURRENT_VERSION" >/dev/null 2>&1; then
echo "No changelog fragments and rust-v$CURRENT_VERSION already released"
echo "should_release=false" >> $GITHUB_OUTPUT
else
echo "No changelog fragments but rust-v$CURRENT_VERSION not yet released"
echo "should_release=true" >> $GITHUB_OUTPUT
echo "skip_bump=true" >> $GITHUB_OUTPUT
fi
else
CURRENT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' rust/Cargo.toml)

# Check if version is published on crates.io (source of truth for publish status)
CRATE_NAME=$(grep -Po '(?<=^name = ")[^"]*' rust/Cargo.toml)
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "https://crates.io/api/v1/crates/$CRATE_NAME/$CURRENT_VERSION")
CRATES_PUBLISHED=false
if [ "$HTTP_STATUS" = "200" ]; then
CRATES_PUBLISHED=true
fi

TAG_EXISTS=false
if git rev-parse "rust-v$CURRENT_VERSION" >/dev/null 2>&1; then
TAG_EXISTS=true
fi

RELEASE_EXISTS=false
if gh api "repos/${{ github.repository }}/releases/tags/rust-v$CURRENT_VERSION" --silent 2>/dev/null; then
RELEASE_EXISTS=true
fi

echo "Tag exists: $TAG_EXISTS, Release exists: $RELEASE_EXISTS, Crates.io published: $CRATES_PUBLISHED"

if [ "${{ steps.bump_type.outputs.has_fragments }}" = "true" ]; then
echo "Found changelog fragments, proceeding with release"
echo "should_release=true" >> $GITHUB_OUTPUT
echo "skip_bump=false" >> $GITHUB_OUTPUT
elif [ "$TAG_EXISTS" = "true" ] && [ "$RELEASE_EXISTS" = "true" ] && [ "$CRATES_PUBLISHED" = "true" ]; then
echo "No changelog fragments and rust-v$CURRENT_VERSION fully released (tag + GitHub release + crates.io)"
echo "should_release=false" >> $GITHUB_OUTPUT
elif [ "$TAG_EXISTS" = "true" ] && [ "$RELEASE_EXISTS" = "false" ]; then
echo "Tag exists but GitHub release missing for rust-v$CURRENT_VERSION — recovering"
echo "should_release=true" >> $GITHUB_OUTPUT
echo "skip_bump=true" >> $GITHUB_OUTPUT
elif [ "$TAG_EXISTS" = "false" ]; then
echo "No changelog fragments but rust-v$CURRENT_VERSION not yet tagged"
echo "should_release=true" >> $GITHUB_OUTPUT
echo "skip_bump=true" >> $GITHUB_OUTPUT
else
echo "Version rust-v$CURRENT_VERSION appears fully released"
echo "should_release=false" >> $GITHUB_OUTPUT
fi

- name: Collect changelog and bump version
Expand Down
77 changes: 77 additions & 0 deletions docs/case-studies/issue-261/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
# Case Study: Issue #261 - Rust CI/CD Missing Releases

## Summary

Version 0.9.2 of `link-assistant-agent` was published to crates.io but no GitHub release was created. The root cause was a chain of failures in the CI/CD pipeline related to crates.io propagation delays and missing recovery logic.

## Timeline

| Time (UTC) | Event | CI Run | Result |
|---|---|---|---|
| 2026-04-13 06:19 | Push to main (v0.9.0 -> 0.9.1 bump) | [24328743522](https://github.com/link-assistant/agent/actions/runs/24328743522) | **Failed**: `CARGO_REGISTRY_TOKEN` not set |
| 2026-04-13 07:49 | Push to main (CARGO_TOKEN fallback fix) | [24331934514](https://github.com/link-assistant/agent/actions/runs/24331934514) | **Failed**: crates.io publish verification failed (5s too short) |
| 2026-04-13 08:51 | Push to main (propagation delay fix) | [24334488286](https://github.com/link-assistant/agent/actions/runs/24334488286) | **Failed**: v0.9.2 published but old code ran (fix not yet merged) |
| 2026-04-13 10:14 | PR #262 merged (all fixes) | [24337999712](https://github.com/link-assistant/agent/actions/runs/24337999712) | **Success** but no release: fragments consumed, tag exists, skipped |

## Root Cause Analysis

### Root Cause 1: Missing CARGO_REGISTRY_TOKEN (run 24328743522)

The workflow used `${{ secrets.CARGO_REGISTRY_TOKEN }}` but the organization had the token stored as `CARGO_TOKEN`. The script checked for `CARGO_REGISTRY_TOKEN` only.

**Fix applied in PR #258**: Added `CARGO_TOKEN` as fallback in `publish-to-crates.mjs`.

### Root Cause 2: Insufficient crates.io propagation delay (run 24331934514)

After a successful `cargo publish`, the script waited only 5 seconds to verify on the crates.io API. First-time publishes can take 15-30+ seconds to propagate.

**Fix applied in PR #260**: Increased verification delay and added retry logic.

### Root Cause 3: "Already exists" not treated as success (run 24334488286)

After the first publish attempt succeeded (exit code 0) but verification failed, retry attempts received "crate already exists" error (exit code 101). The old code treated this as a failure instead of recognizing that the crate was already published.

**Fix applied in PR #262**: Added `detectAlreadyExists()` logic to treat "already exists" as success.

### Root Cause 4: No recovery for missing GitHub releases (run 24337999712)

When the fix merged, the changelog fragments had been consumed by previous runs and the git tag `rust-v0.9.2` already existed. The check logic only looked at git tags:

```yaml
if git rev-parse "rust-v$CURRENT_VERSION" >/dev/null 2>&1; then
echo "should_release=false" # <- Skips GitHub release creation too!
fi
```

This meant the GitHub release was never created despite the crate being published on crates.io.

**Fix applied in this PR (#263)**:
- Check crates.io API, GitHub release existence, AND git tags
- If tag exists but GitHub release doesn't, proceed with release creation
- Use crates.io as the authoritative source of truth for publish status

## Requirements from Issue

1. **Crates.io release**: Published (v0.9.2) but via a failed CI run
2. **GitHub release with badge**: Missing - never created
3. **Check already-published versions**: If version is already on crates.io, treat as success
4. **Check GitHub release existence**: Create if missing even when tag exists
5. **Case study documentation**: This document

## Key Learnings

1. **Use crates.io API as source of truth** - Git tags can exist without successful crates.io publish, and vice versa. Always check the actual registry.
2. **Recovery mechanisms are essential** - CI/CD pipelines must handle partial failures gracefully. If step N fails but step N-1 succeeded, the next run should detect and continue from where it left off.
3. **Verification delays must account for worst-case propagation** - First-time crate publishes can take 30+ seconds to appear on the crates.io API. Use generous delays with multiple retries.
4. **"Already exists" is not an error** - When a registry says "already exists," that's a success signal, not a failure.

## Files Changed

- `.github/workflows/rust.yml` - Added crates.io + GitHub release checks in release decision logic
- `scripts/publish-to-crates.mjs` - Added pre-retry crates.io API check, increased propagation delay/retries

## References

- [Issue #261](https://github.com/link-assistant/agent/issues/261)
- [PR #263](https://github.com/link-assistant/agent/pull/263)
- [Reference: rust-ai-driven-development-pipeline-template](https://github.com/link-foundation/rust-ai-driven-development-pipeline-template) - Uses crates.io as source of truth for release decisions
5 changes: 5 additions & 0 deletions js/.changeset/fix-rust-cicd-recovery.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
'@link-assistant/agent': patch
---

fix(ci): add recovery for missing Rust GitHub releases and improve crates.io publish resilience
9 changes: 9 additions & 0 deletions rust/changelog.d/20260413_fix_missing_github_release.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
---
bump: patch
---

### Fixed

- CI/CD: Add recovery mechanism for missing GitHub releases when crate is already published on crates.io
- CI/CD: Improve crates.io publish verification with longer delays (20s) and more retries (5 attempts)
- CI/CD: Check crates.io API before retry attempts to detect successful prior publishes
22 changes: 19 additions & 3 deletions scripts/publish-to-crates.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,8 @@ import {

const MAX_RETRIES = 3;
const RETRY_DELAY = 10000; // 10 seconds
const VERIFY_DELAY = 15000; // 15 seconds for crates.io propagation
const VERIFY_RETRIES = 3; // Number of verification attempts
const VERIFY_DELAY = 20000; // 20 seconds for crates.io propagation
const VERIFY_RETRIES = 5; // Number of verification attempts

const args = process.argv.slice(2);
const getArg = (name, defaultValue) => {
Expand Down Expand Up @@ -241,6 +241,22 @@ async function main() {
for (let i = 1; i <= MAX_RETRIES; i++) {
console.log(`\nPublish attempt ${i} of ${MAX_RETRIES}...`);

// Before each attempt, check crates.io API to see if a prior attempt succeeded
// (crates.io propagation can cause verification to fail even when publish succeeded)
if (i > 1) {
console.log('Checking crates.io API before retry...');
const nowPublished = await checkCratesIo(packageName, currentVersion);
if (nowPublished) {
console.log(
`${packageName}@${currentVersion} is now confirmed on crates.io (prior attempt succeeded)`
);
setOutput('published', 'true');
setOutput('published_version', currentVersion);
setOutput('already_published', 'true');
return;
}
}

const result = exec(cargoPublishCmd, {
capture: true,
allowFailure: true,
Expand All @@ -256,7 +272,7 @@ async function main() {
// "already exists" means the crate was published (possibly by a previous attempt)
if (detectAlreadyExists(combinedOutput)) {
console.log(
`Crate ${packageName}@${currentVersion} already exists on crates.io (published successfully)`
`Crate ${packageName}@${currentVersion} already exists on crates.io (treating as success)`
);
setOutput('published', 'true');
setOutput('published_version', currentVersion);
Expand Down
Loading