BUILD-10724: Import GitHub cache to S3 (migration fallback) #45
julien-carsique-sonarsource wants to merge 1 commit into master
Summary
Adds automatic migration of GitHub Actions cache entries to S3 during the transition period. When using the S3 backend and an entry isn't found in S3, the action falls back to restoring from GitHub Actions cache (using the original unprefixed key), then saves the restored content to S3 post-job. This is enabled by default but can be disabled via
What reviewers should know
Start here: Review the three new steps in
Test scenarios: The
Migration detector:
Non-obvious decisions: (1) The output resolver now includes
Gotcha: The
Conclusion: The migration logic is sound and fits the existing patterns well, but there is one security issue that needs fixing before merge.
SonarCloud recommendations: Even though action.yml:191 is not failing the quality gate, I strongly recommend fixing it because ${{ inputs.key }} is interpolated directly into the shell run: block inside double quotes — a key value containing a " character can break out of the string and execute arbitrary shell commands. Fix by passing the key through an environment variable:
env:
  CACHE_KEY: ${{ inputs.key }}
run: |
  echo "::error::Cache miss: no cache found in S3 or GitHub for key '${CACHE_KEY}'"
  exit 1
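To illustrate why the review asks for an environment variable, here is a minimal sketch that mimics the two approaches outside GitHub Actions. It assumes that `${{ inputs.key }}` is substituted textually into the script before the shell parses it; the key value and the variable names `unsafe_out` and `safe_out` are hypothetical and not taken from action.yml.

```shell
#!/usr/bin/env bash
# Hypothetical malicious cache key containing a double quote.
key='x"; echo INJECTED; echo "'

# Unsafe: the value is pasted into the script text before bash parses it,
# similar to writing "${{ inputs.key }}" directly inside a run: block.
unsafe_out=$(bash -c "echo \"key=${key}\"")

# Safe: the value travels through the environment and is only expanded
# at run time, so bash never parses it as code.
safe_out=$(CACHE_KEY="$key" bash -c 'echo "key=${CACHE_KEY}"')

printf 'unsafe:\n%s\n\nsafe:\n%s\n' "$unsafe_out" "$safe_out"
```

In the unsafe variant the injected `echo INJECTED` runs as a separate command; in the safe variant the whole key is printed as plain data.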
…tion mode)
When the S3 backend is used and the S3 cache misses, automatically attempt to restore the cache from GitHub using the original unprefixed key.
The S3 post-job step will then save the restored content to S3, pre-provisioning it for subsequent runs.
The feature is enabled by default. Resolution order to disable it:
1. Action input `import-github-cache: 'false'`
2. Environment variable `CACHE_IMPORT_GITHUB=false`
3. Default: true
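The resolution order above could be sketched as a small shell helper. This is only an illustration, not the code in action.yml: the function name `resolve_import_github_cache` is hypothetical, and treating an empty string as "not set" is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve the import mode from the action input,
# then the CACHE_IMPORT_GITHUB environment variable, then the default.
resolve_import_github_cache() {
  local input_value="$1"   # value of the import-github-cache input ('' if not set)
  local env_value="$2"     # value of CACHE_IMPORT_GITHUB ('' if not set)
  if [ -n "$input_value" ]; then
    echo "$input_value"
  elif [ -n "$env_value" ]; then
    echo "$env_value"
  else
    echo "true"
  fi
}

# Example: no input given, environment variable disables the import.
resolve_import_github_cache "" "false"
```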
`fail-on-cache-miss` and `lookup-only` are propagated to the GitHub fallback step.
When `fail-on-cache-miss` is set and import mode is active, failure is deferred until both S3 and GitHub have been tried.
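The deferred failure can be read as the following decision logic. This is a sketch under assumptions: the function `cache_miss_outcome` and its string arguments are hypothetical, and the real action expresses this through step conditions rather than a single function.

```shell
#!/usr/bin/env bash
# Hypothetical decision helper. All arguments are 'true' or 'false':
#   $1 s3_hit, $2 import_active, $3 github_hit, $4 fail_on_cache_miss
# Prints 'fail' or 'continue'.
cache_miss_outcome() {
  local s3_hit="$1" import_active="$2" github_hit="$3" fail_on_miss="$4"
  # An S3 hit never fails the job.
  if [ "$s3_hit" = "true" ]; then
    echo "continue"; return
  fi
  # On an S3 miss, the GitHub fallback is only consulted in import mode.
  if [ "$import_active" = "true" ] && [ "$github_hit" = "true" ]; then
    echo "continue"; return
  fi
  # Every backend that was tried has missed: only now honour fail-on-cache-miss.
  if [ "$fail_on_miss" = "true" ]; then
    echo "fail"
  else
    echo "continue"
  fi
}
```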
Also adds `.github/workflows/check-cache-migration.yml`: a manually-triggered workflow that compares GitHub cache entries to S3 objects across
target branches (main, master, branch-*, dogfood-on-*, feature/long/*), ignoring transient keys (build-number-*, mise-*). When 100% of entries
are found in S3, it automatically sets the CACHE_IMPORT_GITHUB=false repository variable to disable the import fallback (this requires the
`CACHE_IMPORT_GITHUB` environment variable to be set from the repository variable via `${{ vars.CACHE_IMPORT_GITHUB }}`).
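The branch and key filters described above might look like the sketch below. The function names `is_target_branch` and `is_transient_key` are hypothetical, and matching with shell glob patterns is an assumption; the workflow's actual matching rules are not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical filter: succeeds for refs on the target branches.
is_target_branch() {
  # Accept both refs/heads/<branch> and bare <branch>.
  case "${1#refs/heads/}" in
    main|master|branch-*|dogfood-on-*|feature/long/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical filter: succeeds for transient keys that should be ignored.
is_transient_key() {
  case "$1" in
    build-number-*|mise-*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: a long-lived feature branch is in scope, a mise key is ignored.
is_target_branch "refs/heads/feature/long/my-branch" && echo "in scope"
is_transient_key "mise-abc123" && echo "ignored"
```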
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SonarQube reviewer guide
Conclusion: The action.yml logic is solid and the fail-on-cache-miss deferral pattern is correctly implemented. However, the migration detector has a key-format mismatch that will cause it to permanently misreport migration status for any caches saved from PR-triggered workflows on dogfood-on-* or feature/long/* branches.
REF=$(echo "$ENTRY" | jq -r '.ref')
KEY=$(echo "$ENTRY" | jq -r '.key')
# Expected S3 key: {ref}/{key} (e.g. refs/heads/main/my-gradle-abc123)
S3_KEY="${REF}/${KEY}"
Bug: The S3 key comparison assumes all S3 keys have a refs/heads/ prefix, but that depends on how the cache was originally saved.
prepare-keys.sh derives the branch prefix as ${GITHUB_HEAD_REF:-$GITHUB_REF}. For push events GITHUB_HEAD_REF is unset, so the prefix is GITHUB_REF = refs/heads/main → S3 key: refs/heads/main/{key}. For PR events GITHUB_HEAD_REF is just the short branch name (e.g. feature/long/my-branch), so the S3 key is feature/long/my-branch/{key} — no refs/heads/ prefix.
The GitHub cache API always returns .ref in full form (refs/heads/feature/long/my-branch), so S3_KEY = "refs/heads/feature/long/my-branch/{key}" will never match feature/long/my-branch/{key} in the S3 listing. Any GitHub cache entry for dogfood-on-* or feature/long/* that was originally saved from a PR workflow will be permanently reported as "missing", and the detector will never declare migration complete as long as those entries exist.
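The mismatch can be reproduced with the expression quoted from prepare-keys.sh. In this sketch the wrapper function `branch_prefix` and the sample values are hypothetical; only the `${GITHUB_HEAD_REF:-$GITHUB_REF}` expansion comes from the review.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the prefix expression from prepare-keys.sh.
#   $1: GITHUB_HEAD_REF (empty on push events), $2: GITHUB_REF
branch_prefix() {
  local GITHUB_HEAD_REF="$1" GITHUB_REF="$2"
  echo "${GITHUB_HEAD_REF:-$GITHUB_REF}"
}

# Push event: GITHUB_HEAD_REF is empty, so the full ref is used.
push_key="$(branch_prefix "" "refs/heads/main")/my-gradle-abc123"

# PR event: GITHUB_HEAD_REF holds the short branch name.
pr_key="$(branch_prefix "feature/long/my-branch" "refs/pull/45/merge")/my-gradle-abc123"

echo "push: $push_key"
echo "pr:   $pr_key"
```

Only the push-event key carries the refs/heads/ prefix that the detector currently expects.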
Fix: strip the refs/heads/ prefix from REF before building S3_KEY, and also try the prefix-less form:
REF_SHORT=$(echo "$REF" | sed 's|^refs/heads/||')
S3_KEY_FULL="${REF}/${KEY}"
S3_KEY_SHORT="${REF_SHORT}/${KEY}"
if grep -qxF "$S3_KEY_FULL" /tmp/s3_keys.txt || grep -qxF "$S3_KEY_SHORT" /tmp/s3_keys.txt; then
It may look simpler to always strip the prefix when building the S3 key, but that is not enough on its own: prepare-keys.sh uses the full refs/heads/ form when GITHUB_HEAD_REF is absent (push events), so those S3 keys are refs/heads/main/{key}, which a prefix-stripped main/{key} would not match. The safest approach is to check both forms.
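A self-contained version of the two-form check could look like this. It is a sketch, not the workflow's code: the function `s3_has_entry`, the temporary key listing, and the sample keys are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the S3 listing contains the cache entry
# under either the full ref or the short branch name.
#   $1: ref from the GitHub cache API, $2: cache key, $3: file with one S3 key per line
s3_has_entry() {
  local ref="$1" key="$2" keys_file="$3"
  local ref_short="${ref#refs/heads/}"   # strip the prefix if present
  # -x: whole-line match, -F: fixed string, -q: quiet
  grep -qxF -e "${ref}/${key}" -e "${ref_short}/${key}" "$keys_file"
}

# Sample listing: one key saved from a push event, one from a PR event.
keys_file=$(mktemp)
printf '%s\n' \
  'refs/heads/main/my-gradle-abc123' \
  'feature/long/my-branch/my-gradle-abc123' > "$keys_file"

s3_has_entry "refs/heads/feature/long/my-branch" "my-gradle-abc123" "$keys_file" \
  && echo "found"
```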
Summary
When migrating from GitHub cache to S3, the S3 bucket starts empty. This causes all runners to re-download dependencies from scratch until their first S3 cache save completes. This PR addresses that by automatically importing existing GitHub cache entries into S3 during the migration window.
Changes
action.yml: new import-github-cache input + migration steps
New input: import-github-cache (default: enabled)
Resolution order (mirrors the existing CACHE_BACKEND/backend pattern):
1. import-github-cache
2. CACHE_IMPORT_GITHUB (can be set from a repo variable via ${{ vars.CACHE_IMPORT_GITHUB }})
3. true
New steps (S3 path only):
- actions/cache/restore with the original unprefixed key when S3 misses and import mode is active. The S3 post-job save then persists the restored content to S3.
- fail-on-cache-miss: true is still respected.
- fail-on-cache-miss and lookup-only are correctly propagated to the GitHub fallback step.
.github/workflows/check-cache-migration.yml: migration completion detector
Manually triggered (workflow_dispatch) workflow to determine when the migration is complete and automatically opt out of the import fallback.
It:
- main, master, branch-*, dogfood-on-*, feature/long/*) and excluding transient keys (build-number-*, mise-*)
- {ref}/{key} in S3
- CACHE_IMPORT_GITHUB=false, disabling the fallback automatically
Behaviour matrix
fail-on-cache-miss
Testing
Dogfood PR: SonarSource/sonar-dummy-js#125
Jira
BUILD-10724 — child of BUILD-10684