ci: one-time cache clear to reduce S3 bucket bloat #901
Closed
Corey Christous (cchristous) wants to merge 1 commit into master
The Semaphore cache S3 bucket (semaphore-cache-519856050701-us-west-2) has grown to 137 TB due to Maven cache entries being stored with unique-per-build keys. This causes thousands of multi-GB duplicate cache files to accumulate over the 90-day lifecycle window. This adds `cache clear` as the first command in the CI prologue to flush all existing cache entries for this project. This is a one-time cleanup and should be reverted after merging.
Summary
- Adds `cache clear` as the first command in the Semaphore CI pipeline prologue
Why this is needed
The prod Semaphore cache S3 bucket (`semaphore-cache-519856050701-us-west-2`) has grown to 137 TB, up from 58 TB just 3 months ago. Analysis shows that Maven cache entries are being stored with unique-per-build keys (e.g., `cache-maven-8.0.x-1770405615`), creating a new 17-35 GB cache file on every single build instead of overwriting a stable key.
With the 90-day S3 lifecycle, this causes thousands of duplicate cache files to pile up. This project is one of the top contributors to the bucket size.
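To make the difference between the two key styles concrete, here is a minimal sketch. Both helper functions are hypothetical and are not taken from this project's pipeline. The sketch assumes the numeric suffix in the example key is a per-build value such as a timestamp, and it uses the POSIX `cksum` utility as a stand-in for whatever checksum a real pipeline would use.

```shell
#!/bin/sh
# Illustrative only: these helpers are hypothetical and are not taken
# from this project's pipeline or from the Semaphore toolbox.

# Unique-per-build style (assumed): the suffix comes from the clock,
# so builds started at different times never share a key.
unique_key() {
  printf 'cache-maven-%s-%s\n' "$1" "$(date +%s)"
}

# Stable style: the suffix is a checksum of the dependency manifest,
# so the key changes only when that file's contents change.
# $1 = branch name, $2 = path to the manifest (e.g. pom.xml).
stable_key() {
  printf 'cache-maven-%s-%s\n' "$1" "$(cksum < "$2" | cut -d ' ' -f 1)"
}
```

Called as `stable_key 8.0.x pom.xml`, the second helper returns the same key for every build whose `pom.xml` is unchanged, so those builds map to one cache entry rather than each adding another multi-GB file.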
What this PR does
Adds `cache clear` as the first prologue command, which deletes all cached entries for this project. This is a one-time cleanup -- the change should be reverted immediately after merging so that caching resumes normally.
Why this is safe
- `cache clear` only affects this project's cache prefix in S3 (scoped by Semaphore project ID)
- `cache store` will repopulate the cache
Action after merge
Revert this PR immediately after the first successful pipeline run so that future builds benefit from caching again.