correct download labels — show output dir, rename Skipped → Cached#1038
Open
nl917 wants to merge 2 commits into
Contributor
Author
dolaameng
reviewed
May 29, 2026
parts = [f"{n} run(s) {label}" for n, label in ((downloaded, "downloaded"), (skipped, "skipped")) if n]
# Summary — counts are runs (one run may include multiple files, e.g. with --include-source).
# "Cached" = already on disk from a previous download; the data is available without a fetch.
parts = [f"{n} run(s) {label}" for n, label in ((downloaded, "downloaded"), (cached, "cached")) if n]
Contributor
Skipped -> Cached: leaving this to @simsryan-google to decide.
Contributor
In line 7586 we use _dir_size and here we use zip file size, any reason why?
Contributor
Author
Good catch! This is an inconsistency and will be fixed!
display_files = [
    f"{self._normalize_model_slug(r.model_version_slug)}/{r.id}/{r.id}.zip" for r in downloadable
]
# Show the extracted output directory (what is actually left on disk after
Contributor
I don't think we need to comment on the "why" here, and the "what" part is also obvious. Suggest removing the comment.
dolaameng
approved these changes
May 30, 2026
Contributor
dolaameng
left a comment
LGTM. Thanks for the fix!

Summary
Fixes two cosmetic-but-misleading labels in kaggle b t download output that described the function's internal control flow rather than the on-disk state the user actually sees.
Bug A: File column shows a .zip path that doesn't exist on disk
Before:
Model                  File                                     Size       Progress
────────────────────── ──────────────────────────────────────── ────────── ──────────
gemini-3-flash-preview gemini-3-flash-preview/271385/271385.zip 1.06KB     Done
After:
Model                  File                                     Size       Progress
────────────────────── ──────────────────────────────────────── ────────── ──────────
gemini-3-flash-preview gemini-3-flash-preview/271385/           1.06KB     Done
The .zip is the intermediate download archive that gets extracted and then removed (os.remove(zipfile_path) after zf.extractall). Showing it in the output table sends users ls'ing for a file that isn't there. The table now shows the extracted directory that actually survives.
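To illustrate why the archive path is misleading, here is a minimal sketch of the extract-then-remove pattern described above. It is not the actual code in kaggle_api_extended.py: the function names extract_and_discard and demo, and the file name result.json, are hypothetical and used only for illustration.

```python
import os
import tempfile
import zipfile


def extract_and_discard(zipfile_path, output_dir):
    """Extract an archive into output_dir, delete the archive, and
    return the directory path (with a trailing separator) for display."""
    os.makedirs(output_dir, exist_ok=True)
    with zipfile.ZipFile(zipfile_path) as zf:
        zf.extractall(output_dir)
    os.remove(zipfile_path)  # the .zip is only an intermediate artifact
    return os.path.join(output_dir, "")  # trailing separator marks a directory


def demo():
    """Build a throwaway archive, run the helper, and report what survives."""
    with tempfile.TemporaryDirectory() as root:
        run_dir = os.path.join(root, "gemini-3-flash-preview", "271385")
        os.makedirs(run_dir)
        zip_path = os.path.join(run_dir, "271385.zip")
        with zipfile.ZipFile(zip_path, "w") as zf:
            zf.writestr("result.json", "{}")  # hypothetical payload
        extract_and_discard(zip_path, run_dir)
        zip_exists = os.path.exists(zip_path)
        file_exists = os.path.exists(os.path.join(run_dir, "result.json"))
        return zip_exists, file_exists
```

After the helper runs, only the extracted directory remains, which is why the directory is the path worth showing in the File column.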
Bug B: re-runs show Skipped, suggesting the data isn't available
Before:
gemini-3-flash-preview gemini-3-flash-preview/271385/           1.06KB     Skipped
Done: 1 run(s) skipped.
After:
gemini-3-flash-preview gemini-3-flash-preview/271385/           1.06KB     Cached
Done: 1 run(s) cached.
When a run's output directory already exists and --force isn't passed, we short-circuit the download (the right behavior). But Skipped reads as "the download didn't happen, so I have nothing", when in reality the data is on disk from a previous run, ready to use. Cached describes the user-visible outcome (the data is locally cached, no fetch needed) instead of leaking the function's internal control-flow language.
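The short-circuit and the summary wording can be sketched roughly as follows. This is an illustration under assumptions, not the code in benchmarks_tasks_download_cli: the helper names run_status and summary are hypothetical, the ", " separator between summary parts is assumed, and the fallback message for zero runs is invented for the sketch.

```python
import os


def run_status(output_dir, force):
    """Return the Progress label for one run (hypothetical helper)."""
    if os.path.isdir(output_dir) and not force:
        return "Cached"  # data already on disk from a previous run, no fetch
    # A real implementation would download and extract the run here.
    return "Done"


def summary(downloaded, cached):
    """Build the summary line; counts are runs, and zero counts are omitted."""
    parts = [
        f"{n} run(s) {label}"
        for n, label in ((downloaded, "downloaded"), (cached, "cached"))
        if n
    ]
    if not parts:
        return "Done: nothing to download."  # assumed wording, not from the PR
    return "Done: " + ", ".join(parts) + "."  # ", " separator is an assumption
```

The label reflects what the user has on disk rather than which branch the function took, which is the point of the rename.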
Files changed
src/kaggle/api/kaggle_api_extended.py: 4 small edits in benchmarks_tasks_download_cli (the display_files list, the skip branch, the counter init, and the summary line). Added 2 short comments explaining the rationale.
src/kaggle/test/test_benchmarks_cli.py: test_download_skips_existing_output and test_download_summary_counts updated to assert the new wording and pin the new behavior (they assert that the .zip path is not in the output and that the directory path is).
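The kind of assertion these tests pin down might look like the sketch below. It does not reproduce the real tests in test_benchmarks_cli.py: render_row is a hypothetical stand-in for one row of the CLI table, and the test name is invented for illustration.

```python
def render_row(model, run_id, size, status):
    """Hypothetical stand-in for one row of the download table."""
    display_path = f"{model}/{run_id}/"  # extracted directory, not the archive
    return f"{model} {display_path} {size} {status}"


def test_row_shows_directory_not_zip():
    row = render_row("gemini-3-flash-preview", "271385", "1.06KB", "Cached")
    assert "gemini-3-flash-preview/271385/" in row  # directory path is shown
    assert ".zip" not in row                        # archive path is gone
    assert "Cached" in row                          # new wording
    assert "Skipped" not in row                     # old wording is gone
```

Asserting both the presence of the new text and the absence of the old text guards against a partial revert of either label.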
Test plan
pytest src/kaggle/test/test_benchmarks_cli.py::TestDownload → 21/21 pass
kaggle b t download <task> shows directory path in File column, no .zip
kaggle b t download <task> (re-run without -f) shows Cached row and Done: 1 run(s) cached. summary
kaggle b t download <task> -f (force re-download) shows Done row and Done: 1 run(s) downloaded. summary
(verified by running the suite on the unmodified branch base)