fix: #3274 limit sandbox archive extraction (#3278)
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 58202dfe6e
@codex review
Codex Review: Didn't find any major issues. Nice work!
```python
MAX_ARCHIVE_MEMBERS = 10_000
MAX_ARCHIVE_EXTRACTED_BYTES = 512 * 1024 * 1024
```
how did you decide these magic numbers?
These numbers are not derived from a published sandbox quota, and this extraction path did not have an existing archive-size cap to copy from.
I picked 512 MiB for the byte-size defaults because it should leave reasonable room for typical source/workspace archives, while giving the local spool, workspace write, and declared extraction size a clear upper bound. The input cap protects the local spool/workspace write before archive validation runs; the extracted-size cap is a second layer for archives that are acceptable as input but declare a much larger extraction footprint.
Both byte-size caps use 512 MiB right now to keep the first version simple and conservative, not because compressed and extracted sizes need to match exactly. If we want to leave more room for normal archive expansion, the extracted-size cap could be higher than the input cap.
The 10k member cap is separate. I chose that to stay out of the way for ordinary source archives while still stopping cases where the cost is mostly the number of entries, like many empty or tiny files.
That said, I’m not attached to the exact thresholds; happy to adjust them if you’d prefer a tighter or looser default. If we decide these should be user-tunable, I think the design should be an explicit limits object rather than a hidden global setting.
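For concreteness, here is a minimal sketch of how the three caps could be layered during extraction. `MAX_ARCHIVE_MEMBERS` and `MAX_ARCHIVE_EXTRACTED_BYTES` match the diff above; `MAX_ARCHIVE_INPUT_BYTES`, the function name, and the error handling are illustrative assumptions, not the actual SDK code (`filter="data"` needs a Python with tarfile extraction filters, e.g. 3.12+):

```python
import os
import tarfile

MAX_ARCHIVE_INPUT_BYTES = 512 * 1024 * 1024  # assumed name for the input cap
MAX_ARCHIVE_EXTRACTED_BYTES = 512 * 1024 * 1024
MAX_ARCHIVE_MEMBERS = 10_000


def extract_with_limits(archive_path: str, dest: str) -> None:
    # Input cap: checked before any archive parsing, so an oversized
    # upload never reaches the local spool / workspace write.
    if os.path.getsize(archive_path) > MAX_ARCHIVE_INPUT_BYTES:
        raise ValueError("archive exceeds input size limit")

    members = 0
    declared_bytes = 0
    with tarfile.open(archive_path) as tar:
        for member in tar:
            # Member cap: stops archives whose cost is mostly entry
            # count, e.g. thousands of empty or tiny files.
            members += 1
            if members > MAX_ARCHIVE_MEMBERS:
                raise ValueError("archive exceeds member count limit")

            # Extracted-size cap: second layer for archives that are
            # acceptable as input but declare a much larger footprint.
            declared_bytes += member.size
            if declared_bytes > MAX_ARCHIVE_EXTRACTED_BYTES:
                raise ValueError("archive exceeds extracted size limit")

            tar.extract(member, path=dest, filter="data")
```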
Thanks for addressing the tar streaming issue. I think the remaining concern is that the PR turns archive resource limits into public behavior, but only exposes them as hard-coded module globals.
Could you please make these limits explicitly configurable and avoid changing the default behavior?
Suggested API shape (a rough sketch follows the list):

- Add a small dataclass such as `SandboxArchiveLimits` with:
  - `max_input_bytes: int | None`
  - `max_extracted_bytes: int | None`
  - `max_members: int | None`
- Add `archive_limits: SandboxArchiveLimits | None = None` to `SandboxRunConfig`, appended after the existing fields to preserve positional compatibility.
- Treat `archive_limits=None` as "do not enforce SDK archive resource limits", preserving the current behavior.
- Treat `SandboxArchiveLimits()` as "enable archive resource limits with SDK default thresholds".
- Treat `None` on an individual `SandboxArchiveLimits` field as "disable this specific limit".
- Validate non-`None` values as positive integers.
- Also allow a per-call override on `BaseSandboxSession.extract(..., archive_limits=...)`, since `extract()` is a direct public entry point outside `Runner`.
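A minimal sketch of that dataclass, using the default thresholds proposed below; the field names come from the list above, but the validation details and frozen-ness are my assumptions, not settled API:

```python
from dataclasses import dataclass, fields

# Default thresholds; the concrete values are the ones proposed below.
_DEFAULT_MAX_INPUT_BYTES = 1 * 1024**3       # 1 GiB
_DEFAULT_MAX_EXTRACTED_BYTES = 4 * 1024**3   # 4 GiB
_DEFAULT_MAX_MEMBERS = 100_000


@dataclass(frozen=True)
class SandboxArchiveLimits:
    """Archive resource limits; None on a field disables that limit."""

    max_input_bytes: int | None = _DEFAULT_MAX_INPUT_BYTES
    max_extracted_bytes: int | None = _DEFAULT_MAX_EXTRACTED_BYTES
    max_members: int | None = _DEFAULT_MAX_MEMBERS

    def __post_init__(self) -> None:
        # Validate non-None values as positive integers.
        for f in fields(self):
            value = getattr(self, f.name)
            if value is not None and (not isinstance(value, int) or value <= 0):
                raise ValueError(f"{f.name} must be a positive integer or None")
```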
I would also like the default thresholds used by `SandboxArchiveLimits()` to be less conservative than the current hard-coded values. For example:

- `max_input_bytes` = 1 GiB
- `max_extracted_bytes` = 4 GiB
- `max_members` = 100_000
I do not think 512 MiB for both input bytes and extracted bytes is a good default. Those limits protect different resources, and compressed archives commonly expand beyond their input size. I also think 10,000 members is low for legitimate sandbox workloads that include dependency trees, generated files, logs, datasets, media assets, or workspace restore/migration archives.
This keeps resource limits available for applications that handle untrusted archives, while avoiding a default behavior change for applications that currently extract large trusted archives.
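One wrinkle worth calling out for the per-call override: `extract(..., archive_limits=None)` should mean "disable limits for this call", which is different from not passing the argument at all. A sentinel default keeps those cases apart; this helper is hypothetical plumbing, not the actual SDK code:

```python
from typing import Any

_UNSET: Any = object()  # sentinel: "argument not passed"


def resolve_archive_limits(session_default: Any, per_call: Any = _UNSET) -> Any:
    if per_call is _UNSET:
        # No per-call override: fall back to the session/config default,
        # which is None (no enforcement) unless the app opted in.
        return session_default
    # An explicit per-call value wins, including an explicit None.
    return per_call
```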
Please add tests covering (two of these are sketched below):

- Default `archive_limits=None` preserving the existing no-resource-limit behavior.
- `SandboxArchiveLimits()` enabling the SDK default thresholds.
- `SandboxRunConfig.archive_limits` being applied to Runner-created, resumed, and injected sessions.
- `BaseSandboxSession.extract(..., archive_limits=...)` overriding the session default.
- Invalid non-positive limits being rejected.
- `None` disabling only the selected limit when a `SandboxArchiveLimits` object is provided.
- Error context reporting the configured limit value.
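As a starting point, here is a sketch of two of those tests; it assumes the `SandboxArchiveLimits` dataclass sketched above, and the import path is hypothetical:

```python
import pytest

from sdk.sandbox import SandboxArchiveLimits  # hypothetical module path


def test_default_thresholds():
    # SandboxArchiveLimits() enables the SDK default thresholds.
    limits = SandboxArchiveLimits()
    assert limits.max_input_bytes == 1 * 1024**3
    assert limits.max_extracted_bytes == 4 * 1024**3
    assert limits.max_members == 100_000


def test_non_positive_limit_rejected():
    # Non-None limit values must be positive integers.
    with pytest.raises(ValueError):
        SandboxArchiveLimits(max_members=0)
```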
Summary
- Limit sandbox archive extraction: cap member count and declared extracted size in `extract_archive()`.

Test plan
- `uv run pytest tests/sandbox/test_extract.py -k "extract"`
- `bash .agents/skills/code-change-verification/scripts/run.sh`

Issue number
Closes #3274
Checks
- `make lint` and `make format`