
fix: #3274 limit sandbox archive extraction #3278

Open

Aphroq wants to merge 2 commits into openai:main from Aphroq:fix/archive-extraction-resource-limits

Conversation

@Aphroq Aphroq (Contributor) commented May 8, 2026

Summary

  • Add an archive input byte limit to public extract_archive().
  • Preflight tar/zip member count and declared extracted-byte totals before writing extracted payloads.
  • Add small-limit regression tests for input size, tar/zip extracted bytes, and tar/zip member count.
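The preflight described above can be sketched for the tar case roughly as follows. The constant names match the values shown later in this PR's first revision, but the function name, signature, and error messages here are illustrative, not the actual `extract_archive()` internals:

```python
import io
import tarfile

# Thresholds mirroring the constants in this PR's first revision.
MAX_ARCHIVE_MEMBERS = 10_000
MAX_ARCHIVE_EXTRACTED_BYTES = 512 * 1024 * 1024


def preflight_tar(
    data: bytes,
    *,
    max_members: int = MAX_ARCHIVE_MEMBERS,
    max_extracted_bytes: int = MAX_ARCHIVE_EXTRACTED_BYTES,
) -> None:
    """Walk the tar index and reject the archive before any payload is written."""
    members = 0
    declared_bytes = 0
    with tarfile.open(fileobj=io.BytesIO(data)) as tf:
        for info in tf:
            members += 1
            declared_bytes += max(info.size, 0)
            if members > max_members:
                raise ValueError(f"archive exceeds {max_members} members")
            if declared_bytes > max_extracted_bytes:
                raise ValueError(
                    f"archive declares more than {max_extracted_bytes} extracted bytes"
                )
```

Because the check walks only tar headers, it rejects oversized archives before any member payload is written out. The zip path can do the same with `zipfile.ZipFile.infolist()` and each entry's declared `file_size`.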

Test plan

  • uv run pytest tests/sandbox/test_extract.py -k "extract"
  • bash .agents/skills/code-change-verification/scripts/run.sh

Issue number

Closes #3274

Checks

  • I've added new tests (if relevant)
  • I've added/updated the relevant documentation
  • I've run make lint and make format
  • I've made sure tests pass

@github-actions github-actions Bot added the bug (Something isn't working) and feature:sandboxes labels May 8, 2026
@Aphroq Aphroq changed the title fix: limit sandbox archive extraction fix: #3274 limit sandbox archive extraction May 8, 2026
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 58202dfe6e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread on src/agents/sandbox/session/archive_extraction.py (Outdated)
@Aphroq Aphroq (Contributor, Author) commented May 8, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Nice work!

Comment on lines +17 to +18
MAX_ARCHIVE_MEMBERS = 10_000
MAX_ARCHIVE_EXTRACTED_BYTES = 512 * 1024 * 1024
Member commented:

how did you decide these magic numbers?

@Aphroq Aphroq (Contributor, Author) commented May 8, 2026

These numbers are not derived from a published sandbox quota, and this extraction path did not have an existing archive-size cap to copy from.

I picked 512 MiB for both byte-size defaults because it leaves reasonable room for typical source/workspace archives while giving the local spool, the workspace write, and the declared extraction size a clear upper bound. The input cap protects the local spool/workspace write before archive validation runs; the extracted-size cap is a second layer for archives that are acceptable as input but declare a much larger extraction footprint.

Both byte-size caps use 512 MiB right now to keep the first version simple and conservative, not because compressed and extracted sizes need to match exactly. If we want to leave more room for normal archive expansion, the extracted-size cap could be higher than the input cap.

The 10k member cap is separate. I chose that to stay out of the way for ordinary source archives while still stopping cases where the cost is mostly the number of entries, like many empty or tiny files.

That said, I’m not attached to the exact thresholds; happy to adjust them if you’d prefer a tighter or looser default. If we decide these should be user-tunable, I think the design should be an explicit limits object rather than a hidden global setting.

Member commented:

Thanks for addressing the tar streaming issue. I think the remaining concern is that the PR turns archive resource limits into public behavior, but only exposes them as hard-coded module globals.

Could you please make these limits explicitly configurable and avoid changing the default behavior?

Suggested API shape:

  • Add a small dataclass such as SandboxArchiveLimits with:
    • max_input_bytes: int | None
    • max_extracted_bytes: int | None
    • max_members: int | None
  • Add archive_limits: SandboxArchiveLimits | None = None to SandboxRunConfig, appended after the existing fields to preserve positional compatibility.
  • Treat archive_limits=None as "do not enforce SDK archive resource limits", preserving the current behavior.
  • Treat SandboxArchiveLimits() as "enable archive resource limits with SDK default thresholds".
  • Treat None on an individual SandboxArchiveLimits field as "disable this specific limit".
  • Validate non-None values as positive integers.
  • Also allow a per-call override on BaseSandboxSession.extract(..., archive_limits=...), since extract() is a direct public entry point outside Runner.

I would also like the default thresholds used by SandboxArchiveLimits() to be less conservative than the current hard-coded values. For example:

  • max_input_bytes = 1 GiB
  • max_extracted_bytes = 4 GiB
  • max_members = 100_000

I do not think 512 MiB for both input bytes and extracted bytes is a good default. Those limits protect different resources, and compressed archives commonly expand beyond their input size. I also think 10,000 members is low for legitimate sandbox workloads that include dependency trees, generated files, logs, datasets, media assets, or workspace restore/migration archives.

This keeps resource limits available for applications that handle untrusted archives, while avoiding a default behavior change for applications that currently extract large trusted archives.
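Putting the suggested API shape and thresholds together, the dataclass might look roughly like this. The class name, field names, and default values come from the suggestion above; the validation details are an assumption:

```python
from __future__ import annotations

from dataclasses import dataclass, fields

# Defaults follow the thresholds suggested above.
DEFAULT_MAX_INPUT_BYTES = 1 * 1024**3        # 1 GiB
DEFAULT_MAX_EXTRACTED_BYTES = 4 * 1024**3    # 4 GiB
DEFAULT_MAX_MEMBERS = 100_000


@dataclass(frozen=True)
class SandboxArchiveLimits:
    """Archive resource limits; None on a field disables that specific limit."""

    max_input_bytes: int | None = DEFAULT_MAX_INPUT_BYTES
    max_extracted_bytes: int | None = DEFAULT_MAX_EXTRACTED_BYTES
    max_members: int | None = DEFAULT_MAX_MEMBERS

    def __post_init__(self) -> None:
        # Validate non-None values as positive integers.
        for f in fields(self):
            value = getattr(self, f.name)
            if value is not None and (not isinstance(value, int) or value <= 0):
                raise ValueError(
                    f"{f.name} must be a positive int or None, got {value!r}"
                )
```

With this shape, `archive_limits=None` on SandboxRunConfig keeps the current unlimited behavior, `SandboxArchiveLimits()` opts into the defaults, and `SandboxArchiveLimits(max_members=None)` disables only the member cap.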

Please add tests covering:

  • Default archive_limits=None preserving the existing no-resource-limit behavior.
  • SandboxArchiveLimits() enabling the SDK default thresholds.
  • SandboxRunConfig.archive_limits being applied to Runner-created, resumed, and injected sessions.
  • BaseSandboxSession.extract(..., archive_limits=...) overriding the session default.
  • Invalid non-positive limits being rejected.
  • None disabling only the selected limit when a SandboxArchiveLimits object is provided.
  • Error context reporting the configured limit value.

@seratch seratch added this to the 0.17.x milestone May 8, 2026

Labels

bug (Something isn't working), feature:sandboxes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Archive extraction should enforce resource limits before writing payloads

2 participants