feat(base): disable proactive memory compaction in guest by ValentaTomas · Pull Request #2553 · e2b-dev/infra

ValentaTomas · 2026-05-04T01:21:51Z

Adds vm.compaction_proactiveness=0 to the base template's /etc/sysctl.conf so kcompactd no longer runs background page migrations in the guest.

With 2 MiB host-side hugepage backing of guest RAM, every migration dirties a destination hugepage from the host UFFD's perspective and lands in the next memfile diff — with no snapshot-aligned benefit. The pre-pause compact_memory write (#2551) does the work deterministically right before we capture state.
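A rough back-of-the-envelope sketch of why this matters, assuming 4 KiB guest pages (an assumption, not stated in this PR) and the worst case where every migrated page lands in a different host hugepage. The function name `worst_case_dirty_mib` is hypothetical and only illustrates the arithmetic:

```shell
#!/bin/sh
# Worst-case dirty amplification from guest page migrations.
GUEST_PAGE_KIB=4      # assumed guest base page size (not stated in the PR)
HUGEPAGE_KIB=2048     # 2 MiB host hugepage, from the PR description

# Upper bound (MiB) on host memory marked dirty when each of $1 migrated
# pages lands in a distinct destination hugepage.
worst_case_dirty_mib() {
    echo $(( $1 * HUGEPAGE_KIB / 1024 ))
}

echo "amplification per migration: $(( HUGEPAGE_KIB / GUEST_PAGE_KIB ))x"
echo "1000 migrations: up to $(worst_case_dirty_mib 1000) MiB dirtied"
```

Under these assumptions, 1000 migrated pages carry roughly 3.9 MiB of data but could mark up to 2000 MiB as dirty. Real compaction tends to cluster destination pages, so the actual figure is likely well below this bound.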

Existing templates inherit the change on rebuild.

kcompactd's background migrations dirty destination 2 MiB host hugepages without producing a snapshot-aligned benefit, inflating memfile diffs. We trigger compaction explicitly pre-pause instead.
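For illustration, a minimal sketch of appending the setting to a sysctl.conf-style file without duplicating it on repeated runs. `ensure_sysctl` is a hypothetical helper and is not the line added to provision.sh; the demo writes to a scratch file so it runs without root:

```shell
#!/bin/sh
# Append a KEY=VALUE line to a sysctl.conf-style file unless the exact
# line is already present. Hypothetical helper, not the provision.sh code.
ensure_sysctl() {
    conf=$1
    setting=$2
    # -x: match the whole line, -F: fixed string, -q: quiet
    grep -qxF "$setting" "$conf" 2>/dev/null || printf '%s\n' "$setting" >> "$conf"
}

# Demo against a scratch file instead of /etc/sysctl.conf.
conf=$(mktemp)
ensure_sysctl "$conf" "vm.compaction_proactiveness=0"
ensure_sysctl "$conf" "vm.compaction_proactiveness=0"   # second call is a no-op
cat "$conf"
```

On a running guest, the effective value can be read from /proc/sys/vm/compaction_proactiveness on kernels that expose this tunable.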

cursor · 2026-05-04T01:21:57Z

PR Summary

Medium Risk
Kernel memory-tuning change applied to all rebuilt base templates could increase memory fragmentation or allocation latency under certain workloads; no direct security impact.

Overview
Updates the base template provisioning to append vm.compaction_proactiveness=0 to /etc/sysctl.conf, disabling kcompactd’s background memory compaction so snapshot diffs aren’t inflated by host-visible page migrations and compaction can be triggered deterministically before pause.

Reviewed by Cursor Bugbot for commit 479a399. Bugbot is set up for automated code reviews on this repo.

claude

LGTM, simple sysctl tweak with clear rationale tied to the pre-pause compaction trigger.

Overview

Single-line addition to packages/orchestrator/pkg/template/build/phases/base/provision.sh that appends vm.compaction_proactiveness=0 to /etc/sysctl.conf in the base template, disabling kcompactd background page migration in the guest.

Security risks

None. This is a kernel tunable inside the guest VM that controls when the kernel does background memory compaction. It does not affect auth, isolation, or any host-facing surface.

Level of scrutiny

Low. The change is purely a guest-side performance/snapshot-size optimization with a well-explained rationale: with 2 MiB hugepage backing, kcompactd-driven page migrations dirty destination hugepages from the host UFFD perspective and bloat memfile diffs without workload benefit. The PR description references the companion PR (#2551) that adds an explicit pre-pause compact_memory write so compaction still happens deterministically at the right moment.

Other factors

The pattern matches the existing fs.inotify.max_user_watches line directly above it, the comment explains the why clearly, and existing templates inherit the change only on rebuild so rollout is gradual. No bug reports were filed.

matthewlouisbrockman

lgtm, we need to make sure 2551 goes out at same time/first?

Adds an opt-in pre-pause step that runs `sync`, `drop_caches`, `compact_memory`, and `fstrim -av` on the live VM via envd's Process service to shrink the memfile/rootfs diff.

Each step is wrapped in `timeout -s KILL` with its own cap, so a stuck step (most realistically a slow `sync` on a large dirty backlog) cannot starve the rest, and a killed step does not abort the chain (`;`-separated, not `&&`). Pausing FC is unaffected by an in-flight guest `sync` we time out: FC only drains in-flight virtio I/O before completing the pause; any unflushed dirty pages stay in the memfile snapshot and converge on resume. Per-step timeouts trade reclaim payoff, never correctness: `drop_caches` is documented non-destructive, `fstrim` consults FS allocation metadata not pagecache, and a partial `compact_memory` is just less-compacted.

Disabled by default: the LD flag's null default leaves every step at 0 (skipped). Missing keys, zero, negative, and wrong-type values all collapse to "skip". The orchestrator skips the envd call entirely when the chain is empty. The outer `Connect-Timeout-Ms` is the sum of per-step caps plus a small slack.

Single LD flag, one rule per cohort:

- `guest-pause-reclaim` (JSON): per-step caps in milliseconds keyed by step name, evaluated against sandbox / team / template LD contexts so targeting is configured in LaunchDarkly.

Example value:

```json
{"sync":500,"drop_caches":200,"compact_memory":1000,"fstrim":500}
```

`resume-build` exposes `-reclaim` to inject the example values into the offline LD store for local testing.

Pairs cleanly with #2553 (disable proactive compaction in the guest base image), but is independent of it and of FPH (#2552). Split out from #2550.
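The per-step wrapping described above could be sketched roughly as follows. This is not the orchestrator's implementation: the helper names (`ms_to_s`, `add_step`, `build_chain`), the `drop_caches` value of 3, and the 100 ms slack are all assumptions for illustration. The script only builds and prints the chain; it does not execute it.

```shell
#!/bin/sh
# Build a ';'-separated reclaim chain from per-step caps given in milliseconds.
# All helper names here are hypothetical.

# Convert whole milliseconds to a seconds string such as 0.500.
ms_to_s() {
    printf '%d.%03d' "$(( $1 / 1000 ))" "$(( $1 % 1000 ))"
}

# Append one step to CHAIN unless its cap is missing, zero, negative,
# or not a plain positive integer.
add_step() {
    cap=$1
    shift
    case "$cap" in
        ''|*[!0-9]*|0*) return 0 ;;   # skip: empty, non-digit (incl. '-'), zero
    esac
    step="timeout -s KILL $(ms_to_s "$cap") $*"
    CHAIN="${CHAIN:+$CHAIN; }$step"   # ';' so a killed step does not abort the rest
    TOTAL_MS=$(( TOTAL_MS + cap ))
}

# Arguments: caps for sync, drop_caches, compact_memory, fstrim (in that order).
build_chain() {
    CHAIN=""
    TOTAL_MS=0
    add_step "$1" sync
    add_step "$2" "sh -c 'echo 3 > /proc/sys/vm/drop_caches'"    # value 3 is assumed
    add_step "$3" "sh -c 'echo 1 > /proc/sys/vm/compact_memory'"
    add_step "$4" fstrim -av
}

build_chain 500 200 1000 500
printf '%s\n' "$CHAIN"
echo "outer timeout (ms): $(( TOTAL_MS + 100 ))"   # 100 ms slack is an assumed value
```

Fractional durations such as `0.500` rely on GNU `timeout`; other implementations may need whole seconds. When every cap is skipped, `CHAIN` stays empty, which corresponds to the case where the envd call is skipped entirely.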

feat(base): disable proactive memory compaction in guest

479a399

e2b-request-same-site-reviewers Bot assigned dobrac May 4, 2026

ValentaTomas mentioned this pull request May 4, 2026

feat(sandbox): pre-pause guest reclaim via envd #2551

Merged

ValentaTomas unassigned dobrac May 4, 2026

ValentaTomas marked this pull request as ready for review May 4, 2026 01:42

ValentaTomas requested review from dobrac and jakubno as code owners May 4, 2026 01:42

e2b-request-same-site-reviewers Bot assigned tvi May 4, 2026

ValentaTomas unassigned tvi May 4, 2026

claude Bot reviewed May 4, 2026

ValentaTomas enabled auto-merge (squash) May 4, 2026 01:44

ValentaTomas requested a review from arkamar May 4, 2026 08:52

matthewlouisbrockman approved these changes May 4, 2026

ValentaTomas merged commit 0892051 into main May 4, 2026
48 checks passed

ValentaTomas deleted the feat/disable-proactive-compaction-base branch May 4, 2026 18:07
