
feat(capsule): auto-prune to bound disk usage of capsules cache #10382

Open
davidfirst wants to merge 15 commits into master from feat/capsule-auto-prune

davidfirst commented May 18, 2026

The global capsules cache (~/Library/Caches/Bit/capsules) grows unbounded — measured 46 GB across 490 subdirs on a real machine, oldest 4+ weeks old. There was zero automatic eviction. Each hashed subdir is tied to a host path, so workspaces that move/disappear leave orphans forever, and aspect-version subdirs pile up (3 versions of one env at ~450 MB each is typical).

Capsule kinds — handled differently

Every top-level subdir under the capsules root falls into one of four kinds, and each is treated differently by the prune logic:

  • Workspace capsules — caps built during workspace operations (bit build). Cheap to regenerate from a live workspace, so these are deleted unconditionally on every prune run.
  • Scope capsules — caps built when isolating from a bare-scope context (no workspace). Treated as warm cache; deleted only if last-used marker is older than --older-than (default 30d) or the origin path no longer exists.
  • Scope-aspects roots — the root containing per-aspect-version subdirs (e.g., teambit.node_envs_node-babel-mocha@0.2.6, @0.2.7, @0.2.8). The root itself is never deleted; instead each per-aspect-version child is checked individually — children whose last-used marker mtime is older than the threshold (or whose origin is gone) are evicted, current versions stay warm. This avoids cold pnpm-install on every aspect load while still bounding the unbounded version accumulation.
  • Dated-capsules dir: <root>/dated-capsules/<YYYY-M-D>/<uuid>/ holds in-flight isolation runs (used when use_dated_capsules is enabled). Dated capsules are recreated on every isolation, so anything that isn't today's date subdir is leftover from a previous run and gets deleted regardless of age. Today's subdir is preserved to avoid racing a concurrent bit process. The directory name honors the capsules_scopes_aspects_dated_dir config key.
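
The per-kind rules above can be condensed into one decision function. This is a minimal sketch that follows the description as written, not the PR's actual code: the type names, the kind strings, and `shouldEvict` itself are hypothetical. Note that a later commit in this thread drops the origin check for aspect-version children, so only the marker mtime would apply to them.

```typescript
// Hypothetical names throughout; the real types in isolator.main.runtime.ts may differ.
type CapsuleKind = "workspace" | "scope" | "scope-aspects-child" | "dated";

interface CapsuleEntry {
  kind: CapsuleKind;
  lastUsedMs: number; // mtime of the origin marker
  originExists: boolean; // whether originPath still exists on disk
  dirName: string; // basename of the directory being considered
}

interface PruneRules {
  nowMs: number;
  olderThanDays: number; // --older-than, default 30
  todayDirName: string; // today's dated-capsules subdir name
}

const DAY_MS = 24 * 60 * 60 * 1000;

function shouldEvict(entry: CapsuleEntry, rules: PruneRules): boolean {
  const cutoffMs = rules.nowMs - rules.olderThanDays * DAY_MS;
  switch (entry.kind) {
    case "workspace":
      // Cheap to regenerate from a live workspace: always deleted.
      return true;
    case "scope":
    case "scope-aspects-child":
      // Warm cache: evicted only when stale or when the origin is gone.
      return entry.lastUsedMs < cutoffMs || !entry.originExists;
    case "dated":
      // Everything except today's subdir is leftover from an earlier run.
      return entry.dirName !== rules.todayDirName;
  }
}
```

The scope-aspects root never appears as an entry here because, per the description, only its per-version children are candidates.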

Mechanism

  • Origin marker .bit-capsule-origin.json per capsule dir (kind + originPath + createdAt). Mtime is touched on every reuse so it acts as a reliable "last used" signal independent of filesystem atime (which is often disabled).
  • Fast delete: rename into a sibling .trash/<uuid>/ (O(1) on APFS/ext4) + detached portable Node subprocess running fs.rmSync. bit capsule delete now returns instantly even for multi-GB dirs.
  • bit capsule prune with --older-than, --keep-workspace-caps, --no-orphans, --size-target, --dry-run, --json. Legacy unmarked dirs are sniffed by aspect-version pattern to avoid nuking a pre-existing aspect root.
  • Auto-trigger: runs once per ~24h on onBeforeExit, gated by a stamp file, spawns a detached bit capsule prune child so the parent's exit is never delayed by the size walk.
  • bit capsule list now reports total cache size, orphan count, and stale-aspect-version count.
  • New config keys (user-settable): capsules_max_size_gb (default 10), capsules_max_age_days (default 30), capsules_auto_prune (default true).
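
For readers unfamiliar with the fast-delete trick, here is a rough sketch of the rename-then-detach idea from the list above. It is an illustration under assumptions, not the PR's scheduleFastDelete: the function names `trashPathFor` and `fastDelete` are hypothetical, and it assumes the capsule dir and its sibling `.trash` live on the same filesystem so the rename is a metadata-only move.

```typescript
import { spawn } from "node:child_process";
import { randomUUID } from "node:crypto";
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: the sibling .trash/<uuid> location for a capsule dir.
function trashPathFor(capsuleDir: string, id: string = randomUUID()): string {
  return path.join(path.dirname(capsuleDir), ".trash", id);
}

// Hypothetical helper: park the directory, then delete it in the background.
function fastDelete(capsuleDir: string): string {
  const target = trashPathFor(capsuleDir);
  fs.mkdirSync(path.dirname(target), { recursive: true });
  // The caller sees the directory disappear as soon as the rename returns.
  fs.renameSync(capsuleDir, target);

  // With `node -e <script> <arg>`, the extra argument arrives as process.argv[1].
  const script = "require('fs').rmSync(process.argv[1], { recursive: true, force: true })";
  const child = spawn(process.execPath, ["-e", script, target], {
    detached: true,
    stdio: "ignore",
  });
  child.unref(); // do not keep the parent alive waiting for the slow delete
  return target;
}
```

Using `process.execPath` rather than a shell `rm -rf` is what keeps the sketch portable to Windows, which matches the motivation given later in this thread.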

Adds origin markers, fast delete, prune command, and a once-per-24h
auto-trigger so the global capsules cache stops growing unbounded.
Workspace caps are deleted on each prune; aspect-version and scope caps
are evicted by last-used age (default 30d). New configs:
capsules_max_size_gb (10), capsules_max_age_days (30),
capsules_auto_prune (true).
Copilot AI review requested due to automatic review settings May 18, 2026 20:46

Copilot AI left a comment


Pull request overview

Adds bounded-disk-usage management to the global capsules cache by tracking per-capsule origin metadata, fast-deleting via a .trash rename + detached rm -rf, and introducing a bit capsule prune command plus an automatic ~24h trigger gated by a stamp file. Also enriches bit capsule list with cache size/orphan/stale-aspect stats and adds three new user config keys.

Changes:

  • New CapsulePruneCmd and origin-marker bookkeeping in IsolatorMain (kind, originPath, last-used mtime), plus prune/eviction/size-target logic.
  • Fast-delete pipeline: move capsule dir into sibling .trash/<uuid>/ and detach rm -rf.
  • Auto-prune onBeforeExit hook gated by .last-capsule-prune stamp, with three new config keys.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| scopes/workspace/workspace/workspace.main.runtime.ts | Wires CapsulePruneCmd into the capsule command group. |
| scopes/workspace/workspace/capsule.cmd.ts | Adds prune subcommand, formatBytes, and richer list output (size/orphans/stale). |
| scopes/component/isolator/isolator.main.runtime.ts | Core: origin markers, fast-delete via trash, prune/size-target/LRU logic, auto-prune hook. |
| scopes/component/isolator/index.ts | Re-exports new types and constants from the isolator runtime. |
| components/legacy/constants/constants.ts | Adds capsules_max_size_gb, capsules_max_age_days, capsules_auto_prune config keys. |
Comments suppressed due to low confidence (1)

scopes/component/isolator/isolator.main.runtime.ts:1252

  • discrepancy_with_pr_description: The PR description states that auto-prune is "spawned detached so it never blocks foreground", but maybeAutoPrune runs pruneCapsules in-process and awaits it inside the onBeforeExit hook. Either the implementation needs to actually detach (e.g., spawn a child process), or the PR description should be corrected.
    const report = await this.pruneCapsules({
      olderThanDays,
      sizeTargetGb,
      includeOrphans: true,
    });
    this.logger.debug(
      `[auto-prune] removed ${report.removed.length} capsule(s), freed ${report.totalRemovedBytes} bytes`
    );

Comment thread scopes/workspace/workspace/capsule.cmd.ts Outdated
Comment thread scopes/component/isolator/isolator.main.runtime.ts
Comment thread scopes/component/isolator/isolator.main.runtime.ts
- read --no-orphans as opts.noOrphans (bit CLI doesn't apply commander
  negation); previously the flag was a no-op.
- replace `rm -rf` spawn with portable `node -e fs.rmSync` so the trash
  sweep also works on Windows.
- detach auto-prune via a spawned `bit capsule prune` child so the slow
  size walk doesn't delay every bit command's exit once per day.
The dated-capsules dir (`<root>/dated-capsules/<YYYY-M-D>/<uuid>/`) holds
in-flight isolation runs. Its parent mtime is bumped on every new
isolation, so it never aged out under the standard rule. Walk one level
deep and prune individual date subdirs older than `--older-than`.
Honors the `capsules_scopes_aspects_dated_dir` config.
Copilot AI review requested due to automatic review settings May 18, 2026 23:57

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment thread scopes/component/isolator/isolator.main.runtime.ts Outdated
Dated capsules are recreated on every isolation, so anything that isn't
today's YYYY-M-D subdir is leftover from a previous run and safe to
delete regardless of age. Today's subdir is preserved to avoid racing a
concurrent bit process. Drops the unused age cutoff from this path.
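
The "keep only today's subdir" rule can be sketched as below. The helper names are hypothetical, and two details are assumptions: that the `YYYY-M-D` name is unpadded and uses a 1-based month (the thread mentions `getMonth()+1`), and that it is derived from local time.

```typescript
// Assumption: names are unpadded, e.g. "2026-5-19", and based on local time.
function datedDirName(date: Date): string {
  return `${date.getFullYear()}-${date.getMonth() + 1}-${date.getDate()}`;
}

// Returns the date subdirs that are safe to delete: everything except today's.
function staleDatedDirs(dirNames: string[], now: Date): string[] {
  const today = datedDirName(now);
  return dirNames.filter((name) => name !== today);
}
```

Comparing names instead of mtimes sidesteps the problem described earlier, where the parent's mtime is bumped on every new isolation.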
Copilot AI review requested due to automatic review settings May 19, 2026 15:17

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Comment thread scopes/component/isolator/isolator.main.runtime.ts
Comment thread scopes/component/isolator/isolator.main.runtime.ts Outdated
Comment thread scopes/component/isolator/isolator.main.runtime.ts
Comment thread scopes/workspace/workspace/capsule.cmd.ts Outdated
Comment thread scopes/workspace/workspace/capsule.cmd.ts Outdated
- deriveCapsuleKind: detect bare-scope host via duck-type check on bitMap
  so non-aspect scope isolations get kind=scope (warm-cache) instead of
  workspace (deleted unconditionally).
- pruneCapsules: project totalSizeAfter from removed bytes in dry-run so
  the report no longer reads "46GB → 46GB (freed 12GB)".
- scheduleFastDelete: when dir is the global capsules root itself, skip
  the rename-to-trash dance (would put .trash inside the doomed dir) and
  just remove directly.
- CapsuleListCmd: walking size/orphan/stale stats is now gated behind
  `--with-stats`. Default `bit capsule list` is back to near-instant.
- CapsuleListCmd: the stale-aspect cutoff now reads
  CFG_CAPSULES_MAX_AGE_DAYS instead of hard-coding 30.
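
As an illustration of the dry-run fix, the projected total can be derived from the would-be-removed entries instead of re-measuring a cache that was never touched. This is a sketch only; `RemovedEntry` and `projectTotalSizeAfter` are hypothetical names, not the PR's.

```typescript
interface RemovedEntry {
  path: string;
  sizeBytes: number;
}

// In dry-run nothing is deleted, so the "after" size has to be computed, not measured.
function projectTotalSizeAfter(totalSizeBefore: number, removed: RemovedEntry[]): number {
  const freedBytes = removed.reduce((sum, entry) => sum + entry.sizeBytes, 0);
  return Math.max(0, totalSizeBefore - freedBytes);
}
```

With 46 GB before and 12 GB marked for removal, the report would then read 46GB → 34GB rather than 46GB → 46GB.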
Copilot AI review requested due to automatic review settings May 19, 2026 17:42

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comment thread scopes/component/isolator/isolator.main.runtime.ts Outdated
Comment thread scopes/component/isolator/isolator.main.runtime.ts Outdated
Comment thread scopes/component/isolator/isolator.main.runtime.ts Outdated
Comment thread scopes/workspace/workspace/capsule.cmd.ts
- registerAutoPruneHook JSDoc: clarify that the prune now runs out-of-
  process via a detached child, not in-process.
- maybeAutoPrune: accept both string `'false'` and boolean false for
  capsules_auto_prune so JSON-config users can disable it too.
- pruneDatedCapsulesChildren: drop the misleading getMonth() < 12 branch
  (always true) and use the same getMonth()+1 expression directly.
- CapsulePruneCmd: read CFG_CAPSULES_MAX_AGE_DAYS and CFG_CAPSULES_MAX_SIZE_GB
  as fallbacks for the CLI flags so manual prune and auto-prune use the
  same effective thresholds when the user has overridden the defaults.
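
The `capsules_auto_prune` handling might look roughly like this. It is a sketch: `isAutoPruneEnabled` is a hypothetical name, and treating every other value (including a missing key) as enabled is an assumption based on the stated default of true.

```typescript
// CLI-set config arrives as the string 'false'; JSON config can hold a real boolean.
function isAutoPruneEnabled(raw: unknown): boolean {
  return raw !== false && raw !== "false";
}
```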
Copilot AI review requested due to automatic review settings May 20, 2026 20:15

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (3)

scopes/workspace/workspace/capsule.cmd.ts:345

  • Same numeric-coercion issue for sizeTargetGb: Number(opts.sizeTarget) / Number(sizeTargetFromConfig) can yield NaN, which then breaks the size-target logic in pruneCapsules (no evictions will happen because the computed threshold becomes NaN). Validate that the parsed value is finite (and > 0) before passing it through.
      sizeTargetGb:
        opts.sizeTarget !== undefined
          ? Number(opts.sizeTarget)
          : sizeTargetFromConfig !== undefined
            ? Number(sizeTargetFromConfig)
            : undefined,
      dryRun: opts.dryRun === true,

scopes/component/isolator/isolator.main.runtime.ts:1507

  • pruneCapsules() computes totalSizeBefore via getCapsulesTotalSize() (which calls listAllCapsuleRoots() and walks sizes), and then immediately calls listAllCapsuleRoots() again. This doubles the most expensive part of pruning. Consider calling listAllCapsuleRoots({ withSizes: true }) once, deriving totalSizeBefore from the returned roots, and reusing that array for the prune loop.
    const datedDirName = this.configStore.getConfig(CFG_CAPSULES_SCOPES_ASPECTS_DATED_DIR) || 'dated-capsules';

    const totalSizeBefore = await this.getCapsulesTotalSize();
    const roots = await this.listAllCapsuleRoots();
    const removed: PruneCapsulesReport['removed'] = [];

scopes/component/isolator/isolator.main.runtime.ts:1662

  • applySizeTarget() calls listAllCapsuleRoots() without { withSizes: false }, so it will (by default) compute full directory sizes for every top-level cache entry again, even though this method only needs the root classification/paths and then computes sizes for each aspect child explicitly. Passing { withSizes: false } here would avoid an unnecessary full-cache walk.
    const targetBytes = sizeTargetGb * 1024 * 1024 * 1024;
    const removedPaths = new Set(removed.map((r) => r.path));
    // Re-walk what's left to find the oldest aspect-version children.
    const roots = await this.listAllCapsuleRoots();
    const aspectChildren: Array<{ path: string; lastUsedMs: number; sizeBytes: number }> = [];
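
For context, the LRU step that this excerpt leads into could be sketched as follows. The `AspectChild` shape mirrors the array element type in the excerpt; the function name and the exact stopping rule are assumptions, not the PR's applySizeTarget.

```typescript
interface AspectChild {
  path: string;
  lastUsedMs: number;
  sizeBytes: number;
}

// Evict the least recently used children until the cache fits under the target.
function pickEvictionsForSizeTarget(
  children: AspectChild[],
  currentTotalBytes: number,
  sizeTargetGb: number
): AspectChild[] {
  const targetBytes = sizeTargetGb * 1024 * 1024 * 1024;
  const oldestFirst = [...children].sort((a, b) => a.lastUsedMs - b.lastUsedMs);
  const evicted: AspectChild[] = [];
  let remainingBytes = currentTotalBytes;
  for (const child of oldestFirst) {
    if (remainingBytes <= targetBytes) break;
    evicted.push(child);
    remainingBytes -= child.sizeBytes;
  }
  return evicted;
}
```

This also shows why a NaN target is dangerous: `remainingBytes <= NaN` is always false, so the loop would evict every child instead of stopping at the target.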

Comment thread scopes/workspace/workspace/capsule.cmd.ts Outdated
Comment thread scopes/component/isolator/isolator.main.runtime.ts
Comment thread scopes/component/isolator/isolator.main.runtime.ts
Comment thread scopes/workspace/workspace/capsule.cmd.ts
- toFiniteNumber helper: guard against empty/non-numeric config values
  (which would otherwise become NaN and silently disable age/size
  enforcement). Used in both maybeAutoPrune and CapsulePruneCmd.
- listAllCapsuleRoots: switch unbounded Promise.all to bounded pMap so
  large caches (hundreds of subdirs with many files each) don't risk
  EMFILE from concurrent recursive size walks.
- computeDirSize: same bounded-concurrency switch inside the recursive
  walk itself.
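
To illustrate the bounded-concurrency switch, here is a small hand-rolled stand-in for pMap. It is not the p-map library and not the PR's code; `mapWithLimit` is a hypothetical helper that only shows the idea of capping in-flight work so recursive size walks cannot open an unbounded number of file handles.

```typescript
// Runs fn over items with at most `limit` calls in flight; results keep input order.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker pulls the next unclaimed index until the input is exhausted.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index], index);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Compared with a bare `Promise.all(items.map(fn))`, which starts every call immediately, only `limit` calls are ever pending at once.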

E2E coverage for `bit capsule prune` is a worthwhile follow-up but not
included in this PR.
Copilot AI review requested due to automatic review settings May 20, 2026 21:25

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comment thread scopes/workspace/workspace/capsule.cmd.ts Outdated
Comment thread scopes/workspace/workspace/capsule.cmd.ts Outdated
Comment thread scopes/component/isolator/isolator.main.runtime.ts Outdated
Comment thread scopes/component/isolator/isolator.main.runtime.ts Outdated
- CapsuleListCmd --with-stats: use toFiniteNumber + Math.max(0,...) for
  capsules_max_age_days so a corrupt config value can't yield NaN/0d in
  the printed cutoff label.
- bit capsule list -j: omit totalSizeBytes/allRoots unless --with-stats
  is set, instead of returning misleading zeros.
- pruneCapsules: walk the cache once. listAllCapsuleRoots already
  reports sizeBytes, so derive totalSizeBefore from that rather than
  calling getCapsulesTotalSize (which would re-walk).
- readOriginMarker: validate kind against the known set. Markers with
  an unknown kind (corrupt or from a future version) now fall through
  to the unmarked path instead of being silently skipped by prune.
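
The marker validation could be sketched like this. Only the field names (kind, originPath, createdAt) come from the PR description; the kind strings, the `createdAt` type, and the function name are assumptions for illustration.

```typescript
// Assumed kind values; the real set lives in the isolator runtime.
const KNOWN_KINDS = ["workspace", "scope", "scope-aspects"] as const;
type MarkerKind = (typeof KNOWN_KINDS)[number];

interface OriginMarker {
  kind: MarkerKind;
  originPath: string;
  createdAt: string; // assumed to be a string timestamp
}

// Returns undefined for anything that is not a well-formed marker, so the
// caller can treat the directory as unmarked instead of skipping it.
function parseOriginMarker(json: string): OriginMarker | undefined {
  let data: unknown;
  try {
    data = JSON.parse(json);
  } catch {
    return undefined;
  }
  if (typeof data !== "object" || data === null) return undefined;
  const { kind, originPath, createdAt } = data as Record<string, unknown>;
  if (typeof kind !== "string" || !(KNOWN_KINDS as readonly string[]).includes(kind)) return undefined;
  if (typeof originPath !== "string" || typeof createdAt !== "string") return undefined;
  return { kind: kind as MarkerKind, originPath, createdAt };
}
```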
bit capsule prune was blocking 5+ minutes on multi-GB caches because
pruneCapsules called listAllCapsuleRoots({ withSizes: true }), which
recursively lstats every file before doing any deletion. The deletes
themselves are O(1) renames (scheduleFastDelete), so the size walk was
the only slow part.

New behavior:
- pruneCapsules takes a withSizes flag (default false). When false, all
  sizeBytes are reported as 0 and the cache walk skips computeDirSize.
- --size-target still forces sizes on (mandatory for LRU enforcement).
- CapsulePruneCmd exposes a --with-sizes flag for users who want byte
  accounting in the report.
- The report omits the byte summary line when sizes weren't computed
  and hints how to get it.

Auto-prune is unaffected (it sets --size-target, which still walks).
Copilot AI review requested due to automatic review settings May 21, 2026 15:59

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment on lines +1256 to +1278
    const ONE_DAY_MS = 24 * 60 * 60 * 1000;
    try {
      const stat = await fs.stat(stampPath);
      if (Date.now() - stat.mtime.getTime() < ONE_DAY_MS) return;
    } catch {
      // missing — first run, fall through and prune
    }
    // Write the stamp first to claim the slot — even if the spawn fails, we don't retry
    // within 24h. The detached child sees this recent stamp on its own exit and skips its
    // own auto-prune, so no recursion.
    await fs.outputFile(stampPath, '');

    // Guard against non-numeric or empty config values: a stray string would otherwise
    // become NaN and silently disable age/size enforcement.
    const maxAgeRaw = this.configStore.getConfig(CFG_CAPSULES_MAX_AGE_DAYS);
    const olderThanDays = toFiniteNumber(maxAgeRaw) ?? 30;
    const maxSizeRaw = this.configStore.getConfig(CFG_CAPSULES_MAX_SIZE_GB);
    const sizeTargetGb = toFiniteNumber(maxSizeRaw) ?? 10;

    this.logger.debug(
      `[auto-prune] spawning detached child. olderThanDays=${olderThanDays}, sizeTargetGb=${sizeTargetGb}`
    );
    this.spawnDetachedAutoPrune(olderThanDays, sizeTargetGb);
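
The excerpt relies on toFiniteNumber, whose body is not shown on this page. One plausible shape is sketched below; the behavior for each input type is a guess, chosen so that `toFiniteNumber(raw) ?? 30` falls back to the default for empty, missing, or non-numeric config values.

```typescript
// A guess at the helper: returns a finite number, or undefined for anything else.
function toFiniteNumber(raw: unknown): number | undefined {
  if (typeof raw === "number") return Number.isFinite(raw) ? raw : undefined;
  // Number("") and Number("  ") are 0, so blank strings must be rejected explicitly.
  if (typeof raw !== "string" || raw.trim() === "") return undefined;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : undefined;
}
```

Returning undefined rather than NaN is what lets the nullish-coalescing fallback in the excerpt do its job, since `NaN ?? 30` would evaluate to NaN.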
Comment on lines +1532 to +1542
    async pruneCapsules(opts: PruneCapsulesOptions = {}): Promise<PruneCapsulesReport> {
      const olderThanDays = opts.olderThanDays ?? 30;
      const includeOrphans = opts.includeOrphans !== false;
      const keepWorkspaceCaps = opts.keepWorkspaceCaps === true;
      const dryRun = opts.dryRun === true;
      // Size accounting requires an expensive recursive lstat across the whole cache. Skip
      // it by default so the foreground command returns in ms (deletes are O(1) renames);
      // force on for size-target enforcement and when the caller asks for byte accounting.
      const computeSizes = opts.withSizes === true || opts.sizeTargetGb !== undefined;
      const ageCutoffMs = Date.now() - olderThanDays * 24 * 60 * 60 * 1000;
      const datedDirName = this.configStore.getConfig(CFG_CAPSULES_SCOPES_ASPECTS_DATED_DIR) || 'dated-capsules';
Comment on lines +208 to +213
    for (const entry of entries) {
      if (!entry.isDirectory() || entry.name === 'node_modules' || entry.name.startsWith('.')) continue;
      const childPath = path.join(rootPath, entry.name);
      const markerPath = path.join(childPath, '.bit-capsule-origin.json');
      try {
        const stat = await fs.stat(markerPath);
Two bugs were combining to thrash disk:

1. orphan-check wrongly flagged every scope-aspect subdir. Their marker
   stores `originPath` = the *logical* scope-aspects path (e.g.
   <scope.path>-aspects) used only to hash a capsule dir name; it does
   not have to exist as a real directory. The old check treated any
   non-existent originPath as orphan and deleted the capsule. With
   per-aspect-version subdirs that meant we deleted every current
   aspect cap on every prune. Now: scope-aspect children are pruned
   purely by marker mtime (touched on every aspect load).

2. sweepTrashAsync ran unconditionally on every isolator
   construction, so each `bit` invocation (server-forever, e2e
   workers, compile, etc.) spawned its own detached `rm -rf .trash`.
   We observed 1,409 concurrent sweep processes saturating disk I/O,
   blocking foreground bit commands. Now: a PID-stamped lock file
   ensures at most one sweep runs across all bit processes, and the
   sweep is skipped entirely when `.trash` is empty. The child clears
   the lock on exit; stale locks (dead PID) are reclaimed.
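
A PID-stamped lock with stale-lock reclaim can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the function names are hypothetical, and the sketch writes the current process's PID, whereas the PR may record the sweeping child's PID instead. The reclaim step is best-effort and leaves a small race window between the liveness check and the rewrite.

```typescript
import * as fs from "node:fs";

// Signal 0 performs the existence check without delivering a signal.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// The "wx" flag makes creation fail with EEXIST if the lock file already exists.
function writeLockExclusive(lockPath: string): boolean {
  try {
    fs.writeFileSync(lockPath, String(process.pid), { flag: "wx" });
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EEXIST") return false;
    throw err;
  }
}

function tryAcquireSweepLock(lockPath: string): boolean {
  if (writeLockExclusive(lockPath)) return true;
  let holder = NaN;
  try {
    holder = Number(fs.readFileSync(lockPath, "utf8").trim());
  } catch {
    // The lock vanished between the two calls; fall through and retry once.
  }
  if (Number.isInteger(holder) && holder > 0 && isProcessAlive(holder)) return false;
  // Dead or unreadable holder: reclaim the stale lock.
  fs.rmSync(lockPath, { force: true });
  return writeLockExclusive(lockPath);
}
```

A caller would skip the sweep entirely when this returns false, which is what keeps the number of concurrent sweep processes at one.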