Init is slow on large projects because every file is byte-copied serially. Recent fix 868874d (excluding .npm-cache) reduced scope but didn't address the per-file cost. Below are four orthogonal improvements; each can ship independently.
Source: packages/local-mount/src/mount.ts
1. Hardlink read-only files instead of copying
Where: copyMountedFile (mount.ts:310) — currently copyFileSync + chmodSync(safeMountPath, 0o444) for readonly matches.
Read-only files are chmod 0o444 and never written from inside the mount, so a hardlink (fs.linkSync) is semantically equivalent and is a pure metadata op. That removes the dominant cost on repos where most files are gitignored-but-readonly (lockfiles, vendored deps, etc.).
- Use linkSync(source, target) when isPathMatched(relativePath, readonlyMatcher) is true.
- Fall back to copyFileSync if linkSync fails with EXDEV (cross-volume) or EPERM.
- The existing 0o444 chmod cannot be applied to a hardlink independently: the mode lives on the inode shared with the source, so chmodding the link also changes the source file's mode. Verify and drop the chmod on the readonly path, keeping in mind that the linked file then carries whatever mode the source has.
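The link-then-fall-back behaviour described above could look roughly like the sketch below. linkOrCopy is a hypothetical helper name, not the existing copyMountedFile, and the demo paths are made up; only linkSync, copyFileSync and the EXDEV/EPERM codes come from the proposal.

```typescript
import { copyFileSync, linkSync, mkdtempSync, readFileSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not the real copyMountedFile): hardlink when possible,
// copy when the filesystem refuses the link.
function linkOrCopy(source: string, target: string): "linked" | "copied" {
  try {
    linkSync(source, target); // metadata-only: no file bytes are read or written
    return "linked";
  } catch (err) {
    const code = (err as { code?: string }).code;
    // EXDEV: source and target live on different volumes.
    // EPERM: the filesystem or OS policy does not permit the hardlink.
    if (code === "EXDEV" || code === "EPERM") {
      copyFileSync(source, target);
      return "copied";
    }
    throw err; // anything else (ENOENT, EEXIST, ...) is a real error
  }
}

// Usage sketch in a throwaway temp directory (paths are illustrative only).
const demoDir = mkdtempSync(join(tmpdir(), "link-demo-"));
const demoSource = join(demoDir, "lockfile.txt");
const demoTarget = join(demoDir, "mounted-lockfile.txt");
writeFileSync(demoSource, "pinned\n");

const outcome = linkOrCopy(demoSource, demoTarget);
const targetContent = readFileSync(demoTarget, "utf8");
// When the link succeeded, both paths refer to the same inode.
const sameInode = statSync(demoSource).ino === statSync(demoTarget).ino;
console.log(outcome, sameInode);
```

Because a successful link shares the inode, this sketch deliberately performs no chmod on the target.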
2. Parallelize the walk and copy
Where: walkProjectTree (mount.ts:225) — fully synchronous recursion via readdirSync + copyFileSync.
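Switch to async fs.promises.copyFile / readdir driven by a bounded concurrency queue (~os.availableParallelism() * 4). With synchronous calls the single JS thread blocks on each syscall, so no I/O overlaps; bounded async lets multiple copies be in flight at once. Note that fs.promises calls run on the libuv thread pool (4 threads by default, tunable via UV_THREADPOOL_SIZE), so effective parallelism may be lower than the queue bound.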
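A bounded concurrency queue can be as small as a fixed set of workers pulling from a shared index. The sketch below is one possible shape; runBounded and copyAllBounded are hypothetical names, and the caller is assumed to pass a limit such as os.availableParallelism() * 4.

```typescript
import { copyFile } from "node:fs/promises";

// Hypothetical helper: runs the given tasks with at most `limit` in flight,
// returning results in the same order as the input tasks.
async function runBounded<T>(tasks: Array<() => Promise<T>>, limit: number): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task to claim; safe because JS is single-threaded

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Hypothetical wrapper: copies [source, target] pairs with bounded concurrency.
async function copyAllBounded(pairs: Array<[string, string]>, limit: number): Promise<void> {
  await runBounded(
    pairs.map(([source, target]) => () => copyFile(source, target)),
    limit,
  );
}
```

The first rejected task rejects the whole run, which is probably the right default for init; collecting per-file errors instead would be a separate design choice.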
3. Drop redundant per-file syscalls
Where: copyMountedFile (mount.ts:310) and resolveSafeCopyTarget (mount.ts:505).
For every file, the current code does roughly:
- resolveSafeCopyTarget → ensureDirectoryWithinRoot → mkdirSync(parent, recursive) + realpathSync(parent)
- Another realpathSync(parent) immediately after
- resolveVerifiedFilePath → realpathSync(source) + statSync(source)
- copyFileSync
- statSync(safeSourcePath) again at mount.ts:336 to read mode for chmod
- chmodSync
That's ~6 extra syscalls per file beyond the actual copy. Wins:
- Pre-create the destination directory tree once during the walk (we already visit every dir), then in the file path skip the ensureDirectoryWithinRoot step.
- Cache parent-dir realpaths (one entry per directory, not per file).
- For non-readonly files, skip the chmod entirely unless we actually need to preserve a non-default mode (e.g. exec bit). The double statSync on the source can go.
- Path-safety checks should still happen, but at directory-entry time once, not on every file.
Goal: in the common case, file processing is copyFile + nothing else.
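One way to move the directory work out of the per-file path is a small resolver that creates, resolves, and safety-checks each directory once and caches the result. This is a sketch under assumptions: createDirResolver and its methods are hypothetical names, and the containment check is a simplified stand-in for whatever ensureDirectoryWithinRoot does today.

```typescript
import { mkdirSync, mkdtempSync, realpathSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, sep } from "node:path";

// Hypothetical helper: one mkdir + realpath + containment check per directory,
// instead of repeating them for every file in that directory.
function createDirResolver(root: string) {
  const realRoot = realpathSync(root);
  const cache = new Map<string, string>();
  let misses = 0;

  return {
    resolve(dir: string): string {
      const cached = cache.get(dir);
      if (cached !== undefined) return cached;

      misses++;
      mkdirSync(dir, { recursive: true });
      const real = realpathSync(dir);
      // Simplified path-safety check: the resolved directory must stay inside the root.
      if (real !== realRoot && !real.startsWith(realRoot + sep)) {
        throw new Error(`Directory escapes mount root: ${dir}`);
      }
      cache.set(dir, real);
      return real;
    },
    missCount(): number {
      return misses;
    },
  };
}

// Usage sketch: the second lookup for the same directory is served from the cache.
const mountRoot = mkdtempSync(join(tmpdir(), "mount-demo-"));
const resolver = createDirResolver(mountRoot);
const libDir = join(mountRoot, "src", "lib");
const firstLookup = resolver.resolve(libDir);
const secondLookup = resolver.resolve(libDir);
console.log(firstLookup === secondLookup, resolver.missCount());
```

Caching assumes the destination tree is not modified by anything else while init runs; if that cannot be guaranteed, the per-file check may still be needed.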
4. Expand default excludes
Where: DEFAULT_EXCLUDED_DIRS (mount.ts:60) — currently ['.git', 'node_modules', '.npm-cache'].
Add common build/cache/venv directories that are rarely useful inside an agent mount and are often huge:
- target/ (Rust, Java/Maven)
- .next/ (Next.js)
- dist/, build/, out/ (generic build output)
- __pycache__/, .pytest_cache/, .mypy_cache/, .ruff_cache/ (Python)
- .venv/, venv/, env/ (Python)
- .gradle/ (Java/Gradle)
- coverage/, .nyc_output/ (test coverage)
- .turbo/, .cache/ (build caches)
- .DS_Store (macOS metadata; a single file rather than a directory, but worth filtering)
Considerations:
- Keep these as defaults but ensure callers can override (the existing excludeDirs option appends rather than replaces; verify that these semantics are still right, and consider exposing a way to opt out of a specific default).
- Document the default list in the README.
- Some of these (dist/, build/) are project-source in some repos. Decide whether to gate by presence of common project markers or just ship the list and let users opt out.
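To make the override question concrete, here is a sketch of how append semantics plus a per-default opt-out could combine. The unexcludeDirs option is purely hypothetical (it does not exist today), and the DEFAULT_EXCLUDED_DIRS constant below is a local stand-in holding only the current three entries.

```typescript
// Local stand-in for the constant in mount.ts, with the current defaults only.
const DEFAULT_EXCLUDED_DIRS: readonly string[] = [".git", "node_modules", ".npm-cache"];

interface ExcludeOptions {
  // Existing option per this issue: appended to the defaults.
  excludeDirs?: string[];
  // Hypothetical option: defaults the caller wants mounted after all.
  unexcludeDirs?: string[];
}

// Hypothetical resolver: defaults minus opt-outs, plus caller additions.
function resolveExcludedDirs(options: ExcludeOptions = {}): Set<string> {
  const optOut = new Set(options.unexcludeDirs ?? []);
  const resolved = new Set(DEFAULT_EXCLUDED_DIRS.filter((dir) => !optOut.has(dir)));
  // Explicit excludes are applied last, so they win over an opt-out of the same name.
  for (const dir of options.excludeDirs ?? []) {
    resolved.add(dir);
  }
  return resolved;
}

const resolvedExcludes = resolveExcludedDirs({
  excludeDirs: ["dist"],
  unexcludeDirs: ["node_modules"],
});
console.log([...resolvedExcludes]); // [".git", ".npm-cache", "dist"]
```

A repo where dist/ is project source would then only need to pass that name in the opt-out list, without restating the rest of the defaults.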
Out of scope for this issue
Reflinks / clonefile / same-volume mount placement — likely the biggest single win, but it's an architectural change to the mount-path default and warrants its own issue + design discussion.
A native (Rust) walker — possibly worth revisiting if these four together don't get init under ~100ms on representative repos, but not yet.
Suggested verification
- Add a benchmark test that times createMount against a synthetic tree (e.g. 10k files, 100 dirs, mix of readonly/writable).
- Capture before/after numbers in the PR descriptions for each item.
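A benchmark fixture along these lines might be enough to get comparable before/after numbers. Everything here is an assumption for illustration: buildSyntheticTree, removeTree and timeMs are hypothetical helpers, and the .lock suffix is just a made-up convention so that a readonly pattern has something to match. The createMount call itself is left to the caller because its signature is not shown in this issue.

```typescript
import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { performance } from "node:perf_hooks";

interface SyntheticTree {
  root: string;
  fileCount: number;
  readonlyCandidates: number; // files named *.lock, intended to match a readonly pattern
}

// Hypothetical fixture builder: `dirs` directories with `filesPerDir` small files each.
// Every `readonlyEvery`-th file gets a .lock suffix (an assumed naming convention).
function buildSyntheticTree(dirs: number, filesPerDir: number, readonlyEvery: number): SyntheticTree {
  const root = mkdtempSync(join(tmpdir(), "mount-bench-"));
  let fileCount = 0;
  let readonlyCandidates = 0;
  for (let d = 0; d < dirs; d++) {
    const dir = join(root, `dir-${d}`);
    mkdirSync(dir);
    for (let f = 0; f < filesPerDir; f++) {
      const isReadonlyCandidate = fileCount % readonlyEvery === 0;
      const name = isReadonlyCandidate ? `file-${f}.lock` : `file-${f}.txt`;
      writeFileSync(join(dir, name), `content ${d}/${f}\n`);
      if (isReadonlyCandidate) readonlyCandidates++;
      fileCount++;
    }
  }
  return { root, fileCount, readonlyCandidates };
}

// Removes the fixture after the benchmark run.
function removeTree(root: string): void {
  rmSync(root, { recursive: true, force: true });
}

// Times a sync or async callback in milliseconds.
async function timeMs(fn: () => void | Promise<void>): Promise<number> {
  const start = performance.now();
  await fn();
  return performance.now() - start;
}
```

For the shape suggested above, buildSyntheticTree(100, 100, 2) yields 10k files across 100 directories with half of them readonly candidates; the timed callback would wrap the createMount call against tree.root.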