macOS: indexer leaks ~64k open file descriptors per daemon (opens excluded files, never closes) → exhausts kern.maxfiles, system-wide "too many open files" #496

@martinkaru

Bug

On macOS, codegraph serve --mcp accumulates tens of thousands of open regular-file descriptors over its lifetime and never releases them. The descriptors are open REG files (verified with lsof), not inotify/kqueue watches, so this is distinct from #276. With one daemon per project root, a few daemons exhaust the system-wide open-file table (kern.num_files reaches kern.maxfiles), after which opens in any process fail with too many open files in system: Docker Desktop, Chrome and Teams all crashed simultaneously, and every shell command intermittently failed with ENFILE.

Environment

  • OS: macOS 26.3.1 (Darwin kernel 25.3.0), Apple Silicon (arm64)
  • Node: v25.2.1
  • CodeGraph: 0.9.6 (npm latest), @colbymchenry/codegraph-darwin-arm64
  • kern.maxfiles = 184320 (macOS stock default)
  • Root: large umbrella dir — several TS+Python projects, python virtualenvs, git worktrees, binary assets

Evidence

One daemon, 13 minutes after start:

$ lsof -nP -p <pid> | awk 'NR>1{print $5}' | sort | uniq -c | sort -rn
64221 REG      <-- open regular files
    4 PIPE
    3 KQUEUE   <-- only 3 watch fds; the watcher is NOT the leak
    ...
$ lsof -nP -p <pid> | awk '$5=="REG"{print $NF}' | wc -l        # 64221 total
$ lsof -nP -p <pid> | awk '$5=="REG"{print $NF}' | sort -u | wc -l  # 64162 distinct

~64k distinct files held open at once. Crucially, the held-open paths include files my .codegraph/config.json explicitly excludes, and file types not in include:

.../env/lib/python3.12/site-packages/onnx/...   (excluded: **/site-packages/**, **/env/**)
.../.worktrees/<branch>/backend/.../migrations  (excluded: **/.worktrees/**, **/migrations/**)
.../.git/lost-found/other                       (excluded: **/.git/**)
.../tests/baselines/screenshots/.../uus.png     (.png not in include — source-only)
.../2026.xlsx, raporteerimine.pdf, *.docx       (binary, not in include)

System-level at peak (multiple daemons): kern.num_files 183912 / kern.maxfiles 184320 (99.8%) → Docker accept: too many open files in system.
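To watch how close the system is to the limit, the two sysctl values quoted above can be polled and turned into a utilization figure. The following is a minimal sketch, not part of CodeGraph; the function names are hypothetical, and it assumes `sysctl -n` prints one value per line for the two keys, as it does on macOS.

```typescript
import { execFileSync } from "node:child_process";

// Fraction of the system-wide open-file table currently in use.
function fileTableUtilization(numFiles: number, maxFiles: number): number {
  return numFiles / maxFiles;
}

// Hypothetical helper: reads kern.num_files and kern.maxfiles via sysctl (macOS).
function readFileTableUtilization(): number {
  const output = execFileSync("sysctl", ["-n", "kern.num_files", "kern.maxfiles"], {
    encoding: "utf8",
  });
  const [numFiles, maxFiles] = output.trim().split(/\s+/).map(Number);
  return fileTableUtilization(numFiles, maxFiles);
}
```

With the peak numbers from this report, `fileTableUtilization(183912, 184320)` is about 0.998, which matches the 99.8% figure.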

Root cause (inferred)

The tree-walk opens an fd on every file it encounters, applies the include/exclude filter after the open, and leaks the handle for rejected files (the same acquire-then-filter shape as #276, but for fs.open/read handles in the indexer rather than inotify watches). Give-away: files correctly excluded by config, and binary files never eligible for indexing, are nonetheless held open. The leak is proportional to files walked under --path and is independent of the watcher — it reproduces with CODEGRAPH_NO_WATCH=1 (initial index + every codegraph sync still walk the tree).

Repro

  1. codegraph init on a large tree containing excluded subtrees (python venv with site-packages/, a .git/, git worktrees, many binary assets).
  2. codegraph serve --mcp --path <root>.
  3. lsof -nP -p $(pgrep -f "codegraph serve") | grep -c REG — climbs into the tens of thousands within minutes.
  4. lsof -nP -p <pid> | awk '$5=="REG"{print $NF}' | grep -E 'site-packages|/\.git/|/\.worktrees/|\.png$' — confirms excluded/binary files are held open.

Suggested fix

  • Apply the include/exclude filter to the path before any open, and read-then-close each sniffed/parsed file immediately.
  • Defensively, bound concurrent open handles in the walker so a regression degrades gracefully instead of exhausting the kernel file table.
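Both suggestions can be sketched together as follows. This is an illustration of the proposed shape, not CodeGraph's actual indexer code: `acceptPath`, `Semaphore`, `walk`, `readAndClose` and `indexTree` are hypothetical names, and the hard-coded directory and extension sets stand in for the real include/exclude glob matching from .codegraph/config.json.

```typescript
import { open, readdir } from "node:fs/promises";
import * as path from "node:path";

// Stand-in for the configured include/exclude globs (assumption, not real config).
const EXCLUDED_DIRS = new Set([".git", ".worktrees", "site-packages", "env", "migrations"]);
const INCLUDED_EXTS = new Set([".ts", ".py"]);

// Decide from the path alone, before anything is opened.
function acceptPath(relPath: string, isDirectory: boolean): boolean {
  if (relPath.split("/").some((segment) => EXCLUDED_DIRS.has(segment))) return false;
  return isDirectory || INCLUDED_EXTS.has(path.posix.extname(relPath));
}

// Minimal counting semaphore that bounds concurrently open file handles.
class Semaphore {
  private permits: number;
  private waiters: Array<() => void> = [];
  constructor(permits: number) {
    this.permits = permits;
  }
  async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }
  release(): void {
    const next = this.waiters.shift();
    if (next) next(); // hand the permit straight to the next waiter
    else this.permits++;
  }
}

// Yield accepted files; excluded directories are pruned without descending.
async function* walk(root: string, rel = ""): AsyncGenerator<string> {
  const entries = await readdir(path.join(root, rel), { withFileTypes: true });
  for (const entry of entries) {
    const childRel = path.posix.join(rel, entry.name);
    if (entry.isDirectory()) {
      if (acceptPath(childRel, true)) yield* walk(root, childRel);
    } else if (entry.isFile() && acceptPath(childRel, false)) {
      yield childRel;
    }
  }
}

// Open, read, and always close, even if the read throws.
async function readAndClose(file: string, gate: Semaphore): Promise<string> {
  await gate.acquire();
  try {
    const handle = await open(file, "r");
    try {
      return await handle.readFile({ encoding: "utf8" });
    } finally {
      await handle.close();
    }
  } finally {
    gate.release();
  }
}

// Read every accepted file with at most maxOpen handles open at once.
async function indexTree(root: string, maxOpen = 64): Promise<Map<string, number>> {
  const gate = new Semaphore(maxOpen);
  const lengths = new Map<string, number>();
  const pending: Promise<void>[] = [];
  for await (const rel of walk(root)) {
    pending.push(
      readAndClose(path.join(root, rel), gate).then((text) => {
        lengths.set(rel, text.length);
      }),
    );
  }
  await Promise.all(pending);
  return lengths;
}
```

The key property is that a rejected path never reaches `open`, so it cannot contribute a descriptor, and the semaphore caps the damage if a future change reintroduces a missing close. The default of 64 is an arbitrary illustrative value.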

Workaround

  • Reap daemons by FD count (lsof -p <pid> | wc -l, kill above a threshold) — the on-disk index survives, the daemon respawns clean.
  • CODEGRAPH_NO_WATCH=1 lowers re-walk frequency but does not eliminate it.
  • Narrowing --path to a single project reduces files walked and leak magnitude.
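The first workaround can be automated with a small watchdog. The sketch below is an assumption-laden example rather than a supported tool: `countRegularFiles` and `reapIfLeaking` are hypothetical names, the threshold is whatever the operator picks, and it counts only REG rows (the TYPE column used in the awk commands above) instead of the raw `wc -l` line count.

```typescript
import { execFileSync } from "node:child_process";

// Count lsof rows whose TYPE column (5th whitespace-separated field) is REG.
function countRegularFiles(lsofOutput: string): number {
  return lsofOutput
    .split("\n")
    .slice(1) // skip the header row
    .filter((line) => line.trim().split(/\s+/)[4] === "REG").length;
}

// Send SIGTERM to the daemon if it holds more than `threshold` regular files.
// Returns true when a signal was sent. Throws if lsof fails (e.g. pid is gone).
function reapIfLeaking(pid: number, threshold: number): boolean {
  const output = execFileSync("lsof", ["-nP", "-p", String(pid)], {
    encoding: "utf8",
    maxBuffer: 256 * 1024 * 1024, // tens of thousands of rows exceed the default buffer
  });
  if (countRegularFiles(output) <= threshold) return false;
  process.kill(pid, "SIGTERM");
  return true;
}
```

A call such as `reapIfLeaking(pid, 5000)` could be run periodically from cron or launchd; the value 5000 is only an example and should sit well above the daemon's normal steady-state count.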

Related
