Skip to content

perf: collapse importer exists+isDir into single isFile stat#872

Open
He-Pin wants to merge 1 commit into
databricks:masterfrom
He-Pin:perf/importer-single-stat
Open

perf: collapse importer exists+isDir into single isFile stat#872
He-Pin wants to merge 1 commit into
databricks:masterfrom
He-Pin:perf/importer-single-stat

Conversation

@He-Pin
Copy link
Copy Markdown
Contributor

@He-Pin He-Pin commented May 24, 2026

Motivation

async-profiler on kube-prometheus shows file-I/O syscalls account for ~15% of CPU (top leaf frames: __open 294, stat64 210, access 167, close 102, read 92). The SimpleImporter.resolve chain does os.exists(f) && !os.isDir(f) per candidate — os.exists backs onto Files.exists (POSIX access(F_OK)) and os.isDir backs onto Files.isDirectory (POSIX stat). For an existent candidate that is two syscalls. The same check is repeated again in readPath, so the success path pays four syscalls just for the resolve+read gate before the real read happens.

Modification

Replace os.exists(f) && !os.isDir(f) with os.isFile(_) in both SimpleImporter.resolve and readPath. os.isFile delegates to Files.isRegularFile, which is one stat call and catches IOException internally — so it returns false for non-existent paths without throwing. Semantics are preserved for any path a Jsonnet import would care about (regular files, including those reached through symlinks).

Result

async-profiler confirms access leaf samples eliminated entirely (167 → 0). 3-run × 200-iter A/B on kube-prometheus (JVM build, fresh Interpreter per run, no asprof attached):

baseline after delta
min 54.5 ms 52.6 ms −1.9 ms (−3.5%)
p50 58.1 ms 55.8 ms −2.3 ms (−4.0%)
mean 61.9 ms 59.2 ms −2.7 ms (−4.4%)

stat64 count goes slightly up (210 → 305) because Files.isRegularFile uses stat instead of access, but the net syscall count drops ~72 per run. Scala.js importer is user-provided JS closures so it is unaffected; Scala Native picks up the same benefit since the file lives under src-jvm-native.

Test plan

  • ./mill 'sjsonnet.jvm[3.3.7]'.compile
  • ./mill 'sjsonnet.native[3.3.7]'.compile
  • ./mill __.checkFormat
  • ./mill 'sjsonnet.jvm[3.3.7]'.test (full JVM suite passes)

## Motivation

async-profiler on kube-prometheus shows file-I/O syscalls account for ~15%
of CPU (top leaf frames: `__open` 294, `stat64` 210, `access` 167, `close`
102, `read` 92). The `SimpleImporter.resolve` chain does
`os.exists(f) && !os.isDir(f)` per candidate — `os.exists` backs onto
`Files.exists` (POSIX `access(F_OK)`) and `os.isDir` backs onto
`Files.isDirectory` (POSIX `stat`). For an existent candidate that is two
syscalls. The same check is repeated again in `readPath`, so the success
path pays four syscalls just for the resolve+read gate before the real
read happens.

## Modification

Replace `os.exists(f) && !os.isDir(f)` with `os.isFile(_)` in both
`SimpleImporter.resolve` and `readPath`. `os.isFile` delegates to
`Files.isRegularFile`, which is one `stat` call and catches `IOException`
internally — so it returns `false` for non-existent paths without
throwing. Semantics are preserved for any path a Jsonnet import would
care about (regular files, including those reached through symlinks).

## Result

async-profiler confirms `access` leaf samples eliminated entirely
(167 → 0). 3-run × 200-iter A/B on kube-prometheus (JVM build, fresh
Interpreter per run, no asprof attached):

|        | baseline | after  | delta             |
|--------|----------|--------|-------------------|
| min    | 54.5 ms  | 52.6 ms| −1.9 ms (−3.5%)   |
| p50    | 58.1 ms  | 55.8 ms| −2.3 ms (−4.0%)   |
| mean   | 61.9 ms  | 59.2 ms| −2.7 ms (−4.4%)   |

`stat64` count goes slightly up (210 → 305) because `Files.isRegularFile`
uses `stat` instead of `access`, but the net syscall count drops ~72 per
run. Scala.js importer is user-provided JS closures so it is unaffected;
Scala Native picks up the same benefit since the file lives under
`src-jvm-native`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant