Fix #10968: spawnProcess fails when RLIMIT_NOFILE exceeds int.max #10969
Poita wants to merge 6 commits into dlang:master
Conversation
When RLIMIT_NOFILE is set to unlimited (RLIM_INFINITY), `r.rlim_cur` is a huge value (e.g. 2^63-1). The `cast(int)` on this value wraps to -1, which causes the /dev/fd fast path to be skipped (since -1 < 128K) and the `poll()` path to attempt a massive malloc that fails. This manifests as "Failed to allocate memory (Cannot allocate memory)" on any process spawn, making dub completely unusable on systems with unlimited file descriptor limits (common on macOS).

Fix by:
- Using `long` instead of `cast(int)` for `maxDescriptors`
- Always trying /dev/fd enumeration first (it's the most efficient path and works regardless of the limit value)
- Capping the slow `close()` fallback to 1M descriptors to avoid iterating over billions when the limit is huge

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Regression test that sets RLIMIT_NOFILE above int.max and verifies process spawning still works. Without the previous fix, this triggers "Failed to allocate memory" due to integer overflow in the fd-closing code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Thanks for your pull request and interest in making D better, @Poita! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references
Your PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally
If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + phobos#10969"
- Drop explicit `long` type, use inferred type from r.rlim_cur (0xEAB)
- Revert else+if merge back to original else { if ... } structure (0xEAB)
- Guard test with static if (rlim_t.sizeof > int.sizeof) for platform
safety, and use cast(rlim_t) instead of cast(ulong) (ibuclaw)
Address review feedback from 0xEAB.
```d
// When rlim_cur is huge (e.g. unlimited), cap to avoid
// iterating over billions of file descriptors.
immutable closeMax = cast(int)
    (maxDescriptors > 1_048_576 ? 1_048_576 : maxDescriptors);
```
Where is that "magic constant number" coming from? How was it chosen?
Honestly, somewhat arbitrarily, but motivated by:
- This path is only taken if maxDescriptors is >100k.
- If maxDescriptors is unlimited, closing all would take essentially forever.
- In practice, it is very unlikely for a real system to have >1M FDs actually open, so this is effectively a "close everything".
- Finally, basically no modern system would ever go down this path anyway, since they'll all have /dev/fd; only some very obscure systems would hit this.

So it's just a "sane limit" as the comment says, but could reasonably be something else. Just needed to choose something.
Fix #10968
Summary
When `RLIMIT_NOFILE` is unlimited or exceeds `int.max`, `spawnProcess` fails with "Cannot allocate memory" due to integer overflow in the fd-closing code after `fork()`.

Changes
- Use `long` instead of `cast(int)` for `maxDescriptors` to preserve the actual `rlim_cur` value
- Try `/dev/fd` (or `/proc/self/fd`) enumeration first, regardless of the limit value; this is the most efficient path and avoids the `malloc` entirely
- Add a regression test that sets `RLIMIT_NOFILE` above `int.max` and verifies process spawning works

Root cause
`cast(int)` of `RLIM_INFINITY` (2^63-1) wraps to `-1`, which:
- skips the `/dev/fd` enumeration path (since `-1 < 128K`)
- falls through to the `poll()` path, which tries to `malloc` a huge buffer and fails

Testing
- Regression test added to `std.process`; it passes with the fix