#118: fix NTS test failures (getaddrinfo leak + macOS curl progress exception)#120
Merged
Conversation
The first half of #118 (NTS-only): async_dns_addrinfo_t (288 B) leaked when a coroutine cancelled a DNS lookup and the reactor shut down before the libuv threadpool worker finished its blocking getaddrinfo() syscall. The dispose path sets LIBUV_DNS_F_DISPOSE_PENDING and relies on on_addrinfo_event firing (re-entering dispose) to reach the pefree branch. Our shutdown drain used UV_RUN_NOWAIT only — non-blocking peeks that don't wait for the threadpool cancel-completion to surface via libuv's internal pipe. After uv_loop_close() the callback can't fire, the struct leaks. Per libuv docs (and issue libuv/libuv#2481), uv_cancel() on an in-flight uv_getaddrinfo_t returns EBUSY: workers cannot be preempted, only waited on. Two-phase drain in libuv_reactor_shutdown: Phase 1 — bounded UV_RUN_NOWAIT (sync callbacks already on loop). Phase 2 — bounded UV_RUN_ONCE (block until threadpool cancel- completions arrive; cap so a wedged DNS server can't hang shutdown forever). If a worker is still wedged past the budget, leave the loop open: pefree-ing pending structs would race the still-running worker (UAF, much worse than a leak). The OS reclaims memory at process exit. A ZEND_DEBUG-only warning is printed for visibility. Local impact: ~15 ms shutdown for a normal cancel test (Phase 1 closes the loop immediately; Phase 2 not entered). No functional regressions in ext/async/tests.
…S fix) Companion entry for the php-src fix in ext/curl/interface.c. Documents why the bug only surfaced on macOS (libuv/kqueue reentry path) and that the same pattern was already applied to curl_prereqfunction and curl_debug — these two callbacks were the gap.
MACOS_NTS job was failing at the Configure PHP step with 'Please specify the install prefix of iconv with --with-iconv=<DIR>' because the workflow passes --with-iconv=$(brew --prefix)/opt/libiconv but libiconv was not in the brew install list. LINUX_NTS_ASAN and LINUX_ZTS_JIT are green; this unblocks verification of the curl xferinfo/progress macOS fix.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the remaining NTS-only test failures tracked in #118.
getaddrinfo struct leak on reactor shutdown (NTS). The
`async_dns_addrinfo_t` (288 B) was never freed when a coroutine
cancelled a DNS resolve and the reactor shut down before libuv's
threadpool worker finished the blocking `getaddrinfo()` syscall.
Per libuv docs `uv_cancel(uv_getaddrinfo_t*)` returns `EBUSY` for
in-flight requests — we cannot preempt the worker, only wait for it.
Fix: two-phase drain in `libuv_reactor_shutdown` — bounded
`UV_RUN_NOWAIT` for ready callbacks, then bounded `UV_RUN_ONCE` for
threadpool cancel-completions. Bounded fallback leaks struct rather
than UAF if budget is exhausted (OS reclaims at process exit).
(Commit `52604b6`.)
curl `XFERINFOFUNCTION` / `PROGRESSFUNCTION` exception leak (macOS).
Two of four async-aware curl callbacks in php-src
`ext/curl/interface.c` were missing the exception-handoff pattern.
When the user callback threw, `EG(exception)` stayed set across
libcurl iterations; `zend_call_known_fcc` on subsequent ticks
short-circuited without clearing it. Eventually the transfer ended
and the dangling exception surfaced outside the coroutine — on Linux
it happened to land on a frame the awaiter unwound, on macOS the
libuv/kqueue reentry path delivered it as uncaught at top-level.
Fix lives in php-src `true-async` (`ae2bd4fba38`): both callbacks
now hand the exception off to `curl_async_event_set_callback_exception()`,
clear it, and return 1 to abort the transfer.
CI infra. Added `libiconv` to the macOS brew install line in
`debug-bugfix.yml` so `./configure` finds it via
`--with-iconv=$(brew --prefix)/opt/libiconv`.
Verification
`Debug Bugfix (manual)` workflow run `25917156027` — all three jobs green:
Companion change
This PR depends on php-src commit `ae2bd4fba38` on the `true-async`
branch — the actual curl-callback fix lives there. Mergeable
independently because the ext/async CHANGELOG entry just documents
that companion change.
Test plan