Signal handling for non-Python WASM modules#6154
Conversation
There was a problem hiding this comment.
I'm Bonk, and I've done a quick review of your PR.
Summary: Adds a JS shim around WebAssembly.instantiate/Instance to detect WASM modules exporting __signal_address/__terminated_address globals, registering them for signal-based shutdown notification via a new signal-safe AtomicList data structure.
Issues ranked by severity:
-
[High] Missing compat flag — behavioral change applied unconditionally to all workers. The shim wraps
WebAssembly.instantiate's return promise with an extra.then(), which adds a microtask tick before results are delivered. This subtly changes promise scheduling for every worker that usesWebAssembly.instantiate, not just ones with signal globals. workerd has a strong backward-compat commitment; this kind of observable behavioral change must be gated behind a compatibility flag incompatibility-date.capnp. -
[High] Missing autogate for staged rollout. Even with a compat flag, this is a risky change (shimming fundamental WebAssembly APIs, new signal-handler code path). It should additionally be gated behind an autogate flag for safe incremental rollout.
-
[High] No tests. There are no tests for any part of this feature:
AtomicList, the JS shim,registerWasmShutdownSignal, or the GC prologue filtering.AtomicListis a custom concurrent data structure used from signal handlers — this absolutely needs unit tests, including tests that verify signal-safety invariants (e.g., thatiterate()sees a consistent list whilefilter()is interrupted). -
[Medium]
WebAssembly.Instanceshim changes behavior when called withoutnew. The originalWebAssembly.Instancethrows if called as a regular function (not as a constructor). The shim is a plain JS function that internally doesnew originalInstance(...), soWebAssembly.Instance(module, imports)(withoutnew) would succeed instead of throwing. This is a spec deviation. -
[Medium]
AtomicListuses GCC__atomicbuiltins on non-atomic types.Node* headandNode* nextare plain pointers, notstd::atomic<Node*>. The GCC builtins work in practice, but this is technically undefined behavior per the C++ memory model. Consider usingstd::atomic<Node*>forheadandNode::next— the generated code is identical, and it makes the concurrency intent explicit and portable. -
[Low]
<atomic>header is included butstd::atomicis not used. Either switch tostd::atomic(see issue 5) or remove the unused include. -
[Low]
signalByteOffsetandterminatedByteOffsetcan overlap. The bounds checks don't verify the two offsets point to non-overlapping memory. Overlapping offsets would cause the runtime's own signal write to be misinterpreted as the module's termination flag. A simpleKJ_REQUIRE(signalOffset != terminatedOffset)would catch this.
|
I'm Bonk, and I've done a quick review of your PR. Summary: Adds a JS shim around I found 7 issues and posted them as a batch review on the PR. The highest severity items are:
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #6154 +/- ##
==========================================
- Coverage 70.70% 70.65% -0.05%
==========================================
Files 409 416 +7
Lines 109411 110493 +1082
Branches 18018 18207 +189
==========================================
+ Hits 77356 78070 +714
- Misses 21246 21553 +307
- Partials 10809 10870 +61 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
guybedford
left a comment
There was a problem hiding this comment.
Initial feedback, please let me know when this is ready for final review.
Merging this PR will not alter performance
Comparing Footnotes
|
guybedford
left a comment
There was a problem hiding this comment.
Will be great to see this feature shipped. There is an integration test PR already in cloudflare/workers-rs#939. @jasnell would value your input on if we need to think more about detached buffers here before landing.
| auto backingStore = memory->Buffer()->GetBackingStore(); | ||
| auto wasmMemory = | ||
| kj::arrayPtr(static_cast<kj::byte*>(backingStore->Data()), backingStore->ByteLength()) | ||
| .attach(kj::mv(backingStore)); |
There was a problem hiding this comment.
Again, may need to consider v8 sandbox / memory protection key interactions here. @erikcorry
There was a problem hiding this comment.
Spoke with Corry on this, but any misstatements are my own. I believe that MPK interactions are actually an existing concern with Python workers. Currently WASM linear memories are not MPK protected, which would explain why we have not had issues so far, and why this code is fine for the moment. I will make a follow-up to this PR which future proofs both implementations for when we enable protections in the future.
ryanking13
left a comment
There was a problem hiding this comment.
I have some nitpicky comments, not blocking merge.
|
Minor note: I added |
Co-authored-by: Guy Bedford <gbedford@cloudflare.com>
| auto backingStore = memory->Buffer()->GetBackingStore(); | ||
| auto wasmMemory = | ||
| kj::arrayPtr(static_cast<kj::byte*>(backingStore->Data()), backingStore->ByteLength()) | ||
| .attach(kj::mv(backingStore)); |
There was a problem hiding this comment.
if signalOffset is the offset into backingStore->Data(), it might be safer here to limit the kj::arrayPtr(...) to start at that point.
There was a problem hiding this comment.
This is a little bit tricky because there are two signals that need to be in scope. Could restrict it to the minimum span containing both maybe? Seems overkill to me
…posal Each WasmShutdownSignal entry holds a shared_ptr<v8::BackingStore> via kj::Array::attach(). In Worker::Isolate, member destruction order destroys the V8 isolate (api) before limitEnforcer (which owns the AtomicList). The BackingStore destructor lives in libv8.so and may access V8 internal state, so dropping these shared_ptrs after V8 disposal is a use-after-free. Fix: explicitly clear the AtomicList in ~Isolate() while V8 is still alive.
guybedford
left a comment
There was a problem hiding this comment.
Just needs the suggested memory lookup adjustment.
Co-authored-by: Guy Bedford <gbedford@cloudflare.com>
Continuation of 6098
This adds a shim to
WebAssembly.Instantiateand friends which looks for modules that export two i32 globals:__instance_signaland__instance_terminated. Modules matching one of these signatures will receive a hook, allowing us to read and write to those memory locations, allowing for signal-based communication with modules.The addition of
__instance_terminatedis required, since we must know when the module is dead so we can stop sending it signals. This is also useful for modules that want to know when they have been terminated, so that they can re-initialize.