
Browser async/sync: core async virtuals + real ReduceAsync + race detector #6

Merged: LostBeard merged 5 commits into master from tuvok/async-sync-core-virtuals on May 30, 2026

Conversation

@LostBeard
Owner

Closes the browser single-threaded-async bug class surfaced by Trip's research (_research/08-browser-async-vs-ilgpu-sync.md). ILGPU core had no overridable async drain/readback, so ILGPU.Algorithms (which references only core) couldn't reach a backend's real async wait. AcceleratorStream.SynchronizeAsync was a non-virtual Task.Run(sync Synchronize), which is fake on Wasm, and ReduceAsync was Task.Run(sync Reduce->T), whose inner CopyToCPU threw on WebGPU and read stale data on Wasm.

Commits

  • Core async virtuals (2717f90): AcceleratorStream.SynchronizeAsync made virtual and a virtual Accelerator.SynchronizeAsync added; new MemoryBuffer.CopyToRawAsync + ArrayView<T>.CopyToCPUAsync. Wasm/WebGPU/WebGL override SynchronizeAsync (accelerator and stream) and CopyToRawAsync with their real drain + async readback. ReduceAsync rewritten as real async; sync Reduce->T now throws NotSupportedException on browser backends instead of silently returning stale data.
  • WebGL nested-struct/BVH init (d73880c): carried forward; version bumped to 4.9.10-local.15.
  • Demo.Shared -> ML ProjectReference (e7e0ff1).
  • MemSetToZeroAsync (be02c8e): the missing async sibling of CopyFromAsync.
  • DetectHostBufferRaces (97da1f2): opt-in detector that throws on a sync MemSet/CopyTo/CopyToHost against an in-flight buffer — mechanizes the audit.

Verification (PMT)

  • ILGPUReduceAsyncTest — green on CPU/CUDA/OpenCL/WebGPU/Wasm (WebGL skips); previously WebGPU threw, Wasm stale.
  • MemSetToZeroAsyncTest — green ×5 backends.
  • DetectHostBufferRaceTest — green on Wasm (deterministic: sync read on the same JS turn as an unawaited dispatch throws; succeeds after SynchronizeAsync).

Not in scope (documented for follow-up)

Confirmed-broken sites needing async-signature changes (OptimizationEngine.FetchToCPUAsync, LoadParametersInternal, Optimizer.MemSetToZero, SparseMatrix.CopyToCPU, ConcurrentStreamProcessor) and the 10 ML sync sites — deferred to avoid rushing invasive changes.

🤖 Generated with Claude Code

LostBeard and others added 5 commits May 29, 2026 21:37
…Async

Root cause: ILGPU core had no overridable async drain/readback, so the
algorithm layer (which references only core) could not reach a backend's
real async wait. AcceleratorStream.SynchronizeAsync was a non-virtual
Task.Run(sync Synchronize) - fake on Wasm where Synchronize is a no-op -
and ReductionExtensions.ReduceAsync was Task.Run(sync Reduce->T), whose
inner CopyToCPU throws on WebGPU (no sync GPU->CPU readback) and reads
stale data on Wasm (the reduction kernel is still in flight).

Core (ILGPU):
- AcceleratorStream.SynchronizeAsync() made virtual.
- Accelerator.SynchronizeAsync() added (virtual; default runs sync
  Synchronize + completed task).
- MemoryBuffer.CopyToRawAsync(stream, offsetBytes, lengthBytes) added
  (virtual; default drains then sync CopyTo) + ArrayView<T>.CopyToCPUAsync
  extension.

Backends (SpawnDev.ILGPU): Wasm/WebGPU/WebGL override SynchronizeAsync
(accelerator + stream) and CopyToRawAsync with their real async drain +
readback (worker-dispatch await, queue.OnSubmittedWorkDone + mapAsync,
GL-worker readback).
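
The virtual-dispatch point above can be sketched in isolation. This is a minimal toy model, not the ILGPU or SpawnDev.ILGPU code: `ToyStream`, `ToyBrowserStream`, `ToyAlgorithms`, `WorkFinished`, `Dispatch`, and `CompleteWork` are hypothetical names invented for illustration. It assumes only what the message states: the core default runs the sync wait and returns a completed task, and a browser-style backend overrides it with its own completion signal.

```csharp
using System;
using System.Threading.Tasks;

// Toy stand-in for a core stream type (hypothetical, not ILGPU's AcceleratorStream).
public class ToyStream
{
    public virtual bool WorkFinished => true;

    public virtual void Synchronize() { }

    // Core default: run the sync wait, then hand back a completed task.
    public virtual Task SynchronizeAsync()
    {
        Synchronize();
        return Task.CompletedTask;
    }
}

// Toy browser-style backend: work completes only when an external signal arrives.
public sealed class ToyBrowserStream : ToyStream
{
    private TaskCompletionSource<bool> _work = new TaskCompletionSource<bool>();
    private Task _pending = Task.CompletedTask;
    private bool _finished = true;

    public override bool WorkFinished => _finished;

    public void Dispatch()
    {
        _finished = false;
        _work = new TaskCompletionSource<bool>();
        _pending = _work.Task;
    }

    // Simulates the worker / GPU queue reporting completion.
    public void CompleteWork()
    {
        _finished = true;
        _work.TrySetResult(true);
    }

    // A blocking wait is not possible here, so the sync path is a no-op.
    public override void Synchronize() { }

    // Real drain: await the backend's own completion signal.
    public override Task SynchronizeAsync() => _pending;
}

public static class ToyAlgorithms
{
    // Sees only the base type, yet reaches the override through virtual dispatch.
    public static async Task<bool> DrainAsync(ToyStream stream)
    {
        await stream.SynchronizeAsync();
        return stream.WorkFinished;
    }
}
```

In this model, wrapping the no-op `Synchronize` in `Task.Run` completes while the work is still pending, whereas `DrainAsync` completes only after `CompleteWork` is called.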

Algorithms: ReduceAsync (both overloads) rewritten to real async
(dispatch -> SynchronizeAsync -> CopyToCPUAsync). Synchronous Reduce->T
now throws a clear NotSupportedException on Wasm/WebGL/WebGPU instead of
returning stale data.
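
As a rough illustration of the dispatch -> SynchronizeAsync -> CopyToCPUAsync ordering, here is a self-contained sketch. `ToyAccelerator`, `IsBrowserBackend`, and `DispatchSum` are hypothetical, and the "device" is just a field written by a delayed task; the real ReduceAsync signatures and backend checks are not shown in this PR text and are not reproduced here.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Toy accelerator (hypothetical): the reduction result lands only when the queued task finishes.
public sealed class ToyAccelerator
{
    private Task _inFlight = Task.CompletedTask;
    private int _deviceResult;

    public bool IsBrowserBackend { get; set; }

    // Step 1: queue the kernel; returns before the result exists.
    public void DispatchSum(int[] input)
    {
        _inFlight = Task.Run(async () =>
        {
            await Task.Delay(10);          // simulated kernel latency
            _deviceResult = input.Sum();
        });
    }

    // Step 2: real drain, awaiting the in-flight work.
    public Task SynchronizeAsync() => _inFlight;

    // Step 3: async readback of the (now valid) result.
    public Task<int> CopyToCPUAsync() => Task.FromResult(_deviceResult);

    public async Task<int> ReduceAsync(int[] input)
    {
        DispatchSum(input);
        await SynchronizeAsync();
        return await CopyToCPUAsync();
    }

    // Sync path: fail loudly where a blocking wait is impossible.
    public int Reduce(int[] input)
    {
        if (IsBrowserBackend)
            throw new NotSupportedException("Sync Reduce is unavailable on this backend; use ReduceAsync.");
        DispatchSum(input);
        _inFlight.GetAwaiter().GetResult(); // blocking wait is fine off the browser
        return _deviceResult;
    }
}
```

The design choice mirrored here is that a loud exception on the sync path is easier to audit than a value that is silently stale.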

Test: ILGPUReduceAsyncTest exercises dispatch -> real async drain ->
async readback. PMT scoped run green on CPU/CUDA/OpenCL/WebGPU/Wasm
(48 passed, 0 failed); WebGL skips (no shared memory/barriers).

Docs: Wasm/CLAUDE.md async drain/readback section.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…raversal)

GLSL codegen now builds hoisted struct/local default initializers with
per-field constructors (GetStructDefaultInitializer) instead of a single
flat constructor, and hoists all PointerType values as int. glWorker keys
the program cache by shader source so a changed source recompiles (and the
stale program/shaders are deleted) instead of returning a cached mismatch.
Version -> 4.9.10-local.15.
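
The cache behaviour described above might look roughly like the following. This is a sketch only: the real glWorker is not shown here, and `ToyProgramCache`, the per-kernel slot, and the integer "program handles" are assumptions made so the idea can run standalone.

```csharp
using System;
using System.Collections.Generic;

// Toy program cache (hypothetical): a slot is reused only if the shader source matches.
public sealed class ToyProgramCache
{
    private readonly Dictionary<string, KeyValuePair<string, int>> _slots =
        new Dictionary<string, KeyValuePair<string, int>>();
    private int _nextHandle = 1;

    public int CompileCount { get; private set; }
    public List<int> DeletedPrograms { get; } = new List<int>();

    public int GetOrCompile(string kernelName, string shaderSource)
    {
        KeyValuePair<string, int> entry;
        if (_slots.TryGetValue(kernelName, out entry))
        {
            // Same source: the cached program is still valid.
            if (entry.Key == shaderSource) return entry.Value;

            // Source changed: drop the stale program instead of returning a mismatch.
            DeletedPrograms.Add(entry.Value);
        }

        int program = _nextHandle++;   // stand-in for compile + link
        CompileCount++;
        _slots[kernelName] = new KeyValuePair<string, int>(shaderSource, program);
        return program;
    }
}
```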

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Lets the shared test project exercise ML pipelines directly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Completes the browser-safe async buffer-op surface alongside CopyFromAsync /
CopyToHostAsync / SynchronizeAsync. On Wasm the sync MemSetToZero writes the
SharedArrayBuffer immediately and bypasses the dispatch queue, racing in-flight
worker kernels; MemSetToZeroAsync awaits the accelerator drain first so the
zero-fill is correctly ordered after pending dispatches. WebGPU/WebGL/desktop
ordering is already handled by their encoder/worker/stream, so the explicit
wait is Wasm-only (mirrors CopyFromAsync).
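
The ordering hazard can be reproduced with a toy buffer. In this sketch `ToyZeroableBuffer` and `RunFillKernel` are hypothetical; a plain `int[]` stands in for the SharedArrayBuffer and a delayed task stands in for the worker kernel, so only the ordering argument carries over, not the actual Wasm implementation.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Toy buffer (hypothetical): host writes hit memory immediately, kernels write later.
public sealed class ToyZeroableBuffer
{
    private Task _pending = Task.CompletedTask;

    public int[] Data { get; }

    public ToyZeroableBuffer(int length) { Data = new int[length]; }

    // Queue a "kernel" that fills the buffer after a short delay.
    public void RunFillKernel(int value)
    {
        _pending = Task.Run(async () =>
        {
            await Task.Delay(20);
            Array.Fill(Data, value);
        });
    }

    public Task SynchronizeAsync() => _pending;

    // Sync zero-fill bypasses the queue, so an in-flight kernel can overwrite it.
    public void MemSetToZero() => Array.Clear(Data, 0, Data.Length);

    // Async zero-fill drains first, so it is ordered after pending dispatches.
    public async Task MemSetToZeroAsync()
    {
        await _pending;
        Array.Clear(Data, 0, Data.Length);
    }
}
```

With this model, calling `MemSetToZero` right after `RunFillKernel(7)` leaves the buffer full of 7s once the kernel lands, while `MemSetToZeroAsync` leaves it zeroed.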

Test MemSetToZeroAsyncTest: kernel fills nonzero (unawaited) -> MemSetToZeroAsync
-> readback all zeros. PMT green on CPU/CUDA/OpenCL/WebGPU/Wasm (6 pass/0 fail);
WebGL skips (MemSet is deferred CPU-side upload).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

WasmMemoryBuffer.DetectHostBufferRaces (default false): when true, the
synchronous host ops (MemSet / CopyTo / CopyToHost) throw if the buffer has
an in-flight dispatch (_pendingSnapshotIntents > 0, incremented synchronously
at queue time in RunKernel) - i.e. the host is reading/zeroing a
SharedArrayBuffer that worker kernels may still be writing. CopyFrom* are NOT
guarded; the lazy snapshot mechanism (PrepareHostWrite) protects them by design.

This mechanizes the async/sync audit: enable it in a PMT sweep to ENUMERATE any
remaining sync-readback race sites that the async APIs (CopyToHostAsync /
CopyFromAsync / MemSetToZeroAsync / SynchronizeAsync) replace. A properly-drained
path never trips it.
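
A pending-counter guard of this kind can be modelled in a few lines. The sketch below borrows the names DetectHostBufferRaces, RunKernel, and _pendingSnapshotIntents from the message, but everything else is assumed: `ToyRaceCheckedBuffer` is hypothetical, the flag is modelled as an instance property, and InvalidOperationException is a placeholder since the message does not say which exception the real detector throws.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Toy buffer (hypothetical) with an opt-in check for sync reads during in-flight work.
public sealed class ToyRaceCheckedBuffer
{
    private readonly int[] _data;
    private Task _pending = Task.CompletedTask;
    private int _pendingSnapshotIntents;

    public bool DetectHostBufferRaces { get; set; } // off by default

    public ToyRaceCheckedBuffer(int length) { _data = new int[length]; }

    public void RunKernel(int value)
    {
        // Counted synchronously at queue time, before any async work starts.
        Interlocked.Increment(ref _pendingSnapshotIntents);
        Task previous = _pending;
        _pending = Task.Run(async () =>
        {
            await previous;                 // keep dispatches in order
            await Task.Delay(10);           // simulated kernel latency
            Array.Fill(_data, value);
            Interlocked.Decrement(ref _pendingSnapshotIntents);
        });
    }

    public Task SynchronizeAsync() => _pending;

    // Sync host read: throws when enabled and a dispatch is still in flight.
    public int[] CopyToHost()
    {
        if (DetectHostBufferRaces && Volatile.Read(ref _pendingSnapshotIntents) > 0)
            throw new InvalidOperationException("Sync host read while a dispatch is in flight.");
        return (int[])_data.Clone();
    }
}
```

Because the counter is bumped before `RunKernel` returns, a read issued immediately afterwards trips the check deterministically, and the same read passes once `SynchronizeAsync` has been awaited.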

Test WasmTests.DetectHostBufferRaceTest: a sync read on the same JS turn as an
unawaited dispatch deterministically throws; the identical read succeeds after
SynchronizeAsync. PMT green on Wasm (1 pass/0 fail). Wasm/CLAUDE.md documented.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@LostBeard merged commit 97da1f2 into master on May 30, 2026
2 checks passed