Improve Playground debuggability. by bkaradzic-microsoft · Pull Request #1688 · BabylonJS/BabylonNative

bkaradzic-microsoft · 2026-05-06T21:00:41Z

Make Playground crashes, asserts, and visual-test failures actually debuggable in CI and locally. Adds a callstack-printing crash handler, suppresses MSVC runtime dialogs, gives the Win32 build a real CLI with a headless mode, wires programmatic RenderDoc capture, makes the test runner report per-run summaries with pre-flight reference-image
checks, and gives TestUtils a host-side exit callback so tests can drive the process exit code.

Crash + assert handling

Apps/Playground/Shared/Diagnostics.{h,cpp} — install Win32 SEH + CRT-invalid-parameter + abort/signal handlers that print a bx::writeCallstack-style stack trace and a BN: <CATEGORY> banner before exiting; route _CrtDbgReport / _CrtSetReportMode away from the modal "Microsoft Visual C++ Runtime Library" dialog so asserts don't
block CI.
Core/Graphics/Source/BgfxCallback.cpp — fatal() now always logs to stderr + bx trace, breakpoints only when a debugger is attached on Windows, then falls through to abort() instead of silent exit.

Headless + CLI

Apps/Playground/Shared/CommandLine.{h,cpp} — new flag parser supports --headless, --once, --test <name>, --capture[=<frame>], --renderdoc-dll <path>, plus passthrough of script path / extra args.
Apps/Playground/Win32/App.cpp — switch to /SUBSYSTEM:CONSOLE + wWinMainCRTStartup, UTF-8 console + unbuffered stdio, parse CLI, hide the top-level window in headless mode.
Apps/Playground/CMakeLists.txt — add Diagnostics + CommandLine, link bx, /SUBSYSTEM:CONSOLE for non-UWP Win32.
Apps/Playground/X11/App.cpp, macOS/main.mm — stub Diagnostics::Initialize so the new code links cleanly on every platform.

RenderDoc

Diagnostics::SetupRenderDoc loads renderdoc.dll (override via --renderdoc-dll), pins the in-process API, and arms a single-frame capture when --capture is set.
.github/instructions/playground/playground-renderdoc-capture.instructions.md documents the host-side capture flow end-to-end.

Test runner (`Apps/Playground/Scripts/validation_native.js`)

Pre-flight reference-image check (TestUtils.referenceImageExists); tests with no reference are reported as missingRef, not failures.

Per-run summary line:

Run complete. ran=N passed=N failed=N missingRef=N skipped=N

Honours --once / --test from the CLI; --capture triggers TestUtils.captureNextFrame at the configured frame; bx-style debugger trigger on validation failure.

TestUtils

Plugins/TestUtils — add SetExitCallback() so the host can react to tests calling exit() (drives the Playground process exit code); add referenceImageExists(name) (Win32 checks the deployed ReferenceImages/ next to the executable; other platforms return true to preserve existing behaviour).

Console output

Apps/Playground/Shared/AppContext.cpp — route console.error to stderr, install a global unhandled-exception handler that funnels JS errors through
Diagnostics::DumpFailure.

Docs

.github/instructions/babylon-native-debugging.instructions.md — reference for the Diagnostics output format, exit codes (0/1/2/3/-1), CLI flags,
MISSING_REFERENCE_IMAGE flow, and the one-liner RenderDoc capture invocation.
.github/instructions/renderdoc/renderdoc-gpu-debug.instructions.md — small alignment edits for the new --capture flag.

Copilot

Pull request overview

This PR improves Babylon Native Playground debuggability (CI + local) by standardizing crash/assert reporting with callstacks, adding a Win32 CLI/headless flow, wiring RenderDoc capture support, and enhancing the visual-test runner + TestUtils so failures and exits can be diagnosed and surfaced via process exit codes.

Changes:

Add Diagnostics crash/assert handling (callstack banners, finish-line, RenderDoc preload/version reporting) and integrate it into Playground apps.
Add Win32 CLI parsing (--headless, --once, --test, --capture, --renderdoc-dll, etc.) and plumb options into JS via _playgroundOptions.
Improve visual-test runner reporting (filtering, per-run summary, pre-flight reference checks) and add a host-side TestUtils exit callback hook.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
Plugins/TestUtils/Source/TestUtils.h	Adds internal bridge for host exit callback + JS `referenceImageExists` binding.
Plugins/TestUtils/Source/TestUtils.cpp	Implements `SetExitCallback` storage/dispatch for `TestUtils.exit()`.
Plugins/TestUtils/Source/TestUtils_WinRT.cpp	Invokes host exit callback on `exit()`; stubs `referenceImageExists`.
Plugins/TestUtils/Source/TestUtils_Win32.cpp	Invokes host exit callback on `exit()`; implements Win32 reference-image existence check.
Plugins/TestUtils/Source/TestUtils_Unix.cpp	Invokes host exit callback on `exit()`; stubs `referenceImageExists`.
Plugins/TestUtils/Source/TestUtils_macOS.mm	Invokes host exit callback on `exit()`; stubs `referenceImageExists`.
Plugins/TestUtils/Source/TestUtils_iOS.mm	Stubs `referenceImageExists` for iOS.
Plugins/TestUtils/Source/TestUtils_Android.cpp	Stubs `referenceImageExists` for Android.
Plugins/TestUtils/Include/Babylon/Plugins/TestUtils.h	Public API for registering a host exit callback.
Core/Graphics/Source/BgfxCallback.cpp	Makes bgfx fatals reliably log to stderr + abort (with debugger-aware break on Windows).
Apps/Playground/X11/App.cpp	Initializes diagnostics early on X11 builds.
Apps/Playground/Win32/App.cpp	Converts Win32 Playground into console/CLI-driven entry flow (headless, RenderDoc preload, exit-code wiring).
Apps/Playground/Shared/Diagnostics.h	Declares cross-platform diagnostics API (crash handling, finish-line, RenderDoc preload).
Apps/Playground/Shared/Diagnostics.cpp	Implements crash/assert handlers, banner dumping, finish-line, and Windows RenderDoc DLL resolution/version output.
Apps/Playground/Shared/CommandLine.h	Declares `PlaygroundOptions` and CLI parse/usage interface.
Apps/Playground/Shared/CommandLine.cpp	Implements CLI parsing and usage text (flags for tests/headless/capture/renderdoc).
Apps/Playground/Shared/AppContext.h	Adds `PlaygroundOptions` plumb-through to runtime initialization.
Apps/Playground/Shared/AppContext.cpp	Routes JS errors/console.error into diagnostics banners; injects `_playgroundOptions`; enables bgfx diagnostic output mirroring.
Apps/Playground/Scripts/validation_native.js	Adds CLI-driven filtering/once/listing, run summary, capture timing, and pre-flight reference checks.
Apps/Playground/macOS/main.mm	Initializes diagnostics early on macOS build.
Apps/Playground/CMakeLists.txt	Includes new Shared sources; links `bx`; switches Win32 desktop target to console subsystem with custom entry.
.github/instructions/renderdoc/renderdoc-gpu-debug.instructions.md	Documentation alignment edits for the new capture flag language.
.github/instructions/playground/playground-renderdoc-capture.instructions.md	New end-to-end RenderDoc capture workflow doc for Playground tests.
.github/instructions/babylon-native-debugging.instructions.md	New debugging reference doc for diagnostics output, flags, exit codes, and common scenarios.

Comments suppressed due to low confidence (1)

Apps/Playground/Win32/App.cpp:123

The new --list flag is parsed in wWinMain, but if no scripts are provided the app still loads experience.js and never runs validation_native.js (the only place that actually implements list output). As a result, Playground --list does not currently list tests or exit early as advertised; consider handling ListTests in native code (print + exit), or auto-loading validation_native.js when ListTests is set and Scripts is empty.

        if (g_options.Scripts.empty())
        {
            appContext->ScriptLoader().LoadScript("app:///Scripts/experience.js");
        }
        else
        {
            for (const auto& arg : g_options.Scripts)
            {
                appContext->ScriptLoader().LoadScript(GetUrlFromPath(arg));
            }

            appContext->ScriptLoader().LoadScript("app:///Scripts/playground_runner.js");
        }

- Apps/Playground/Android/BabylonNative/CMakeLists.txt: BabylonNativeJNI compiled AppContext.cpp (now references Diagnostics::DumpFailure / SetExitCode / PrintFinishLine) but not Diagnostics.cpp; add it to sources and link bx. - Apps/Playground/Shared/Diagnostics.cpp: drop redundant Diagnostics:: qualifier on SetupRenderDoc -- the function is already inside namespace Diagnostics, which clang -Wextra-qualification flags as an error on Linux/Android. - Apps/Playground/Shared/Diagnostics.cpp: gate the _CrtSetReportMode / _CrtSetReportFile / _CrtSetReportHook loop on #if defined(_DEBUG); these become no-op macros without _DEBUG, leaving the `reportType` loop variable unused (C4189 -> C2220 under /WX in RelWithDebInfo + Sanitizers).

Napi::Env::RunScript is a V8-only Node-API extension; JSI does not implement it (C2039: 'RunScript': is not a member of 'Napi::Env'). Use env.Global().Get("Function").As<Napi::Function>().New({source}).Call({}) instead, which is portable across V8 and JSI/JSC.

… for DLL injection.

bghgary

[Reviewed by Copilot on behalf of @bghgary]

Substantial improvement to BN diagnostics — the compareFrame/stopFrame separation for --capture and the host-side SetExitCallback hook are both well-placed. Most of the substantive concerns below are about where the code lives (right responsibility) rather than what it does.

Architectural concerns

`referenceImageExists` is a workaround for a JsRuntimeHost XHR polyfill bug

The pre-flight reference-image check in validation_native.js is motivated by:

Without this, BN's URL polyfill throws a synchronous std::runtime_error inside its async task on a missing app:///ReferenceImages/.png, bypassing BABYLON.Tools.LoadFile's onLoadFileError and surfacing as UNCAUGHT JS error before the runner can record the failure.

The actual bug is in JsRuntimeHost/Polyfills/XMLHttpRequest/Source/XMLHttpRequest.cpp::Send():

m_request.SendAsync()
    .then(...)                                    // sendRegion cleanup
    .then(m_runtimeScheduler, none, [this]() {    // success-only continuation
        SetReadyState(ReadyState::Done);
        RaiseEvent(EventType::LoadEnd);
        m_eventHandlerRefs.clear();
    })
    .then(inline_scheduler, none, [env](expected<void, exception_ptr> result) {
        if (result.has_error()) {
            Napi::Error::New(env, result.error()).ThrowAsJavaScriptException();
        }
    });

The success-only middle .then is skipped on async failure (per arcana semantics), so ReadyState::Done and the LoadEnd/error events are never raised. The final continuation throws as a JS exception — so BABYLON.Tools.LoadFile's onLoadFileError (wired to xhr.onerror/xhr.onreadystatechange) never fires, and the failure surfaces as UNCAUGHT JS ERROR instead. This violates the XHR contract and breaks any caller using the standard onerror callback pattern — affects every file-load failure, not just missing reference images.

Suggested fix (in JsRuntimeHost): replace the success-only middle .then with a both-outcomes handler that on error sets a non-OK status and raises error + loadend. Drop the JS-throwing continuation. Then in this PR, referenceImageExists and the new TestUtils plumbing can be deleted; BABYLON.Tools.LoadFile's onLoadFileError will fire normally on missing references and the runner records the failure cleanly. If the eager early-out is still wanted for performance/clarity, fine — but rewrite the comment to say that's the reason, not a polyfill workaround.

`console.error` wrapper belongs in JsRuntimeHost's Console polyfill, not AppContext

AppContext.cpp runs new Function(body)() to install a JS-side console.error wrapper that appends (new Error()).stack to the message. Concerns with this living in the host:

Stack mangled into the message arg — args.push('\nJS callstack:\n' + stack) pollutes the final argument seen by the polyfill callback.
Engine-format heuristics in JS — the lines.shift() normalization papers over per-engine stack-format differences in lowest-common-denominator JS. Belongs in the engine-aware polyfill where it can be done once and shared.
new Function(body) is eval-class — disallowed under CSP-style sandboxing. A native polyfill capture has no such constraint.
Fragile layering — wraps after Polyfills::Console::Initialize. Future scripts re-wrapping console.error either shadow or wrap this.

Suggested fix: move the capture into JsRuntimeHost's Polyfills::Console::Initialize using standard N-API:

auto err = env.Global().Get("Error").As<Napi::Function>().New({});
auto jsStack = err.Get("stack").As<Napi::String>().Utf8Value();

Pass jsStack to LogCallback as a separate parameter. Pure refactor — same new Error().stack capture, same fidelity, no contract change to N-API. Engine-specific stack normalization is a separate PR; this one isn't blocked on it. Drop the new Function(body) block from AppContext entirely.

Instruction docs vs. user docs

The new files in .github/instructions/ mix two distinct kinds of content:

Genuine technical documentation — exit codes, CLI flag reference, config.json knobs, RenderDoc DLL-loading mechanics, the worked test-286 example, the pre-flight failure-pattern table, "scroll up from the error line" troubleshooting. Useful for anyone debugging Playground tests.
Agent steering — "Read this before...", "do NOT re-implement", "Don't add parallel mechanisms", the keyword-bait "When to Use" lists, "future agents", "Cite this as the canonical example". Only useful as agent policy.

.github/instructions/ paths are auto-loaded into Copilot agent context, so anything committed there is active steering rather than passive documentation. That makes maintenance discipline matter: when these go stale, agents are misled silently — exactly the readers least likely to push back.

Suggested split:

Tech content → Apps/Playground/README.md (or docs/playground-debugging.md). Curated for humans; agents can still read it from the file system.
.github/instructions/ keeps only tight agent-behavior policy if any — e.g., "this repo has X infrastructure already, prefer it over reinventing," with a link to the human doc for mechanics.

This separates knowledge from agent policy and avoids the trap where the two get interleaved and rot together.

Other issues called out inline.

bghgary · 2026-05-08T00:06:21Z

        size_t height,
        DebugLogCallback debugLog,
-        AdditionalInitCallback additionalInit = {});
+        AdditionalInitCallback additionalInit = {},


const PlaygroundOptions* with a nullptr default plus if (playgroundOptions != nullptr) guards in the ctor is a nullable raw pointer encoding "no CLI" — but PlaygroundOptions is a pure-data settings struct whose default-constructed state is already "nothing set." Suggest by-value with a default:

AppContext(..., PlaygroundOptions playgroundOptions = {});

The function consumes the data once; by-value lets the caller std::move(opts) (the TestFilters / TestIndices vectors). The 6 playgroundOptions != nullptr checks in the ctor lambda all collapse to unconditional access. JS-side code in validation_native.js is unaffected — it already handles _playgroundOptions being either undefined or fully populated.

bghgary · 2026-05-08T00:06:21Z

+
+namespace CommandLine
+{
+    PlaygroundOptions Parse(int argc, const char* const* argv)


The 13 repeated blocks of if (auto m = match(...); m.matched) { opt.X = ...; continue; } followed by if (!err.empty()) { opt.ParseError = true; ... return opt; } have clear structural duplication that table-driven dispatch eliminates.

A static kFlags table with one row per flag (longName, shortName, kind, apply functor) plus a single match/apply loop reduces ~280 lines to ~80. The same table can drive PrintUsage so flag definitions stop being a second source of truth for the help text — which is exactly what enabled the --list 3-way mismatch (see inline on validation_native.js).

Worth doing in this PR — every new flag pays the per-flag-boilerplate tax until it's fixed.

- AppContext: take PlaygroundOptions by value (defaulted to {}); drop the nullable-pointer dance and the six nullptr-guard branches in the ctor. - Win32: rename g_options -> options to match the file's existing global naming convention; rename GetCommandLineArgumentsW -> GetCommandLineArguments since the W suffix collides with the Win32 wide-char API convention (the function returns UTF-8 narrow strings). - CommandLine: replace 13 repeated match/apply blocks with a static kFlags table whose rows describe long name, short name, kind, value label, help text, and an apply functor. PrintUsage walks the same table so flag definitions are no longer a second source of truth and adding a flag is a single-row edit. - --list: emit canonical TSV (index, title, referenceImage, exclusionReason) so the listing is machine-parseable. The exclusionReason column reflects config state (ignores --include-excluded). Update help text and instruction docs to match. - validation_native.js: drop the leading underscore on _opts and the five run-counters (_ranCount etc.); the prefix only carries meaning on _playgroundOptions, which is host-injected from C++. Remove the duplicate \if (breakOnFail) debugger;\ in the runner -- failTest() already breaks before invoking the done callback. - instructions: replace the hard-coded test-index=286 example with a --test "<title>" form so config.json reorderings don't silently break the doc; drop hard-coded rdc / renderdoc-py version numbers. - Revert renderdoc-gpu-debug.instructions.md to master: the diff was pure em-dash to '--' substitution and unrelated to debuggability. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

bkaradzic-microsoft · 2026-05-08T03:55:58Z

Pushed 5b4ad6bd addressing the inline items (10 of 12). Two open routes for the architectural ones — want to confirm preference before opening companion work.

1. `referenceImageExists` / XHR polyfill bug

Confirmed your read of XMLHttpRequest.cpp::Send() — the success-only .then at L265 skips SetReadyState(Done) + RaiseEvent(LoadEnd) + m_eventHandlerRefs.clear() on async failure and falls through to the JS-throwing trailing continuation at L273. So BABYLON.Tools.LoadFile's onLoadFileError can never fire. Two routes:

A — open a JsRuntimeHost PR replacing the success-only middle .then with a both-outcomes handler (set non-OK status, raise error + loadend, drop the JS-throwing continuation). Bump the dep here, delete TestUtils::referenceImageExists + the runner pre-check before this PR lands.
B — land this PR's pre-check as-is, file a JsRuntimeHost issue, remove the workaround in a follow-up here once that's fixed.

2. `console.error` stack wrapper

Agree the new Function(body)() capture belongs in Polyfills::Console::Initialize. Three routes:

A — JsRuntimeHost PR adds a third callback parameter (e.g. jsStack) populated via N-API (env.Global().Get("Error").New({}).Get("stack")); bump here, delete the new Function(body) block.
B — land this PR's wrapper, file an issue, follow up.
C — drop the wrapper from this PR entirely. Rely on the unhandled-exception handler's banner alone; console.error keeps just the legacy [Error] ... line. (console.error covers Babylon.js's recoverable-but-still-bad paths that often don't reach the unhandled handler, but the JS engine-format heuristics in the wrapper are fragile, so dropping is defensible.)

Preference for each?

Improve debugability.

4ebaa5f

bkaradzic-microsoft requested review from CedricGuillemet, bghgary and Copilot May 6, 2026 21:00

Copilot started reviewing on behalf of bkaradzic-microsoft May 6, 2026 21:05 View session

Copilot AI reviewed May 6, 2026

View reviewed changes

bkaradzic added 5 commits May 6, 2026 14:16

Fix Linux/macOS build: remove unused DumpCallstack helper.

7be547e

UWP doesn't have RenderDoc.

74718d8

Remove --renderdoc-dll preload code; rely on renderdoccmd/rdc capture…

0bef49f

… for DLL injection.

bkaradzic-microsoft enabled auto-merge (squash) May 7, 2026 15:37

bghgary reviewed May 8, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve Playground debuggability.#1688

Improve Playground debuggability.#1688
bkaradzic-microsoft wants to merge 7 commits intoBabylonJS:masterfrom
bkaradzic-microsoft:improve-debugability

bkaradzic-microsoft commented May 6, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bghgary left a comment

Uh oh!

bghgary May 8, 2026

Uh oh!

Uh oh!

Uh oh!

bghgary May 8, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bkaradzic-microsoft commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

bkaradzic-microsoft commented May 6, 2026

Crash + assert handling

Headless + CLI

RenderDoc

Test runner (Apps/Playground/Scripts/validation_native.js)

TestUtils

Console output

Docs

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bghgary left a comment

Choose a reason for hiding this comment

Architectural concerns

referenceImageExists is a workaround for a JsRuntimeHost XHR polyfill bug

console.error wrapper belongs in JsRuntimeHost's Console polyfill, not AppContext

Instruction docs vs. user docs

Other issues called out inline.

Uh oh!

bghgary May 8, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

bghgary May 8, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bkaradzic-microsoft commented May 8, 2026

1. referenceImageExists / XHR polyfill bug

2. console.error stack wrapper

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Test runner (`Apps/Playground/Scripts/validation_native.js`)

`referenceImageExists` is a workaround for a JsRuntimeHost XHR polyfill bug

`console.error` wrapper belongs in JsRuntimeHost's Console polyfill, not AppContext

1. `referenceImageExists` / XHR polyfill bug

2. `console.error` stack wrapper