Improve Playground debuggability.#1688
Improve Playground debuggability.#1688bkaradzic-microsoft wants to merge 7 commits intoBabylonJS:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves Babylon Native Playground debuggability (CI + local) by standardizing crash/assert reporting with callstacks, adding a Win32 CLI/headless flow, wiring RenderDoc capture support, and enhancing the visual-test runner + TestUtils so failures and exits can be diagnosed and surfaced via process exit codes.
Changes:
- Add
Diagnosticscrash/assert handling (callstack banners, finish-line, RenderDoc preload/version reporting) and integrate it into Playground apps. - Add Win32 CLI parsing (
--headless,--once,--test,--capture,--renderdoc-dll, etc.) and plumb options into JS via_playgroundOptions. - Improve visual-test runner reporting (filtering, per-run summary, pre-flight reference checks) and add a host-side
TestUtilsexit callback hook.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| Plugins/TestUtils/Source/TestUtils.h | Adds internal bridge for host exit callback + JS referenceImageExists binding. |
| Plugins/TestUtils/Source/TestUtils.cpp | Implements SetExitCallback storage/dispatch for TestUtils.exit(). |
| Plugins/TestUtils/Source/TestUtils_WinRT.cpp | Invokes host exit callback on exit(); stubs referenceImageExists. |
| Plugins/TestUtils/Source/TestUtils_Win32.cpp | Invokes host exit callback on exit(); implements Win32 reference-image existence check. |
| Plugins/TestUtils/Source/TestUtils_Unix.cpp | Invokes host exit callback on exit(); stubs referenceImageExists. |
| Plugins/TestUtils/Source/TestUtils_macOS.mm | Invokes host exit callback on exit(); stubs referenceImageExists. |
| Plugins/TestUtils/Source/TestUtils_iOS.mm | Stubs referenceImageExists for iOS. |
| Plugins/TestUtils/Source/TestUtils_Android.cpp | Stubs referenceImageExists for Android. |
| Plugins/TestUtils/Include/Babylon/Plugins/TestUtils.h | Public API for registering a host exit callback. |
| Core/Graphics/Source/BgfxCallback.cpp | Makes bgfx fatals reliably log to stderr + abort (with debugger-aware break on Windows). |
| Apps/Playground/X11/App.cpp | Initializes diagnostics early on X11 builds. |
| Apps/Playground/Win32/App.cpp | Converts Win32 Playground into console/CLI-driven entry flow (headless, RenderDoc preload, exit-code wiring). |
| Apps/Playground/Shared/Diagnostics.h | Declares cross-platform diagnostics API (crash handling, finish-line, RenderDoc preload). |
| Apps/Playground/Shared/Diagnostics.cpp | Implements crash/assert handlers, banner dumping, finish-line, and Windows RenderDoc DLL resolution/version output. |
| Apps/Playground/Shared/CommandLine.h | Declares PlaygroundOptions and CLI parse/usage interface. |
| Apps/Playground/Shared/CommandLine.cpp | Implements CLI parsing and usage text (flags for tests/headless/capture/renderdoc). |
| Apps/Playground/Shared/AppContext.h | Adds PlaygroundOptions plumb-through to runtime initialization. |
| Apps/Playground/Shared/AppContext.cpp | Routes JS errors/console.error into diagnostics banners; injects _playgroundOptions; enables bgfx diagnostic output mirroring. |
| Apps/Playground/Scripts/validation_native.js | Adds CLI-driven filtering/once/listing, run summary, capture timing, and pre-flight reference checks. |
| Apps/Playground/macOS/main.mm | Initializes diagnostics early on macOS build. |
| Apps/Playground/CMakeLists.txt | Includes new Shared sources; links bx; switches Win32 desktop target to console subsystem with custom entry. |
| .github/instructions/renderdoc/renderdoc-gpu-debug.instructions.md | Documentation alignment edits for the new capture flag language. |
| .github/instructions/playground/playground-renderdoc-capture.instructions.md | New end-to-end RenderDoc capture workflow doc for Playground tests. |
| .github/instructions/babylon-native-debugging.instructions.md | New debugging reference doc for diagnostics output, flags, exit codes, and common scenarios. |
Comments suppressed due to low confidence (1)
Apps/Playground/Win32/App.cpp:123
- The new --list flag is parsed in wWinMain, but if no scripts are provided the app still loads experience.js and never runs validation_native.js (the only place that actually implements list output). As a result,
Playground --listdoes not currently list tests or exit early as advertised; consider handling ListTests in native code (print + exit), or auto-loading validation_native.js when ListTests is set and Scripts is empty.
if (g_options.Scripts.empty())
{
appContext->ScriptLoader().LoadScript("app:///Scripts/experience.js");
}
else
{
for (const auto& arg : g_options.Scripts)
{
appContext->ScriptLoader().LoadScript(GetUrlFromPath(arg));
}
appContext->ScriptLoader().LoadScript("app:///Scripts/playground_runner.js");
}
- Apps/Playground/Android/BabylonNative/CMakeLists.txt: BabylonNativeJNI compiled AppContext.cpp (now references Diagnostics::DumpFailure / SetExitCode / PrintFinishLine) but not Diagnostics.cpp; add it to sources and link bx. - Apps/Playground/Shared/Diagnostics.cpp: drop redundant Diagnostics:: qualifier on SetupRenderDoc -- the function is already inside namespace Diagnostics, which clang -Wextra-qualification flags as an error on Linux/Android. - Apps/Playground/Shared/Diagnostics.cpp: gate the _CrtSetReportMode / _CrtSetReportFile / _CrtSetReportHook loop on #if defined(_DEBUG); these become no-op macros without _DEBUG, leaving the `reportType` loop variable unused (C4189 -> C2220 under /WX in RelWithDebInfo + Sanitizers).
Napi::Env::RunScript is a V8-only Node-API extension; JSI does not implement it (C2039: 'RunScript': is not a member of 'Napi::Env'). Use
env.Global().Get("Function").As<Napi::Function>().New({source}).Call({}) instead, which is portable across V8 and JSI/JSC.
… for DLL injection.
bghgary
left a comment
There was a problem hiding this comment.
[Reviewed by Copilot on behalf of @bghgary]
Substantial improvement to BN diagnostics — the compareFrame/stopFrame separation for --capture and the host-side SetExitCallback hook are both well-placed. Most of the substantive concerns below are about where the code lives (right responsibility) rather than what it does.
Architectural concerns
referenceImageExists is a workaround for a JsRuntimeHost XHR polyfill bug
The pre-flight reference-image check in validation_native.js is motivated by:
Without this, BN's URL polyfill throws a synchronous std::runtime_error inside its async task on a missing app:///ReferenceImages/.png, bypassing BABYLON.Tools.LoadFile's onLoadFileError and surfacing as UNCAUGHT JS error before the runner can record the failure.
The actual bug is in JsRuntimeHost/Polyfills/XMLHttpRequest/Source/XMLHttpRequest.cpp::Send():
m_request.SendAsync()
.then(...) // sendRegion cleanup
.then(m_runtimeScheduler, none, [this]() { // success-only continuation
SetReadyState(ReadyState::Done);
RaiseEvent(EventType::LoadEnd);
m_eventHandlerRefs.clear();
})
.then(inline_scheduler, none, [env](expected<void, exception_ptr> result) {
if (result.has_error()) {
Napi::Error::New(env, result.error()).ThrowAsJavaScriptException();
}
});The success-only middle .then is skipped on async failure (per arcana semantics), so ReadyState::Done and the LoadEnd/error events are never raised. The final continuation throws as a JS exception — so BABYLON.Tools.LoadFile's onLoadFileError (wired to xhr.onerror/xhr.onreadystatechange) never fires, and the failure surfaces as UNCAUGHT JS ERROR instead. This violates the XHR contract and breaks any caller using the standard onerror callback pattern — affects every file-load failure, not just missing reference images.
Suggested fix (in JsRuntimeHost): replace the success-only middle .then with a both-outcomes handler that on error sets a non-OK status and raises error + loadend. Drop the JS-throwing continuation. Then in this PR, referenceImageExists and the new TestUtils plumbing can be deleted; BABYLON.Tools.LoadFile's onLoadFileError will fire normally on missing references and the runner records the failure cleanly. If the eager early-out is still wanted for performance/clarity, fine — but rewrite the comment to say that's the reason, not a polyfill workaround.
console.error wrapper belongs in JsRuntimeHost's Console polyfill, not AppContext
AppContext.cpp runs new Function(body)() to install a JS-side console.error wrapper that appends (new Error()).stack to the message. Concerns with this living in the host:
- Stack mangled into the message arg —
args.push('\nJS callstack:\n' + stack)pollutes the final argument seen by the polyfill callback. - Engine-format heuristics in JS — the
lines.shift()normalization papers over per-engine stack-format differences in lowest-common-denominator JS. Belongs in the engine-aware polyfill where it can be done once and shared. new Function(body)is eval-class — disallowed under CSP-style sandboxing. A native polyfill capture has no such constraint.- Fragile layering — wraps after
Polyfills::Console::Initialize. Future scripts re-wrappingconsole.erroreither shadow or wrap this.
Suggested fix: move the capture into JsRuntimeHost's Polyfills::Console::Initialize using standard N-API:
auto err = env.Global().Get("Error").As<Napi::Function>().New({});
auto jsStack = err.Get("stack").As<Napi::String>().Utf8Value();Pass jsStack to LogCallback as a separate parameter. Pure refactor — same new Error().stack capture, same fidelity, no contract change to N-API. Engine-specific stack normalization is a separate PR; this one isn't blocked on it. Drop the new Function(body) block from AppContext entirely.
Instruction docs vs. user docs
The new files in .github/instructions/ mix two distinct kinds of content:
- Genuine technical documentation — exit codes, CLI flag reference, config.json knobs, RenderDoc DLL-loading mechanics, the worked test-286 example, the pre-flight failure-pattern table, "scroll up from the error line" troubleshooting. Useful for anyone debugging Playground tests.
- Agent steering — "Read this before...", "do NOT re-implement", "Don't add parallel mechanisms", the keyword-bait "When to Use" lists, "future agents", "Cite this as the canonical example". Only useful as agent policy.
.github/instructions/ paths are auto-loaded into Copilot agent context, so anything committed there is active steering rather than passive documentation. That makes maintenance discipline matter: when these go stale, agents are misled silently — exactly the readers least likely to push back.
Suggested split:
- Tech content →
Apps/Playground/README.md(ordocs/playground-debugging.md). Curated for humans; agents can still read it from the file system. .github/instructions/keeps only tight agent-behavior policy if any — e.g., "this repo has X infrastructure already, prefer it over reinventing," with a link to the human doc for mechanics.
This separates knowledge from agent policy and avoids the trap where the two get interleaved and rot together.
Other issues called out inline.
| size_t height, | ||
| DebugLogCallback debugLog, | ||
| AdditionalInitCallback additionalInit = {}); | ||
| AdditionalInitCallback additionalInit = {}, |
There was a problem hiding this comment.
const PlaygroundOptions* with a nullptr default plus if (playgroundOptions != nullptr) guards in the ctor is a nullable raw pointer encoding "no CLI" — but PlaygroundOptions is a pure-data settings struct whose default-constructed state is already "nothing set." Suggest by-value with a default:
AppContext(..., PlaygroundOptions playgroundOptions = {});The function consumes the data once; by-value lets the caller std::move(opts) (the TestFilters / TestIndices vectors). The 6 playgroundOptions != nullptr checks in the ctor lambda all collapse to unconditional access. JS-side code in validation_native.js is unaffected — it already handles _playgroundOptions being either undefined or fully populated.
|
|
||
| namespace CommandLine | ||
| { | ||
| PlaygroundOptions Parse(int argc, const char* const* argv) |
There was a problem hiding this comment.
The 13 repeated blocks of if (auto m = match(...); m.matched) { opt.X = ...; continue; } followed by if (!err.empty()) { opt.ParseError = true; ... return opt; } have clear structural duplication that table-driven dispatch eliminates.
A static kFlags table with one row per flag (longName, shortName, kind, apply functor) plus a single match/apply loop reduces ~280 lines to ~80. The same table can drive PrintUsage so flag definitions stop being a second source of truth for the help text — which is exactly what enabled the --list 3-way mismatch (see inline on validation_native.js).
Worth doing in this PR — every new flag pays the per-flag-boilerplate tax until it's fixed.
- AppContext: take PlaygroundOptions by value (defaulted to {}); drop the
nullable-pointer dance and the six nullptr-guard branches in the ctor.
- Win32: rename g_options -> options to match the file's existing global
naming convention; rename GetCommandLineArgumentsW ->
GetCommandLineArguments since the W suffix collides with the Win32
wide-char API convention (the function returns UTF-8 narrow strings).
- CommandLine: replace 13 repeated match/apply blocks with a static
kFlags table whose rows describe long name, short name, kind, value
label, help text, and an apply functor. PrintUsage walks the same
table so flag definitions are no longer a second source of truth and
adding a flag is a single-row edit.
- --list: emit canonical TSV (index, title, referenceImage,
exclusionReason) so the listing is machine-parseable. The
exclusionReason column reflects config state (ignores
--include-excluded). Update help text and instruction docs to match.
- validation_native.js: drop the leading underscore on _opts and the
five run-counters (_ranCount etc.); the prefix only carries meaning on
_playgroundOptions, which is host-injected from C++. Remove the
duplicate \if (breakOnFail) debugger;\ in the runner -- failTest()
already breaks before invoking the done callback.
- instructions: replace the hard-coded test-index=286 example with a
--test "<title>" form so config.json reorderings don't silently break
the doc; drop hard-coded rdc / renderdoc-py version numbers.
- Revert renderdoc-gpu-debug.instructions.md to master: the diff was
pure em-dash to '--' substitution and unrelated to debuggability.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Pushed 1.
|
Make Playground crashes, asserts, and visual-test failures actually debuggable in CI and locally. Adds a callstack-printing crash handler, suppresses MSVC runtime dialogs, gives the Win32 build a real CLI with a headless mode, wires programmatic RenderDoc capture, makes the test runner report per-run summaries with pre-flight reference-image
checks, and gives
TestUtilsa host-side exit callback so tests can drive the process exit code.Crash + assert handling
Apps/Playground/Shared/Diagnostics.{h,cpp}— install Win32 SEH + CRT-invalid-parameter +abort/signal handlers that print abx::writeCallstack-style stack trace and aBN: <CATEGORY>banner before exiting; route_CrtDbgReport/_CrtSetReportModeaway from the modal "Microsoft Visual C++ Runtime Library" dialog so asserts don'tblock CI.
Core/Graphics/Source/BgfxCallback.cpp—fatal()now always logs tostderr+ bx trace, breakpoints only when a debugger is attached on Windows, then falls through toabort()instead of silent exit.Headless + CLI
Apps/Playground/Shared/CommandLine.{h,cpp}— new flag parser supports--headless,--once,--test <name>,--capture[=<frame>],--renderdoc-dll <path>, plus passthrough of script path / extra args.Apps/Playground/Win32/App.cpp— switch to/SUBSYSTEM:CONSOLE+wWinMainCRTStartup, UTF-8 console + unbuffered stdio, parse CLI, hide the top-level window in headless mode.Apps/Playground/CMakeLists.txt— addDiagnostics+CommandLine, linkbx,/SUBSYSTEM:CONSOLEfor non-UWP Win32.Apps/Playground/X11/App.cpp,macOS/main.mm— stubDiagnostics::Initializeso the new code links cleanly on every platform.RenderDoc
Diagnostics::SetupRenderDocloadsrenderdoc.dll(override via--renderdoc-dll), pins the in-process API, and arms a single-frame capture when--captureis set..github/instructions/playground/playground-renderdoc-capture.instructions.mddocuments the host-side capture flow end-to-end.Test runner (
Apps/Playground/Scripts/validation_native.js)TestUtils.referenceImageExists); tests with no reference are reported asmissingRef, not failures.--once/--testfrom the CLI;--capturetriggersTestUtils.captureNextFrameat the configured frame; bx-style debugger trigger on validation failure.TestUtils
Plugins/TestUtils— addSetExitCallback()so the host can react to tests callingexit()(drives the Playground process exit code); addreferenceImageExists(name)(Win32 checks the deployedReferenceImages/next to the executable; other platforms returntrueto preserve existing behaviour).Console output
Apps/Playground/Shared/AppContext.cpp— routeconsole.errortostderr, install a global unhandled-exception handler that funnels JS errors throughDiagnostics::DumpFailure.Docs
.github/instructions/babylon-native-debugging.instructions.md— reference for the Diagnostics output format, exit codes (0/1/2/3/-1), CLI flags,MISSING_REFERENCE_IMAGEflow, and the one-liner RenderDoc capture invocation..github/instructions/renderdoc/renderdoc-gpu-debug.instructions.md— small alignment edits for the new--captureflag.