Skip to content

Improve Playground debuggability.#1688

Open
bkaradzic-microsoft wants to merge 7 commits intoBabylonJS:masterfrom
bkaradzic-microsoft:improve-debugability
Open

Improve Playground debuggability.#1688
bkaradzic-microsoft wants to merge 7 commits intoBabylonJS:masterfrom
bkaradzic-microsoft:improve-debugability

Conversation

@bkaradzic-microsoft
Copy link
Copy Markdown
Contributor

Make Playground crashes, asserts, and visual-test failures actually debuggable in CI and locally. Adds a callstack-printing crash handler, suppresses MSVC runtime dialogs, gives the Win32 build a real CLI with a headless mode, wires programmatic RenderDoc capture, makes the test runner report per-run summaries with pre-flight reference-image
checks, and gives TestUtils a host-side exit callback so tests can drive the process exit code.

Crash + assert handling

  • Apps/Playground/Shared/Diagnostics.{h,cpp} — install Win32 SEH + CRT-invalid-parameter + abort/signal handlers that print a bx::writeCallstack-style stack trace and a BN: <CATEGORY> banner before exiting; route _CrtDbgReport / _CrtSetReportMode away from the modal "Microsoft Visual C++ Runtime Library" dialog so asserts don't
    block CI.
  • Core/Graphics/Source/BgfxCallback.cppfatal() now always logs to stderr + bx trace, breakpoints only when a debugger is attached on Windows, then falls through to abort() instead of silent exit.

Headless + CLI

  • Apps/Playground/Shared/CommandLine.{h,cpp} — new flag parser supports --headless, --once, --test <name>, --capture[=<frame>], --renderdoc-dll <path>, plus passthrough of script path / extra args.
  • Apps/Playground/Win32/App.cpp — switch to /SUBSYSTEM:CONSOLE + wWinMainCRTStartup, UTF-8 console + unbuffered stdio, parse CLI, hide the top-level window in headless mode.
  • Apps/Playground/CMakeLists.txt — add Diagnostics + CommandLine, link bx, /SUBSYSTEM:CONSOLE for non-UWP Win32.
  • Apps/Playground/X11/App.cpp, macOS/main.mm — stub Diagnostics::Initialize so the new code links cleanly on every platform.

RenderDoc

  • Diagnostics::SetupRenderDoc loads renderdoc.dll (override via --renderdoc-dll), pins the in-process API, and arms a single-frame capture when --capture is set.
  • .github/instructions/playground/playground-renderdoc-capture.instructions.md documents the host-side capture flow end-to-end.

Test runner (Apps/Playground/Scripts/validation_native.js)

  • Pre-flight reference-image check (TestUtils.referenceImageExists); tests with no reference are reported as missingRef, not failures.
  • Per-run summary line:
    Run complete. ran=N passed=N failed=N missingRef=N skipped=N
    
  • Honours --once / --test from the CLI; --capture triggers TestUtils.captureNextFrame at the configured frame; bx-style debugger trigger on validation failure.

TestUtils

  • Plugins/TestUtils — add SetExitCallback() so the host can react to tests calling exit() (drives the Playground process exit code); add referenceImageExists(name) (Win32 checks the deployed ReferenceImages/ next to the executable; other platforms return true to preserve existing behaviour).

Console output

  • Apps/Playground/Shared/AppContext.cpp — route console.error to stderr, install a global unhandled-exception handler that funnels JS errors through
    Diagnostics::DumpFailure.

Docs

  • .github/instructions/babylon-native-debugging.instructions.md — reference for the Diagnostics output format, exit codes (0/1/2/3/-1), CLI flags,
    MISSING_REFERENCE_IMAGE flow, and the one-liner RenderDoc capture invocation.
  • .github/instructions/renderdoc/renderdoc-gpu-debug.instructions.md — small alignment edits for the new --capture flag.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves Babylon Native Playground debuggability (CI + local) by standardizing crash/assert reporting with callstacks, adding a Win32 CLI/headless flow, wiring RenderDoc capture support, and enhancing the visual-test runner + TestUtils so failures and exits can be diagnosed and surfaced via process exit codes.

Changes:

  • Add Diagnostics crash/assert handling (callstack banners, finish-line, RenderDoc preload/version reporting) and integrate it into Playground apps.
  • Add Win32 CLI parsing (--headless, --once, --test, --capture, --renderdoc-dll, etc.) and plumb options into JS via _playgroundOptions.
  • Improve visual-test runner reporting (filtering, per-run summary, pre-flight reference checks) and add a host-side TestUtils exit callback hook.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
Plugins/TestUtils/Source/TestUtils.h Adds internal bridge for host exit callback + JS referenceImageExists binding.
Plugins/TestUtils/Source/TestUtils.cpp Implements SetExitCallback storage/dispatch for TestUtils.exit().
Plugins/TestUtils/Source/TestUtils_WinRT.cpp Invokes host exit callback on exit(); stubs referenceImageExists.
Plugins/TestUtils/Source/TestUtils_Win32.cpp Invokes host exit callback on exit(); implements Win32 reference-image existence check.
Plugins/TestUtils/Source/TestUtils_Unix.cpp Invokes host exit callback on exit(); stubs referenceImageExists.
Plugins/TestUtils/Source/TestUtils_macOS.mm Invokes host exit callback on exit(); stubs referenceImageExists.
Plugins/TestUtils/Source/TestUtils_iOS.mm Stubs referenceImageExists for iOS.
Plugins/TestUtils/Source/TestUtils_Android.cpp Stubs referenceImageExists for Android.
Plugins/TestUtils/Include/Babylon/Plugins/TestUtils.h Public API for registering a host exit callback.
Core/Graphics/Source/BgfxCallback.cpp Makes bgfx fatals reliably log to stderr + abort (with debugger-aware break on Windows).
Apps/Playground/X11/App.cpp Initializes diagnostics early on X11 builds.
Apps/Playground/Win32/App.cpp Converts Win32 Playground into console/CLI-driven entry flow (headless, RenderDoc preload, exit-code wiring).
Apps/Playground/Shared/Diagnostics.h Declares cross-platform diagnostics API (crash handling, finish-line, RenderDoc preload).
Apps/Playground/Shared/Diagnostics.cpp Implements crash/assert handlers, banner dumping, finish-line, and Windows RenderDoc DLL resolution/version output.
Apps/Playground/Shared/CommandLine.h Declares PlaygroundOptions and CLI parse/usage interface.
Apps/Playground/Shared/CommandLine.cpp Implements CLI parsing and usage text (flags for tests/headless/capture/renderdoc).
Apps/Playground/Shared/AppContext.h Adds PlaygroundOptions plumb-through to runtime initialization.
Apps/Playground/Shared/AppContext.cpp Routes JS errors/console.error into diagnostics banners; injects _playgroundOptions; enables bgfx diagnostic output mirroring.
Apps/Playground/Scripts/validation_native.js Adds CLI-driven filtering/once/listing, run summary, capture timing, and pre-flight reference checks.
Apps/Playground/macOS/main.mm Initializes diagnostics early on macOS build.
Apps/Playground/CMakeLists.txt Includes new Shared sources; links bx; switches Win32 desktop target to console subsystem with custom entry.
.github/instructions/renderdoc/renderdoc-gpu-debug.instructions.md Documentation alignment edits for the new capture flag language.
.github/instructions/playground/playground-renderdoc-capture.instructions.md New end-to-end RenderDoc capture workflow doc for Playground tests.
.github/instructions/babylon-native-debugging.instructions.md New debugging reference doc for diagnostics output, flags, exit codes, and common scenarios.
Comments suppressed due to low confidence (1)

Apps/Playground/Win32/App.cpp:123

  • The new --list flag is parsed in wWinMain, but if no scripts are provided the app still loads experience.js and never runs validation_native.js (the only place that actually implements list output). As a result, Playground --list does not currently list tests or exit early as advertised; consider handling ListTests in native code (print + exit), or auto-loading validation_native.js when ListTests is set and Scripts is empty.
        if (g_options.Scripts.empty())
        {
            appContext->ScriptLoader().LoadScript("app:///Scripts/experience.js");
        }
        else
        {
            for (const auto& arg : g_options.Scripts)
            {
                appContext->ScriptLoader().LoadScript(GetUrlFromPath(arg));
            }

            appContext->ScriptLoader().LoadScript("app:///Scripts/playground_runner.js");
        }

Comment thread Plugins/TestUtils/Source/TestUtils.cpp
Comment thread Apps/Playground/Shared/CommandLine.cpp
Comment thread Apps/Playground/Scripts/validation_native.js
Comment thread Apps/Playground/Scripts/validation_native.js
Comment thread Apps/Playground/Shared/AppContext.cpp Outdated
Comment thread .github/instructions/babylon-native-debugging.instructions.md Outdated
bkaradzic added 5 commits May 6, 2026 14:16
 - Apps/Playground/Android/BabylonNative/CMakeLists.txt: BabylonNativeJNI compiled AppContext.cpp (now references Diagnostics::DumpFailure / SetExitCode / PrintFinishLine) but
 not Diagnostics.cpp; add it to sources and link bx.
 - Apps/Playground/Shared/Diagnostics.cpp: drop redundant Diagnostics:: qualifier on SetupRenderDoc -- the function is already inside namespace Diagnostics, which clang
-Wextra-qualification flags as an error on Linux/Android.
 - Apps/Playground/Shared/Diagnostics.cpp: gate the _CrtSetReportMode / _CrtSetReportFile / _CrtSetReportHook loop on #if defined(_DEBUG); these become no-op macros without
_DEBUG, leaving the `reportType` loop variable unused (C4189 -> C2220 under /WX in RelWithDebInfo + Sanitizers).
 Napi::Env::RunScript is a V8-only Node-API extension; JSI does not implement it (C2039: 'RunScript': is not a member of 'Napi::Env'). Use
env.Global().Get("Function").As<Napi::Function>().New({source}).Call({}) instead, which is portable across V8 and JSI/JSC.
@bkaradzic-microsoft bkaradzic-microsoft enabled auto-merge (squash) May 7, 2026 15:37
Copy link
Copy Markdown
Contributor

@bghgary bghgary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Reviewed by Copilot on behalf of @bghgary]

Substantial improvement to BN diagnostics — the compareFrame/stopFrame separation for --capture and the host-side SetExitCallback hook are both well-placed. Most of the substantive concerns below are about where the code lives (right responsibility) rather than what it does.

Architectural concerns

referenceImageExists is a workaround for a JsRuntimeHost XHR polyfill bug

The pre-flight reference-image check in validation_native.js is motivated by:

Without this, BN's URL polyfill throws a synchronous std::runtime_error inside its async task on a missing app:///ReferenceImages/.png, bypassing BABYLON.Tools.LoadFile's onLoadFileError and surfacing as UNCAUGHT JS error before the runner can record the failure.

The actual bug is in JsRuntimeHost/Polyfills/XMLHttpRequest/Source/XMLHttpRequest.cpp::Send():

m_request.SendAsync()
    .then(...)                                    // sendRegion cleanup
    .then(m_runtimeScheduler, none, [this]() {    // success-only continuation
        SetReadyState(ReadyState::Done);
        RaiseEvent(EventType::LoadEnd);
        m_eventHandlerRefs.clear();
    })
    .then(inline_scheduler, none, [env](expected<void, exception_ptr> result) {
        if (result.has_error()) {
            Napi::Error::New(env, result.error()).ThrowAsJavaScriptException();
        }
    });

The success-only middle .then is skipped on async failure (per arcana semantics), so ReadyState::Done and the LoadEnd/error events are never raised. The final continuation throws as a JS exception — so BABYLON.Tools.LoadFile's onLoadFileError (wired to xhr.onerror/xhr.onreadystatechange) never fires, and the failure surfaces as UNCAUGHT JS ERROR instead. This violates the XHR contract and breaks any caller using the standard onerror callback pattern — affects every file-load failure, not just missing reference images.

Suggested fix (in JsRuntimeHost): replace the success-only middle .then with a both-outcomes handler that on error sets a non-OK status and raises error + loadend. Drop the JS-throwing continuation. Then in this PR, referenceImageExists and the new TestUtils plumbing can be deleted; BABYLON.Tools.LoadFile's onLoadFileError will fire normally on missing references and the runner records the failure cleanly. If the eager early-out is still wanted for performance/clarity, fine — but rewrite the comment to say that's the reason, not a polyfill workaround.

console.error wrapper belongs in JsRuntimeHost's Console polyfill, not AppContext

AppContext.cpp runs new Function(body)() to install a JS-side console.error wrapper that appends (new Error()).stack to the message. Concerns with this living in the host:

  • Stack mangled into the message argargs.push('\nJS callstack:\n' + stack) pollutes the final argument seen by the polyfill callback.
  • Engine-format heuristics in JS — the lines.shift() normalization papers over per-engine stack-format differences in lowest-common-denominator JS. Belongs in the engine-aware polyfill where it can be done once and shared.
  • new Function(body) is eval-class — disallowed under CSP-style sandboxing. A native polyfill capture has no such constraint.
  • Fragile layering — wraps after Polyfills::Console::Initialize. Future scripts re-wrapping console.error either shadow or wrap this.

Suggested fix: move the capture into JsRuntimeHost's Polyfills::Console::Initialize using standard N-API:

auto err = env.Global().Get("Error").As<Napi::Function>().New({});
auto jsStack = err.Get("stack").As<Napi::String>().Utf8Value();

Pass jsStack to LogCallback as a separate parameter. Pure refactor — same new Error().stack capture, same fidelity, no contract change to N-API. Engine-specific stack normalization is a separate PR; this one isn't blocked on it. Drop the new Function(body) block from AppContext entirely.

Instruction docs vs. user docs

The new files in .github/instructions/ mix two distinct kinds of content:

  • Genuine technical documentation — exit codes, CLI flag reference, config.json knobs, RenderDoc DLL-loading mechanics, the worked test-286 example, the pre-flight failure-pattern table, "scroll up from the error line" troubleshooting. Useful for anyone debugging Playground tests.
  • Agent steering"Read this before...", "do NOT re-implement", "Don't add parallel mechanisms", the keyword-bait "When to Use" lists, "future agents", "Cite this as the canonical example". Only useful as agent policy.

.github/instructions/ paths are auto-loaded into Copilot agent context, so anything committed there is active steering rather than passive documentation. That makes maintenance discipline matter: when these go stale, agents are misled silently — exactly the readers least likely to push back.

Suggested split:

  • Tech content → Apps/Playground/README.md (or docs/playground-debugging.md). Curated for humans; agents can still read it from the file system.
  • .github/instructions/ keeps only tight agent-behavior policy if any — e.g., "this repo has X infrastructure already, prefer it over reinventing," with a link to the human doc for mechanics.

This separates knowledge from agent policy and avoids the trap where the two get interleaved and rot together.

Other issues called out inline.

size_t height,
DebugLogCallback debugLog,
AdditionalInitCallback additionalInit = {});
AdditionalInitCallback additionalInit = {},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const PlaygroundOptions* with a nullptr default plus if (playgroundOptions != nullptr) guards in the ctor is a nullable raw pointer encoding "no CLI" — but PlaygroundOptions is a pure-data settings struct whose default-constructed state is already "nothing set." Suggest by-value with a default:

AppContext(..., PlaygroundOptions playgroundOptions = {});

The function consumes the data once; by-value lets the caller std::move(opts) (the TestFilters / TestIndices vectors). The 6 playgroundOptions != nullptr checks in the ctor lambda all collapse to unconditional access. JS-side code in validation_native.js is unaffected — it already handles _playgroundOptions being either undefined or fully populated.

Comment thread Apps/Playground/Win32/App.cpp Outdated
Comment thread Apps/Playground/Win32/App.cpp Outdated

namespace CommandLine
{
PlaygroundOptions Parse(int argc, const char* const* argv)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 13 repeated blocks of if (auto m = match(...); m.matched) { opt.X = ...; continue; } followed by if (!err.empty()) { opt.ParseError = true; ... return opt; } have clear structural duplication that table-driven dispatch eliminates.

A static kFlags table with one row per flag (longName, shortName, kind, apply functor) plus a single match/apply loop reduces ~280 lines to ~80. The same table can drive PrintUsage so flag definitions stop being a second source of truth for the help text — which is exactly what enabled the --list 3-way mismatch (see inline on validation_native.js).

Worth doing in this PR — every new flag pays the per-flag-boilerplate tax until it's fixed.

Comment thread Apps/Playground/Scripts/validation_native.js Outdated
Comment thread Apps/Playground/Scripts/validation_native.js Outdated
Comment thread Apps/Playground/Scripts/validation_native.js Outdated
Comment thread .github/instructions/babylon-native-debugging.instructions.md Outdated
Comment thread .github/instructions/playground/playground-renderdoc-capture.instructions.md Outdated
Comment thread .github/instructions/renderdoc/renderdoc-gpu-debug.instructions.md Outdated
- AppContext: take PlaygroundOptions by value (defaulted to {}); drop the
  nullable-pointer dance and the six nullptr-guard branches in the ctor.
- Win32: rename g_options -> options to match the file's existing global
  naming convention; rename GetCommandLineArgumentsW ->
  GetCommandLineArguments since the W suffix collides with the Win32
  wide-char API convention (the function returns UTF-8 narrow strings).
- CommandLine: replace 13 repeated match/apply blocks with a static
  kFlags table whose rows describe long name, short name, kind, value
  label, help text, and an apply functor. PrintUsage walks the same
  table so flag definitions are no longer a second source of truth and
  adding a flag is a single-row edit.
- --list: emit canonical TSV (index, title, referenceImage,
  exclusionReason) so the listing is machine-parseable. The
  exclusionReason column reflects config state (ignores
  --include-excluded). Update help text and instruction docs to match.
- validation_native.js: drop the leading underscore on _opts and the
  five run-counters (_ranCount etc.); the prefix only carries meaning on
  _playgroundOptions, which is host-injected from C++. Remove the
  duplicate \if (breakOnFail) debugger;\ in the runner -- failTest()
  already breaks before invoking the done callback.
- instructions: replace the hard-coded test-index=286 example with a
  --test "<title>" form so config.json reorderings don't silently break
  the doc; drop hard-coded rdc / renderdoc-py version numbers.
- Revert renderdoc-gpu-debug.instructions.md to master: the diff was
  pure em-dash to '--' substitution and unrelated to debuggability.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bkaradzic-microsoft
Copy link
Copy Markdown
Contributor Author

Pushed 5b4ad6bd addressing the inline items (10 of 12). Two open routes for the architectural ones — want to confirm preference before opening companion work.

1. referenceImageExists / XHR polyfill bug

Confirmed your read of XMLHttpRequest.cpp::Send() — the success-only .then at L265 skips SetReadyState(Done) + RaiseEvent(LoadEnd) + m_eventHandlerRefs.clear() on async failure and falls through to the JS-throwing trailing continuation at L273. So BABYLON.Tools.LoadFile's onLoadFileError can never fire. Two routes:

  • A — open a JsRuntimeHost PR replacing the success-only middle .then with a both-outcomes handler (set non-OK status, raise error + loadend, drop the JS-throwing continuation). Bump the dep here, delete TestUtils::referenceImageExists + the runner pre-check before this PR lands.
  • B — land this PR's pre-check as-is, file a JsRuntimeHost issue, remove the workaround in a follow-up here once that's fixed.

2. console.error stack wrapper

Agree the new Function(body)() capture belongs in Polyfills::Console::Initialize. Three routes:

  • A — JsRuntimeHost PR adds a third callback parameter (e.g. jsStack) populated via N-API (env.Global().Get("Error").New({}).Get("stack")); bump here, delete the new Function(body) block.
  • B — land this PR's wrapper, file an issue, follow up.
  • C — drop the wrapper from this PR entirely. Rely on the unhandled-exception handler's banner alone; console.error keeps just the legacy [Error] ... line. (console.error covers Babylon.js's recoverable-but-still-bad paths that often don't reach the unhandled handler, but the JS engine-format heuristics in the wrapper are fragile, so dropping is defensible.)

Preference for each?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants