fix: bundled QA retest findings from 0.7.2 (2026-04-16)#228
Open
szymdzum wants to merge 40 commits into
Open
Conversation
…very generateSelector() produced input[name="X"] for any name-bearing input, which collided across radios/checkboxes sharing a name. The resolver then treated the preview selector as uniquely identifying and dropped the index, so dom click/fill on a grouped input always hit the first match. For radios and checkboxes that carry a value, append [value="..."] to the preview so each element in a group resolves uniquely.
fetchNetworkRequests() called fetchPreviewData() with no lastN argument, so the daemon's peek handler applied its 10-item default and truncated the response before the list command ever saw it. The --last option operated on that pre-truncated slice, so users couldn't retrieve more than 10 requests regardless of the value (the HAR export path, which uses a different IPC call, kept working). Pass lastN=0 (unlimited) and let the list command's own --last do the slicing, matching the pattern fetchConsoleMessages already uses.
status.ts suggested 'bdg query <script>' (the command is 'bdg dom eval') and 'bdg tabs' (no such command). dom.ts suggested 'bdg details dom N' (details only supports network/console). All three sent users down dead-end paths. - status 'Commands' section: 'Query browser: bdg dom eval <script>' - status 'Suggestions' section: drop the 'List Chrome tabs' entry - dom query 'Next steps': point to 'bdg dom a11y describe N' instead
handleIndexGetSemantic() passed the string 'cached query' as the selector argument to elementAtIndexNotFoundError, so users saw 'Re-run "bdg dom query cached query" to refresh the cache' — where 'cached query' reads as a required argument name. Make selector optional on the error helper and emit a generic 'Re-run your original query' suggestion when the caller doesn't know the original selector.
formatHeaderSection prepended a colon to the key before calling keyValue(), which adds its own colon for the padding. Result was 'Referer:: https://...' throughout headers output. Let keyValue own the colon.
…ment findTargetRequestForHeaders() fell back to any request whose mimeType contained 'html' when no current-nav Document was found, and further fell back to any request with response headers. A favicon.ico 404 response body typed 'text/html' would win the mimeType fallback, so 'bdg network document' returned the favicon on pages like /post. With no html-mimed request at all, 'bdg network headers' would return an arbitrary request (the headers fallback). Drop both loose fallbacks. Prefer the current-navigation Document, then any Document resource, then error — resourceType is the only signal that reliably distinguishes the main document from collateral. Contract tests updated to encode resourceType-based selection and to verify favicon-with-html-mimetype no longer wins.
The default action treats any bare argument as a URL, so 'bdg statsu' was interpreted as 'http://statsu' and launched a Chrome session. Real subcommand names get routed by Commander first, but their misspellings fall through to the [url] positional. Before URL validation, check the argument against the registered command names with a Levenshtein-2 threshold and error out with a 'Did you mean' suggestion. Truly bogus single-token strings (no dots/slashes/ scheme and not a command typo) still reach Chrome — scope-limited fix focused on the typo case the report called out.
Commander treats '' as a present argument, so 'bdg dom query ""' slipped through and reached the CDP query helper which silently returned zero nodes and a success exit code. Agents couldn't tell the difference between 'selector matched nothing' (exit 0, valid answer) and 'you passed an empty string' (exit 81, usage error). Guard the selector at the command boundary.
…nothing buildHeader fell into the 'last X of Y' branch even when the filter produced zero matches, so empty results read as 'last 0 of 10' — which is technically true but reads like a paging error. Surface the intent: '(no matches; 10 captured)' when the filter emptied the list, '(none captured)' when there were no requests to begin with.
'bdg https://example.com status' silently ignored 'status' and launched a session because the root Commander program accepted arbitrary excess arguments. Turn on allowExcessArguments(false) on the root so any trailing positional after [url] raises Commander's 'too many arguments' error instead of being dropped. Subcommands declare their own arguments and are unaffected.
Two linked a11y issues: 1. Bare 'bdg dom a11y' printed the help block instead of doing anything, even though the search argument is documented as optional. Route bare invocations to handleA11yTree so the subcommand dumps the tree — the most useful default for an a11y-shorthand. 2. 'bdg dom a11y Example' wrapped the term in asterisks to form 'name:*Example*', which parseQueryPattern then passed through to matchesPattern where includes() looks for the literal string '*example*'. A heading named 'Example Domain' didn't contain the asterisks so the match failed. Drop the asterisk wrap in the bare-mode shorthand and strip leading/trailing '*' wildcards in parseQueryPattern. Users who habit in asterisks now get the expected substring match; users who don't never had to type them in the first place.
The empty-state guard checked options.network === undefined, but Commander defaults the flag to false, so 'bdg peek --console' never hit the marker branch when there were no messages — users saw the PREVIEW header, skipped straight to the tip line, and couldn't tell the difference between 'no data captured' and 'command silently misbehaved'. Gate on options.console / !options.network / hasConsoleData so the empty console section renders its '(none)' marker in both the user- asked-for-console and default cases, matching the DOM block.
…%O, %%)
formatConsoleArgs blindly joined every RemoteObject with spaces, so
console.log('%cFoo', 'color:red') rendered as '%cFoo color:red' —
the format directive and its CSS argument both leaked into the text,
badly on pages with lots of %c styling.
Detect a format string in the first argument and apply the WHATWG
console substitution rules for the common directives: %s/%d/%i/%f
consume and format their arg, %o/%O render structured objects, %c
silently drops the CSS arg, %% renders a literal %. Unknown directives
pass through as literal text.
formatConsole chose summarize mode whenever --list wasn't set, so 'bdg console --level info' rendered the error/warning summary header (and 'No errors or warnings found') on an already-filtered info stream. Treat an explicit --level as implying chronological mode — the summary's deduplicated errors/warnings layout has no meaningful equivalent for info/debug.
Peek's --network/--console/--dom flags were documented as filters but implemented as 'addSections'. The gates only dropped the network block when --console was set and vice-versa — --dom never suppressed network or console, so 'bdg peek --dom' printed everything plus the tree. Introduce a hasExplicitFilter signal: when any of network/console/dom is explicitly set, only the requested sections render. With no flags, the default (network + console) stays; DOM remains opt-in due to the tree's verbosity.
formatDomEval ran every result through JSON.stringify, so a script returning a string came back wrapped in quotes — agents parsing document.title had to strip them (and couldn't tell a plain string apart from an accidentally-JSON-encoded one). Emit strings raw; numbers/objects/booleans/null keep JSON formatting so they remain round-trippable.
… none The main entrypoint called ensureDaemonRunning() for every non- documentation invocation, so status and stop spawned (and then tore down) an entire worker just to learn what the caller already implied by running them: there's no session yet. Slow and wasteful on clean systems. Gate the spawn: for the status and stop commands, only launch a daemon when one is already running (reconnect path). Otherwise let each command's existing daemon-connection-error handler produce the appropriate 'not running' message without the side effect.
--brief dropped the Value column entirely, so after running fill/click agents had to switch to the full table to verify the change landed — defeating the point of a quick-state summary. Add a compact VAL column that shows the field's current state: - text-like fields → quoted value, or ∅ when empty - checkbox/radio/switch → [x] / [ ] Widen the brief table to 65 cols to fit the new column without truncating labels further.
…GUMENTS) Commander's default exit handler collapsed every usage failure — unknown option, missing required argument, too-many arguments, invalid argument value — onto exit 1, which scripts couldn't distinguish from a real crash. Install exitOverride on the root program to translate the usage-level Commander error codes to EXIT_CODES.INVALID_ARGUMENTS while letting help/version still exit 0. Commander prints its own error text to stderr, so the override only needs to own the exit code.
setTargetInfo() was only called during initial Chrome setup; subsequent navigations (location.href assignment, pushState-style routing, server redirects) left the cached URL and title frozen at the original values. Agents checking 'bdg status' after a navigation got stale state and couldn't use it to confirm page changes. - Thread an onMainFrameNavigation callback through startNavigationTracking and subscribe to Page.frameNavigated + Page.navigatedWithinDocument so in-document routing updates the URL too. - Update the stored URL immediately from the event (cheap, event-driven); refresh the title via Runtime.evaluate on Page.loadEventFired plus scheduled retries, since the event fires before the new document's <title> is ready. - Fall back to host+pathname when document.title is empty so the format matches what Chrome's /json endpoint populated at session start.
…submit waitForCompletion only subscribed to Page.frameNavigated when the caller passed --wait-navigation, so a default 'bdg dom submit' that POSTed and then navigated came back with Navigation: no because the listener was never attached — navigationOccurred stayed at its initial false. Always subscribe to Page.frameNavigated so the reported flag reflects what happened; waitNavigation continues to control whether we block for navigation. Filter to main-frame events so subframe loads during submit (ads, iframes) don't falsify the flag.
The soft deprecation had lived long enough that users were trained to ignore it while scripts silently depended on the shorthand. Cut it: any invocation of 'bdg peek --network' now fails with exit 81 and a message pointing to 'bdg network list' — the strict superset replacement. Option description updated so 'bdg peek --help' reflects the removal.
… --verbose The 50+ line post-start walkthrough is genuinely useful when a human runs bdg at their terminal for the first time, but it drowns script and CI output where nobody reads it. Drive the choice off process.stdout.isTTY with explicit overrides: - TTY → landing page (unchanged for interactive users) - piped/redirected → 'Session started: <url>' only - --verbose → force the landing page even when piped - --quiet → already forced minimal; stays that way No breaking change for interactive usage, and automation no longer needs --quiet purely to tame the banner.
--quiet was scoped to 'bdg <url>' via the start command's own option list, so 'bdg --quiet status' parsed the flag (applyCollectorOptions attaches to the root) but no other formatter consulted it. Users got a flag that looked universal but only did anything at session start. Contract: --quiet suppresses tips, 'Next steps:', 'Suggestions:', and 'Commands:' sections while keeping the primary data/result. --json continues to be the machine-readable path; the two compose. Covered in this pass (where the noise actually lives): - bdg status — drops the 'Commands:' footer - bdg status (no session) — drops the 'Suggestions:' block - bdg dom query — drops both 'Next steps:' and 'Suggestions:' - bdg peek — drops the 'Tip:' tail in compact and verbose modes - bdg dom submit — drops the 'Next steps:' block and updates its reference to point at 'bdg network list' instead of the now-removed 'bdg peek --network' - daemon-startup log lines on the main entrypoint are also silenced Commands without decorative sections accept the flag as a global and no-op on it.
Four regressions/new bugs surfaced in the 2026-04-16 retest:
- network headers <bogus-id> surfaced the worker's "not found" error via
IPCError default, landing on exit 104 (UNHANDLED_EXCEPTION). Check
response.status explicitly and return RESOURCE_NOT_FOUND (83) with a
"list captured requests" suggestion. Same pattern for the no-arg
document alias when nothing is captured.
- cdp <method> with a CDP "Invalid parameters" reply also fell through
to 104. Map likely-user-input CDP errors (invalid parameters / must
have / is required) to INVALID_ARGUMENTS (81) and point at
`--describe` for the parameter schema.
- dom query "Next steps -> Extract text" rendered
`bdg cdp Runtime.evaluate --params '{"expression": "document.querySelector('h1').textContent"}'`
The outer single-quote wrapper collided with the inner selector quotes,
so pasted verbatim the shell split the string and the CDP call ran
`document.querySelector(h1)` -> ReferenceError. Switch to
`bdg dom eval "document.querySelector('sel').textContent"` which is
bash-safe (double-quoted outer, single-quoted inner, no nesting).
- dom get <index> stale-cache path threw CommandError with
RESOURCE_NOT_FOUND (83), but CLAUDE.md documents STALE_CACHE (87) as
the semantic code for stale nodeIds. Align with the documented
convention so agents can discriminate "selector never matched" from
"indexed cache went stale."
Moving the slice into the data-producing lambda so `--json` and the
human formatter agree on what "last N" means. Prior behaviour:
bdg network list --last 5 --json
-> .data.requests = [all 80]
-> .data.filtered = [all 80] # bug: ignored --last
-> .data.totalCount = 80
bdg network list --last 5
-> "last 5 of 80" header, 5 rows printed
Agents scripting against the JSON path got the full capture regardless
of --last, inverse of the original 10-item cap but equally confusing.
New shape:
.data.requests - raw unfiltered capture
.data.filtered - post-filter, post-slice (what humans see)
.data.filteredTotal - post-filter, pre-slice (new, so callers can
still reason about "how much survived the
filter" without re-running)
.data.totalCount - unchanged (full capture count)
…headers #3 follow-up: commit 3781fcb guarded typos close to existing commands (edit distance <= 2), so `bdg statuss` is rejected with "Did you mean bdg status?". But arbitrary single tokens like `bdg unknown` or `bdg myproject` still fell through validateUrl because normalizeUrl blindly prepends http://, producing a parseable `http://unknown/`. A real Chrome process would launch, against what the user almost certainly meant. Add a pre-normalisation shape check: the input must contain `.`, `/`, `:`, an explicit scheme, or match `localhost` to count as a URL. Users with a genuine bare intranet hostname can disambiguate with an explicit `bdg http://foo` or a port (`bdg foo:80`). Failures exit 80 (INVALID_URL) with a message that points at both escape hatches and `bdg --help`. #6 follow-up: commit f6ebaaf intentionally made `bdg network headers` default to the main document when no ID is passed (deterministic vs the old "random request" behaviour). Leave the default, but when the human path runs without an ID add a one-line stderr hint pointing at `bdg network list` so agents don't silently assume they got the request they asked for. Skip the hint under --json to keep the automation path stderr-clean (matches #9/#10 contract).
C2: buildSessionOptions ran outside the try/catch in start.ts's action handler, so any CommandError it threw (e.g. `--max-body-size abc`) escaped as an unhandled node exception with a dist-path stack trace and exit 1. Move the call inside the existing try/catch so the validation error goes through handleValidationError and exits 81. S5: positiveIntRule always printed "value" as the field name and used invalidIntegerError even when the value WAS an integer but out of range (e.g. `peek --last 0` → "0 is not a valid integer"). Thread fieldName through IntegerRuleOptions and split the message into two: invalidIntegerError for NaN, outOfRangeError for min/max violations. Also drop the leading "Error: " from these messages — genericError at the boundary adds it once.
The canonical message producers (daemonNotRunningError, sessionAlreadyRunningError, sessionTargetMismatchError, elementNotFoundError, unknownError) all baked "Error: " into their output, and CommandRunner.ts:124 re-wrapped result.error with genericError, producing "Error: Error: Daemon not running" in the human output and a leaked "Error: " prefix inside the JSON .error field. Strip the literal prefix from the message producers and wrap every direct caller (CommandRunner, startHelpers) with genericError so the prefix lands exactly once. JSON consumers now get a clean payload.
details network/console relied on validateIPCResponse which converts any error-status response into a generic IPCError. CommandRunner doesn't honour IPCError.exitCode, so the worker's "Network request not found" and "Console message not found" messages surfaced as exit 104 (UNHANDLED_EXCEPTION) — making it look like a bdg crash. Check response.status directly in details.ts; when the worker's error message matches the not-found / no-messages patterns, throw a CommandError with RESOURCE_NOT_FOUND (83) and an actionable suggestion. Other IPC errors still fall through to IPCError. Separately, the worker returned "available: 0--1" when no console messages existed. Check for empty array first and return "No console messages captured yet for this session".
S3: a user-authored script throwing (ReferenceError, SyntaxError, Promise rejection, throw statement) exited 110 (SOFTWARE_ERROR). That code is reserved for bdg bugs — user code blowing up is an argument problem. Map to INVALID_ARGUMENTS (81) in evalHelpers. S4: the eval error tip recommended a --file option that doesn't exist. Replace with a heredoc example that actually works.
BREAKING: data.requests is now the filtered + --last-sliced list (mirroring the human output). Previously it was the unfiltered full capture, and the filtered list hid at data.filtered — so anyone reading data.requests after applying --filter got silently wrong answers. New shape: data.requests — final result (what the human output shows) data.matchCount — matches before --last slicing data.totalCount — unfiltered capture count Removed: data.filtered, data.filteredTotal, data.total. Pre-1.0 change; the dual-field shape was introduced in the previous QA branch so there's minimal wild usage. Also thread fieldName through positiveIntRule for --last so invalid values say "--last out of range" instead of "--value out of range".
Every other --json command uses `data: {...named fields}`. getCookies
put the array directly at `data: [...]`, so scripts doing
`.data.cookies` silently got undefined.
Shape is now consistent: `data.cookies` carries the array.
C4: bdg dom click "", click -1, fill "" "" reached the fill/click
helper, whose browser-side script runs document.querySelectorAll()
and threw SyntaxError. The helper dumped its own JS source as
"Expression received: (function(selector, index) { ... }" and printed
recovery hints that re-invoked the broken call (bdg dom query "").
Add a boundary guard in runElementCommand that rejects empty /
whitespace-only selectors and negative numeric indices with
INVALID_ARGUMENTS (81) before the resolver or helper is called.
S12: bdg dom query "###" (invalid CSS) previously swallowed the CDP
error and reported "no nodes found" with exit 0. queryDOMElements now
checks for CDP status:'error' on DOM.querySelectorAll and translates
the rejection to exit 81 with a selector-syntax hint.
The previous fix(#14) skipped daemon spawn for status/stop but left cleanup out. Running bdg cleanup on a clean directory would start a daemon, do nothing, and leave it running. cleanup reads/removes files under ~/.bdg directly; it doesn't need an IPC peer. Add it to LOCAL_ONLY_WHEN_NO_DAEMON.
When --chrome-ws-url pointed at an unreachable WebSocket, the user saw
a multi-line dump of the worker's stderr: "[worker] Starting (PID ...),
[worker] Config: {...}, [worker] Connecting to existing Chrome..."
with a trailing "WebSocket closed: 1006".
The worker log is already written to ~/.bdg/daemon.log via the daemon's
stderr. Drop WorkerStartError.details from the IPC-forwarded message
and, when chromeWsUrl is set, replace the generic "Worker process
exited before sending ready signal" with a targeted message that names
the URL and the typical cause.
bdg http://127.0.0.1:1/ and bdg https://expired.badssl.com/ previously exited 0 with "Session started" even though Chrome rendered a chrome-error:// page. Any automation that started bdg against a broken URL silently carried on. - Add EXIT_CODES.NAVIGATION_FAILED (91) — next free in the 80-99 user-error range. INVALID_URL (80) is for syntax, which isn't this. - In the worker, check Page.navigate response for errorText and throw NavigationFailedError carrying the semantic exit code. - Worker forwards the exit code through process.exit(91) so the daemon can distinguish nav failure from a generic worker crash. - Daemon's WorkerStartError gains a NAVIGATION_FAILED code; extract the concrete error text from the worker's stderr and forward a clean one-line message to the CLI. - IPCErrorCode.NAVIGATION_FAILED and errorMapping wire the code through to EXIT_CODES.NAVIGATION_FAILED. Smoke test: bdg http://127.0.0.1:1/ → exit 91, "net::ERR_UNSAFE_PORT" bdg https://expired.badssl.com/ → exit 91, "net::ERR_CERT_DATE_INVALID" bdg example.com → exit 0 (unchanged)
bdg dom eval "while(true){}" used to wedge the session for 30 seconds
per call: Runtime.evaluate never returned, cdp.send hit its own 30s
timeout, and every subsequent dom eval hung the same way because the
Runtime was still executing the loop.
- Pass Runtime.evaluate's own `timeout` parameter (10s). Chrome now
terminates the script itself and returns exceptionDetails, so the
browser isn't left spinning.
- Wrap cdp.send in try/catch and, on either outcome (cdp-side timeout
or Chrome-side "Execution was terminated"), call
Runtime.terminateExecution to kill anything still running. This keeps
the session responsive for the next call.
- Map the timeout to EXIT_CODES.CDP_TIMEOUT (102). The old behaviour
reached CommandRunner as a generic failure and exited 101
(CDP_CONNECTION_FAILURE), which falsely implied the WebSocket died.
Smoke:
bdg dom eval "while(true){}" → exit 102 after 10s
bdg dom eval "1+1" → exit 0, returns 2 immediately
Also move a misplaced helper in startHelpers.ts below its imports.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Bundled fixes for the 11 critical/significant findings from the second-pass manual QA on 0.7.2 (see
docs/quality/MANUAL_TESTING_2026-04-16_retest.mdon the parent branch). Stacks on top of #227 — merge that first for a clean diff.Critical
bdg <url> --max-body-size abcleaked a Node stack trace (validation threw outside the action try/catch). Now exits 81 with a clean message.bdg dom eval "while(true){}"wedged the Runtime for 30 s per retry. AddedRuntime.evaluate timeout=10s+Runtime.terminateExecutionon timeout; exits 102.bdg dom click "",click -1,fill "" ""leaked the helper's JS source. Boundary guard inrunElementCommandrejects these with exit 81.bdg http://127.0.0.1:1/andbdg https://expired.badssl.com/exited 0 on chrome-error pages. DetectPage.navigateerrorTextand exit the newEXIT_CODES.NAVIGATION_FAILED = 91.bdg network list --json.data.requestsnow carries the filter +--lastresult (was unfiltered). Dropped.data.filtered, renamed total counts tomatchCount/totalCount.--chrome-ws-urlbad URL dumped multi-line worker stderr. One-line user-facing error now; raw log goes to~/.bdg/daemon.log.Significant
bdg statusprintedError: Error: Daemon not running. Strip the prefix from the message producers and letgenericErroradd it once.details network/console <missing>exited 104 (UNHANDLED_EXCEPTION). Translate to 83 (RESOURCE_NOT_FOUND); also fix"available: 0--1"when no console messages exist.dom evaluser-script errors exited 110 (SOFTWARE_ERROR, reserved for bdg bugs). Map to 81.--fileoption hint indom evalerror tips.peek --last 0said"0 is not a valid integer". Split intoinvalidIntegerError(NaN) andoutOfRangeError(min/max); threadfieldNamethroughpositiveIntRuleso messages name the actual flag (--last,--max-body-size) rather than--value.getCookies --jsonwas a bare array atdata: [...]. Nowdata: { cookies: [...] }to match the rest of the contract.bdg cleanupspawned a daemon just to inspect~/.bdg. Added toLOCAL_ONLY_WHEN_NO_DAEMON(same spirit as test: add comprehensive contract tests for network collector and IPC server #14 for status/stop).bdg dom query "###"silently "no nodes found". Detect CDP's syntax rejection and exit 81 with a selector-syntax hint.Deferred
dom queryrace) — 3/3 rounds showed non-deterministic 0/1 counts forh1on MDN. Needs a deterministic repro and investigation before fixing; filed for a follow-up.--indexstale bounds. Small; next round.New exit code
Adds
EXIT_CODES.NAVIGATION_FAILED = 91to the stable registry (80-99 user-error range). Documented insrc/utils/exitCodes.ts+ the--help --jsonexit-code catalog.Test plan
npm test— 650 pass, 0 fail, 1 skippedbdg http://127.0.0.1:1/→ exit 91,net::ERR_UNSAFE_PORTbdg https://expired.badssl.com/→ exit 91,net::ERR_CERT_DATE_INVALIDbdg example.com→ exit 0, unchangedbdg dom eval "while(true){}"→ exit 102 after 10s; nextbdg dom eval "1+1"returns2immediatelybdg dom click ""/click -1/fill "" ""→ exit 81 with clean message (no JS source leak)bdg dom query "###"→ exit 81 with "Invalid CSS selector"bdg network list --filter method:POST --json→.data.requestscontains only POSTsbdg network getCookies --json→.data.cookies[]shapebdg statusafterkill -9 daemon→ singleError:prefixbdg details network BOGUS→ exit 83bdg cleanupon empty~/.bdg→ no[bdg] Starting daemon...line