feat(heal): Self-Healing toggles + vendored rfheal library by raffelino · Pull Request #41 · viadee/roboscope

raffelino · 2026-05-15T12:15:46Z

Summary

HEAL-1 — Per-step Self-Healing checkbox in the Flow Editor detail panel: visible only for the 13 Browser-library keywords that have a Heal * variant; hidden when no Browser/RoboScopeHeal import is present; rewrites the step keyword through the normal form path (unsaved-changes badge fires, no runtime mutation).
HEAL-2 — Suite-level Self-Healing toggle button in the RobotEditor toolbar: promotes/reverts every heal-able keyword in the file in one click, shows a toast with the count, keeps the Code-tab editor buffer in sync.
HEAL-VENDORED — Vendors robotframework-roboscopeheal into backend/vendor/ so make dev and offline release ZIPs work without a sibling repo or a PyPI publication. Offline build scripts updated; sync script added.
HEAL shared utility — frontend/src/utils/healToggle.ts (pure functions, fully tested): HEAL_VARIANTS map, getHealVariant, getBaseKeyword, classifiers, library-import add/remove, applyHealToForm.
RECORDER-VIS-1 — Recorder lifecycle SSE events + restart-browser control so users always know whether the recorder is starting / ready / crashed and can recover without losing captured commands.
Various recorder fixes: iframe ancestry in sidecar, proactive iframe inventory, verifier improvements, SelectorPicker shows composite selector, per-candidate Effektiv override.

Test plan

cd frontend && npx vitest run — 717 tests, 0 failures
cd backend && .venv/bin/pytest tests/test_vendored_rfheal_present.py tests/environments/test_vendored_heal_auto_install.py -v — 20 tests, 0 failures
Open a .robot file with Library Browser in the Flow Editor → select a Click step → confirm the Self-Healing checkbox is visible and toggles the keyword name
Open the same file in the Code/Visual tab → confirm the suite-level button appears in the toolbar and clicking it switches all keywords + shows a toast
cd backend && uv sync on a fresh clone (no sibling rfheal repo) → python -c "import RoboScopeHeal" succeeds

🤖 Generated with Claude Code

…an open ones - debug-1-dap-driver-foundation: in_progress → done (merged in v0.9.0) - 1-1-database-migration-for-phase-4-models: in-progress → done (Phase 4 shipped) - EE-1-konami-code-robot-parade: draft → planned (ready-for-dev) - sprint-status.yaml: new "Post-0.9.0 backlog" section tracking the Interactive Debugger epic (DEBUG-2/3 ready-for-dev, DEBUG-1 done), Launch UX polish (LAUNCH-1 ready-for-dev), and Easter Eggs (EE-1) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pure-frontend EE-1: pressing ↑↑↓↓←→←→BA outside any text-entry element sends an inline-SVG robot marching from left to right along the bottom of the screen. Listener mounts once at the App shell, respects prefers-reduced-motion, and never captures clicks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

LAUNCH-1: post-0.9.0 distribution UX polish addressing two pain points reported on Windows (and present on every standalone platform): JSON-formatted boot logs are hostile to humans, and the URL the user needs to open is buried in 30+ log lines.

- LOG_FORMAT=text flips main.py from pythonjsonlogger to a readable `LEVEL logger: message` form. JSON stays the default everywhere else (Docker, make dev, CI). Standalone start scripts now set LOG_FORMAT=text in env by default.
- Loud banner printed via stdout (NOT logger) after lifespan startup completes, showing http://localhost:<PORT>. Suppressed under PYTEST_CURRENT_TEST so test output stays clean.
- Optional OPEN_BROWSER=1 (.env knob, default OFF) calls webbrowser.open after the banner. Failures are swallowed so a headless host doesn't crash startup.
- Windows / non-utf8 PYTHONIOENCODING falls back to ASCII box drawing; Unicode `═` would mojibake on Windows cmd in legacy code-page modes.
- 21 new tests in tests/test_main.py covering formatter selection (incl. the AC4 default-JSON regression assertion), banner suppression under pytest, ASCII fallback, and OPEN_BROWSER truthy/falsy aliases + crash-resistance.
- scripts/dist-README.md documents the banner appearance, the LOG_FORMAT toggle, and the OPEN_BROWSER auto-open flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
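A minimal sketch of the LAUNCH-1 behaviour described above, not the actual main.py. The function names (`select_formatter`, `render_banner`, `announce`), the stdlib `JsonFormatter` stand-in for pythonjsonlogger, and the truthy aliases accepted for OPEN_BROWSER are assumptions made for illustration; only the environment variable names and the overall behaviour come from the commit message.

```python
import json
import logging
import sys
import webbrowser
from typing import Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}  # assumed alias set; the commit does not list it


class JsonFormatter(logging.Formatter):
    """Stdlib stand-in for the JSON formatter (the real app uses pythonjsonlogger)."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({"level": record.levelname, "name": record.name,
                           "message": record.getMessage()})


def select_formatter(env: Mapping[str, str]) -> logging.Formatter:
    """JSON unless LOG_FORMAT=text is set explicitly."""
    if env.get("LOG_FORMAT", "json").strip().lower() == "text":
        return logging.Formatter("%(levelname)s %(name)s: %(message)s")
    return JsonFormatter()


def render_banner(port: int, encoding: Optional[str], platform: str = sys.platform) -> str:
    """Box the URL; use ASCII on Windows or when the I/O encoding is not UTF-8."""
    url = f"http://localhost:{port}"
    codec = (encoding or "").split(":")[0].lower().replace("-", "").replace("_", "")
    use_ascii = platform.startswith("win") or codec != "utf8"
    bar = ("=" if use_ascii else "═") * (len(url) + 4)
    return f"{bar}\n  {url}\n{bar}"


def announce(port: int, env: Mapping[str, str], out=sys.stdout, opener=webbrowser.open) -> bool:
    """Print the banner (not via the logger) and optionally open the browser."""
    if "PYTEST_CURRENT_TEST" in env:
        return False  # keep test output clean
    encoding = env.get("PYTHONIOENCODING") or getattr(out, "encoding", None)
    print(render_banner(port, encoding), file=out)
    if env.get("OPEN_BROWSER", "").strip().lower() in TRUTHY:
        try:
            opener(f"http://localhost:{port}")
        except Exception:
            pass  # a headless host must not crash startup
    return True
```

Passing `env`, `out`, and `opener` in as parameters is a design choice of this sketch: it keeps every branch testable without touching the real environment or launching a browser.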

Story DEBUG-2 — RUNNER+ users now get a 🐞 Debug button next to the existing Retry on a failed run. Click → backend spawns a RobotCode debug-launch subprocess (DEBUG-1 foundation), parses the run's output.xml for the deepest failing keyword, sets a breakpoint there, and the user lands in DebugPanel.vue with live call-stack + variable scopes pushed via /ws/notifications. Backend - src/debug/router.py — `/api/v1/debug/sessions` (start, control, state, disconnect). RUNNER+ effective-role gate; API tokens stay role-capped (no team/project elevation). - src/debug/session_manager.py — in-process registry with 409 dedup on (user_id, run_id), per-session event forwarder, 5-min idle timeout, 30s `terminated` grace before subprocess kill. - src/debug/output_xml_walker.py — defusedxml walk for the first failing keyword's source+line; recurses into child suites and picks the deepest failure. - src/debug/state_fetcher.py — pulls stackTrace → scopes → variables (top frame only) for the post-`stopped` snapshot. - src/debug/schemas.py — Pydantic models for the API surface. - src/audit/event_types.py — DEBUG_SESSION_STARTED/_ENDED. - src/main.py — wires forwarder + state-fetcher in lifespan; best-effort stop_all() on shutdown. Frontend - src/stores/debug.store.ts — Pinia store, 8 tests covering start/state-replace/output-cap/topic-routing/control/stop. - src/components/debug/DebugPanel.vue — header + stack + scopes + output log; toolbar buttons disabled when not paused. - src/components/execution/RunDetailPanel.vue — Debug button gated on run.status === 'failed' && hasMinRole('runner'); inline DebugPanel renders below the heal report. - src/api/debug.api.ts — REST client + sendBeacon disconnect. - src/composables/useWebSocket.ts — dispatches `debug_event` to the store; topic-routed inside the store so events for other sessions are silently dropped. - i18n EN/DE/FR/ES under `debug.*` (btn, panel, error groups). 
Tests - 46 backend tests: walker + router (RBAC, audit, dedup, control, state, ownership). RobotDebugSession is mocked end-to-end via the manager's injectable factory — no real `robotcode`. - 8 frontend store tests + DebugPanel renders cleanly in tsc + prod build. Out of scope per story: conditional breakpoints, watch expressions, multi-test debug, optional E2E (RUNs Chromium). 5-min idle-timeout is heuristic — revisit after first user feedback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

DEBUG-3 — extends the DEBUG-2 backend + UI so a user can click any keyword/assignment/control/return node in the Flow Editor's step detail panel and press "Run up to here" to launch the test under robotcode debug-launch with a breakpoint on that line. Backend: - POST /api/v1/debug/sessions accepts a second body shape {file, test_name, line, repo_id} via a single discriminator-free Pydantic model (mutually exclusive with the existing {run_id}). - _validate_step_invocation walks the .robot to confirm the line is inside the named test (and NOT the test-case header — RF won't break there). Path-traversal guarded against repo root. - 409 dedup at file+line scope so a same-step click silently resumes; a different line in the same file produces TWO sessions (the frontend stops the first via a confirm-modal before issuing the second POST — the backend treats them as independent). - Audit DEBUG_SESSION_STARTED now carries `source: "run"|"flow_editor"`. Frontend: - RobotStep.\_lineNumber annotated by parseRobotToForm (1-based). Added to both RobotStep interfaces (RobotEditor.vue + flowConverter.ts) and propagated through cloneStep so the deep-clone path covered by FlowEditorStepIsolation.spec.ts keeps the metadata across the detail panel re-builds. - New "Run up to here" button in FlowEditor's step-detail panel, visible only for stepType in {keyword, assignment, return, if/else_if/else, for/while, try/except/finally} on a Test Case node when the user has RUNNER+ and a filePath/repoId pair are in scope. Hidden in the Keywords section (RF debug needs --test). - AC6 rapid-fire semantics in store.classifyStepClick: - 'idle': no session → start - 'same': resume silently (409 → adopt existing session) - 'different': render confirm-modal "stop and restart" - DebugPanel rendered as fixed-position overlay teleported to body while a session is active; Stop returns to the editor with the canvas state intact. 
- Dirty-buffer surfaces a save-prompt modal instead of debugging an out-of-sync file; user saves manually then re-clicks. i18n: - flowEditor.debug.* keys in EN/DE/FR/ES (label, title, error copy, two modal copies). Tour rotation grows from 30 to 31 tips (tip31 points at this affordance). Tests: - backend tests/debug/test_router.py — 11 new tests covering happy path, test-header-line rejection, unknown test, line-outside- test, missing file, path-traversal, mixed body shapes, empty body, RBAC, same-line dedup, different-line non-dedup. 19/19 green; full debug+audit suite 100/100 green. - frontend tests/stores/debug.store.spec.ts — 8 new tests covering startFromStep happy path + 409 silent-resume + non-409 rethrow, classifyStepClick four cases (idle, same, different, terminated), and a regression pin on cloneStep \_lineNumber preservation. 16/16 store tests green; full vitest 514/514 green. - vue-tsc clean; npm run build clean (4 locale chunks emit fine). Story spec: \_bmad-output/implementation-artifacts/debug-3-run-up-to-selection-action.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Re-run failing test in interactive DAP session. Builds on DEBUG-1's DAP driver foundation: adds POST /api/v1/debug/sessions, WebSocket debug:session:<uuid> topic, control endpoints, DebugPanel.vue, and DEBUG_SESSION_STARTED/ENDED audit codes. RBAC gated to RUNNER+. 409 dedup per (user, run). 5-min idle timeout + 30s grace on terminated. i18n in EN/DE/FR/ES. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Run up to selected step in Flow Editor with same DAP panel. Extends DEBUG-2's POST /debug/sessions to accept {file, test_name, line, repo_id} via discriminated Union; adds AC4 header-line guard and path-traversal validation. FlowEditor step-detail panel gains a '▶ Bis hier ausführen' button gated on saved-buffer + RUNNER+ + breakable step type. Same DebugPanel.vue reused as teleported overlay. AC6 multi-tab dedup: silent same-line resume via 409, confirm-modal restart on different-line. RobotStep gains optional _lineNumber via parser annotation (extends cloneStep, pinned by FlowEditorStepIsolation). i18n EN/DE/FR/ES + tipOfTheDay tip31. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Standalone-start UX polish for v0.9.1: LOG_FORMAT=text toggle flips main.py logging from JSON to readable for the bundled distribution start scripts (Docker / make dev / tests stay on JSON). Loud "open this URL" banner via print() (not logger) after FastAPI lifespan startup, with ASCII fallback on Windows or non-utf8 PYTHONIOENCODING. Optional OPEN_BROWSER=1 calls webbrowser.open. Build scripts default LOG_FORMAT=text in the distributed .env.example; dist-README.md documents the new toggles. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Konami code easter egg. Pressing ↑↑↓↓←→←→BA outside any text input sends a 4s SVG robot marching across the bottom of the viewport. Layout-independent (event.code), pointer-events: none so it never captures clicks, aria-hidden, fully disabled under prefers-reduced-motion. Single global listener at App.vue. No i18n, no docs, no analytics — discovery is the point. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Before DEBUG-2/3 spawn, the router checks for the `robotcode` binary in the project's venv. Missing → 424 Failed Dependency with detail {code, repo_id, env_id, package, message}. New endpoint POST /api/v1/debug/sessions/install-prerequisites runs `uv pip install robotcode` into that venv via async subprocess (300s timeout, log tail captured). Frontend dialog (DebugPrereqDialog) catches the 424 from both entry points (RunDetailPanel 🐞 + FlowEditor ▶ Bis hier ausführen), offers Install/Cancel; on Install the store retries the original start automatically. Audit code DEBUG_ROBOTCODE_INSTALLED. i18n EN/DE/FR/ES. Tests: 8 new prereq.py units + 7 router tests (424 paths + install endpoint happy/already-installed/failure/RBAC) + 5 frontend store tests (424 catch + retry-after-install + cancel). Pre-existing event-loop-pollution bug in test_router.py's autouse fixture fixed on the way (asyncio.run instead of get_event_loop on teardown). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

In-dialog RobotCode install on missing prereq. Replaces the generic 502 on first-time debug clicks with a 424 Failed Dependency surfaced as a modal: 'RobotCode is not installed in this project's environment. Install now? [Install] [Cancel]'. On Install the backend runs uv pip install robotcode into the project's venv (audited via DEBUG_ROBOTCODE_INSTALLED) and the frontend retries the original debug start. Both DEBUG-2's 🐞 button and DEBUG-3's ▶ Bis hier ausführen trigger the same flow. i18n complete in EN/DE/FR/ES. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… install) Per UX guidelines: secondary/cancel buttons use BaseButton variant=ghost, primary actions use BaseButton variant=primary with the built-in :loading spinner instead of bare HTML buttons + ad-hoc classes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…imeout

The bare 'robotcode did not announce a TCP port within 15.0 s' error gave users no clue what robotcode was actually doing during boot. Two complementary changes:

1. Capture stdout/stderr during the port-wait into a 200-line ring buffer; on timeout/exit include the last 20 lines in the DebugSessionStartFailed message so users (and future bug reports) can see the real failure mode.
2. Spawn the subprocess in a fresh session/process-group on POSIX (start_new_session=True). On cleanup, SIGKILL the entire pgid via os.killpg so any Robot Framework → Browser library → Playwright → Chromium grandchildren get reaped along with the parent. Without this, orphaned grandchildren routinely outlive a session and can block the next port-0 bind by holding shared state under ~/.cache/robotcode/. Closing VS Code 'fixed' the issue for users only because that incidentally tore down those zombies; this is the proper fix.

Also: bump the default port_parse_timeout from 15s to 30s (cold robotcode boots can take 20s+ on slow venvs after a fresh install) and expose ROBOSCOPE_DEBUG_PORT_TIMEOUT for operator override.

Two new tests pin the new error shape (boot-output included + returncode surfaced on early exit). 11/11 in tests/debug/test_robot_debug_session.py green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
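The two changes in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the project's session code: the helper names (`boot_failure_message`, `spawn_in_own_group`, `kill_process_tree`) are hypothetical, and the sketch uses synchronous `subprocess` rather than whatever async machinery the backend uses. The 200-line buffer, the 20-line tail, `start_new_session=True`, and `os.killpg` are taken from the commit message.

```python
import os
import signal
import subprocess
import sys
from collections import deque

BOOT_BUFFER_LINES = 200  # ring-buffer size named in the commit message
TAIL_LINES = 20          # number of lines included in the error message


def boot_failure_message(reason: str, ring: deque) -> str:
    """Build an error string that carries the last TAIL_LINES of boot output."""
    tail = list(ring)[-TAIL_LINES:]
    return f"{reason}. Last output:\n" + "\n".join(tail)


def spawn_in_own_group(argv: list) -> subprocess.Popen:
    """Start the child as leader of a fresh session/process group (POSIX only)."""
    return subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        start_new_session=True,
    )


def kill_process_tree(proc: subprocess.Popen) -> None:
    """SIGKILL the whole process group so grandchildren are reaped as well."""
    try:
        os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
    except ProcessLookupError:
        pass  # the group is already gone
    proc.wait()
    if proc.stdout is not None:
        proc.stdout.close()
```

A `deque(maxlen=200)` drops the oldest line automatically on append, which is what makes it usable as a ring buffer here without any manual trimming.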

The umbrella robotcode package alone gives us the CLI shell but NOT the debug-launch subcommand, which is registered as a click plugin by the robotcode-debugger PyPI package. Without the [debugger] extra, spawn fails at runtime with "No such command 'debug-launch'", which is exactly what the new diagnostic message just surfaced. Two changes:

1. Install package: robotcode → robotcode[debugger]. Pulls in the umbrella + the debugger plugin that registers the subcommand.
2. Prereq check: also verify <venv>/lib/python*/site-packages/robotcode/debugger/ exists. Catches the partial-install state (umbrella alone) BEFORE the spawn explodes, so the user gets the install dialog instead of a runtime error toast.

Tests updated accordingly: fixtures seed the debugger marker alongside the binary, plus a new unit test pinning the "binary-without-plugin → False" case (the actual user-visible failure mode that triggered this fix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
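As a rough illustration of the two-part prerequisite check, assuming a POSIX venv layout with the binary at `<venv>/bin/robotcode` (the commit does not spell out the binary path, and the function name is hypothetical):

```python
from pathlib import Path

INSTALL_SPEC = "robotcode[debugger]"  # package spec named in the commit message


def robotcode_debugger_ready(venv: Path) -> bool:
    """True only when both the CLI binary and the debugger plugin are present."""
    binary = venv / "bin" / "robotcode"  # assumed POSIX venv layout
    if not binary.is_file():
        return False
    # The plugin directory is the marker that the [debugger] extra was installed.
    pattern = "lib/python*/site-packages/robotcode/debugger"
    return any(p.is_dir() for p in venv.glob(pattern))
```

The "binary-without-plugin" state returns False here, so a caller would show the install dialog rather than attempt the spawn.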

Production failure: clicking Debug surfaced "robotcode exited with code 2 during boot. Last output: Error: No such option: -w." followed by the test silently never stopping at the breakpoint. The Story DEBUG-1 foundation was wired against an older robotcode CLI that has since dropped flags and changed protocol order. Three fixes: 1. **CLI argv:** modern `robotcode debug-launch` rejects `-w` and the trailing `<robot_path>` positional. Spawn is now just `robotcode debug-launch --tcp 127.0.0.1:<port>`. Port is pre-allocated by us (the launcher rejects port 0 / "1<=x<=65535"). 2. **Connect strategy:** robotcode prints nothing when it's listening, so the regex-on-stdout port parser never fired. Replaced with a poll-connect loop on the pre-allocated port. A background stdout-pump task drains output continuously into a 200-line ring buffer (a) so the pipe doesn't fill once Log keywords run, and (b) so the diagnostic error includes the real robotcode output. 3. **Handshake order:** DAP spec requires the client wait for the `initialized` event before sending `setBreakpoints`. We were firing `setBreakpoints` immediately after the initialize response, which modern RobotCode rejects with `Unknown Command 'setBreakpoints'`. New sequence: - `initialize` (await response) - fire-and-don't-await `launch` (servers commonly defer the launch response until after configurationDone — awaiting deadlocks) - wait for `initialized` event - `setBreakpoints` per file - `configurationDone` The launch payload also now includes `python`, `cwd`, `target`, and `console: "internalConsole"` — without those the launcher either dispatches `runInTerminal` to us (we don't handle it) or fails to spawn the child runtime. **Integration test:** new `@pytest.mark.integration` class `TestRealRobotCodeSpawn` exercises the entire pipeline against the user's installed robotcode in ~/.roboscope/venvs/roboscope-default. Catches breaking changes in the robotcode CLI surface BEFORE they hit users. 
- `test_real_spawn_handshake_and_test_runs` — passes today, asserts the RF banner appears on stdout (proves spawn + handshake worked). - `test_real_breakpoint_pauses_execution` — `xfail(strict=True)`. The remaining bug: even though the test runs end-to-end, the `stopped` event never arrives. Tracked as a separate launcher → child proxy layer issue; the strict-xfail will alert when fixed. **Process-tree cleanup** retained from earlier: cleanup cancels both the new stdout-pump task and the launch future before SIGKILL'ing the process group, so no orphans survive between sessions. Plus drive-by ruff cleanups in DEBUG-1's dap_client.py (typing.Callable → collections.abc.Callable, asyncio.TimeoutError → TimeoutError, contextlib.suppress in stop()). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
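The handshake order from fix 3 can be demonstrated against a toy server. Everything in this sketch is a stand-in: `FakeDapServer` is not RobotCode and `handshake` is not the project's client; the launch payload values are placeholders. The point is only the ordering: `launch` is started but not awaited, because a server that defers the launch response until `configurationDone` would otherwise deadlock the client.

```python
import asyncio
from typing import Dict, List, Optional


class FakeDapServer:
    """Toy stand-in for a DAP server; records the order in which requests arrive."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.initialized = asyncio.Event()
        self._configured = asyncio.Event()

    async def request(self, command: str, arguments: Optional[dict] = None) -> dict:
        self.log.append(command)
        if command == "launch":
            # Emit `initialized`, then hold the launch response back until
            # configurationDone has been received.
            self.initialized.set()
            await self._configured.wait()
        elif command == "setBreakpoints" and not self.initialized.is_set():
            raise RuntimeError("Unknown Command 'setBreakpoints'")
        elif command == "configurationDone":
            self._configured.set()
        return {"success": True, "command": command}


async def handshake(server: FakeDapServer, breakpoints: Dict[str, List[int]],
                    launch_args: dict) -> None:
    await server.request("initialize", {"clientID": "sketch"})
    # Fire launch without awaiting it; awaiting here would deadlock.
    launch = asyncio.create_task(server.request("launch", launch_args))
    await server.initialized.wait()
    for path, lines in breakpoints.items():
        await server.request("setBreakpoints", {
            "source": {"path": path},
            "breakpoints": [{"line": n} for n in lines],
        })
    await server.request("configurationDone")
    await launch  # resolves only now that configuration is done


async def main() -> List[str]:
    server = FakeDapServer()
    args = {"python": "python", "cwd": ".", "target": "suite.robot",
            "console": "internalConsole"}  # placeholder values
    await handshake(server, {"suite.robot": [3]}, args)
    return server.log
```

Running `asyncio.run(main())` returns the request order the toy server observed.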

remaining stop-event proxy bug Layer-isolation diagnostic via a new `pause`-based integration test: both breakpoint AND pause flows xfail with the same symptom (test runs end-to-end, no `stopped` event), proving the remaining bug is NOT in breakpoint path resolution but in the launcher → us event- forwarding chain for the `StoppedEvent` family. Other events (`output`, `initialized`) traverse the same proxy and arrive fine. Changes: - `_launch_args` now includes `outputMessages: True` and `outputLog: True` so RF execution messages stream over the DAP `output` channel as the test runs (useful for the run-detail panel even before DEBUG-5 lands). - New `test_real_pause_request_pauses_execution` integration test marked `xfail(strict=True)` — pinned alongside the breakpoint test. When BOTH turn green at once, the fix is the same root cause; if only one does, the spec's hypothesis was wrong. - Story doc `debug-5-breakpoint-resolution.md` captures the full investigation: what's been verified working (spawn + handshake), what's been ruled out (path resolution — confirmed by the pause test), and concrete next-step instructions for whoever picks up the proxy debugging (patch venv's `debugger.py::send_event` with a print to see if `stopped` is being sent at all). Sprint status: DEBUG-5 → ready-for-dev (pinned by xfail tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…w (DEBUG-5) Root cause: RobotCode's listener emits robot* events (robotStarted, robotEnqueued, robotEnded, robotLog, etc.) whose bodies inherit from SyncedEventBody. Each such event makes the in-process Debugger synchronously block the listener thread for up to 15 s on `self.sync_event.wait(15)`, waiting for a `robot/sync` RPC request back from us. We never sent one. The very first synced event (`robotEnqueued` from ListenerV3.start_suite) tied up the listener so RF never reached start_test → start_keyword → process_start_state, which is where breakpoint matching lives. So breakpoints AND pause both silently failed — `output` and `initialized` got through because their bodies don't mix in SyncedEventBody. Fix: register a handler for every robot* event family that fires a fire-and-forget `robot/sync` request on the DAP client. That sets the gating Event in the child, the listener thread returns, RF proceeds normally, breakpoints fire as expected. Diagnosis: instrumented the user's venv with print()-to-file tracing (server.py, debugger.py, listeners.py, launcher/client.py), ran the pause integration test, watched the trace stop dead between "V2.start_suite ENTER" and the next listener line. Read on_debugger_send_event source → spotted the synced wait. Reverted all venv instrumentation; this commit is clean. Tests: - 3/3 integration tests now pass against the user's installed robotcode in ~/.roboscope/venvs/roboscope-default: - test_real_spawn_handshake_and_test_runs (was passing) - test_real_breakpoint_pauses_execution (was strict-xfail) - test_real_pause_request_pauses_execution (was strict-xfail; uses Sleep 3s so pause races RF mid-execution before termination) - 71/71 unit tests in tests/debug/ green; no regressions. 
Story spec at _bmad-output/.../debug-5-breakpoint-resolution.md records the full diagnosis path + the actual fix shape (NOT path resolution as initially hypothesised — the launch payload, handshake, and breakpoint paths were all correct; the missing piece was honoring RobotCode's robot/sync ack contract). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
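The blocking behaviour and the ack that releases it can be reproduced in miniature. This is a toy model under stated assumptions: `ToyDebugger` and `ToyClient` are hypothetical names and do not mirror RobotCode's classes; only the idea of a listener waiting on an event with a timeout until a `robot/sync` request arrives is taken from the diagnosis above.

```python
import threading


class ToyDebugger:
    """Child side: a synced event blocks until the client acks with robot/sync."""

    def __init__(self, send_event, sync_timeout: float = 15.0) -> None:
        self._send_event = send_event
        self._sync = threading.Event()
        self._timeout = sync_timeout

    def emit_synced(self, name: str) -> bool:
        """Send the event, then wait for the ack; False means the wait timed out."""
        self._sync.clear()
        self._send_event(name)
        return self._sync.wait(self._timeout)

    def handle_request(self, command: str) -> None:
        if command == "robot/sync":
            self._sync.set()


class ToyClient:
    """Client side: acks every robot* event when `ack` is enabled."""

    def __init__(self, ack: bool) -> None:
        self.ack = ack
        self.debugger = None

    def on_event(self, name: str) -> None:
        if self.ack and name.startswith("robot"):
            self.debugger.handle_request("robot/sync")


def run_once(ack: bool, timeout: float) -> bool:
    client = ToyClient(ack)
    client.debugger = ToyDebugger(client.on_event, sync_timeout=timeout)
    return client.debugger.emit_synced("robotEnqueued")
```

With `ack=False` every synced event costs the full timeout, which matches the symptom of the listener never reaching the point where breakpoints are matched.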

Three layers of tests, each catching a different failure mode:

1. **Real-DAP control tests** (`test_robot_debug_session.py::TestRealControlButtons`): five `@pytest.mark.integration` tests that drive `RobotDebugSession.continue_/next_/step_in/step_out/disconnect` against the user's installed robotcode. Asserts the matching `stopped`/`terminated` event arrives with the correct `reason` (breakpoint/step/etc.).
2. **Real-router HTTP tests** (`test_router_integration.py::TestRealRouterControls`): five `@pytest.mark.integration` tests that drive `POST /api/v1/debug/sessions/{id}/{cmd}` against a real RobotDebugSession (factory swap reverts the unit-test fake-session injection). Polls `GET /sessions/{id}/state` after each control to confirm the cache advances. Mirrors what the frontend buttons trigger end-to-end.
3. **Frontend component tests** (`DebugPanel.spec.ts`): 15 Vitest tests that mount the actual panel against a fake API and verify:
   - Continue / Step Over / Step In / Step Out gate on `paused & !terminated` (Stop is always enabled, because the user must always be able to abort).
   - Each button click → correct `debug.store` action → correct `postControl` arg.
   - Stop emits `closed` even when disconnect throws (the panel must close even if backend is unreachable).
   - State events update `paused_at` line in the header.
   - Terminated events surface the badge.
   - Output events append to the live log.
   - Cross-session events (different `topic`) are ignored.

The user-reported "I can't cleanly step / abort / continue via the UI" was rooted in DEBUG-5 (missing robot/sync ack at the DAP layer), not in the buttons themselves. With that fix, the buttons NOW work; these tests are the regression watchdogs that catch any future drift in the chain.
Test totals (before this commit): - backend tests/debug/ unit: 71 ✓ - backend integration (DAP-direct controls): 5 ✓ (covered by 8 in TestRealRobotCodeSpawn + TestRealControlButtons) - backend integration (HTTP router): 5 ✓ new - frontend full vitest: 543 ✓ (was 528, +15 from DebugPanel.spec) Drive-by lint cleanups (E501 wrap on two helper signatures). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Bug: ``step._lineNumber`` is stamped once when ``parseRobotToForm`` reads the file. The moment the user inserts a new keyword above the selected step, every step below shifts in the source file but its ``_lineNumber`` doesn't. Clicking the "Bis hier ausführen" button then sent the backend a stale line — the test ran past the keyword the user pointed at, or stopped at the wrong line entirely. Fix: new ``computeStepLine(form, isResource, tcIdx, stepIdx)`` in ``flowConverter.ts`` mirrors ``serializeFormToRobot`` line-for-line and returns the LIVE source line. ``FlowEditor.vue::_stepDebugPayload`` now calls it on every click instead of reading the stale field. The function tracks every emitter branch the serializer cares about: * preamble lines (leading comments) * Settings section (incl. multi-line ``[Documentation]`` continuations via ``...`` continuation lines) * Variables section * Test Cases section header + per-testcase metadata (``[Documentation]`` multi-line, ``[Tags]``, ``[Setup]``, ``[Teardown]``, ``[Timeout]``, ``[Template]``) and the trailing blank between test cases 13 unit tests in ``FlowEditorComputeStepLine.spec.ts`` pin every emit branch, plus the primary regression: insert/remove a step and the surviving step's reported line shifts correctly. The legacy ``step._lineNumber`` field stays — still set at parse time for diagnostics, no longer load-bearing for the debug button. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds 21 new tests covering corner cases the user asked about ("steppe einzeln durch und probiere jede Funktion auch innerhalb von komplexeren Vorgängen"). All 21 pass against real robotcode + real RobotDebugSession; no production fixes needed — the previous DEBUG-5 robot/sync ack already addressed every blocker. Backend real-DAP scenarios (TestComplexDebugScenarios, 8 tests): - sequential step-through (pause → next × 2 → continue → terminated) with reason=step asserted between each step. Pins the exact UI flow the user reported as "doesn't feel clean". - breakpoint inside a FOR loop fires per iteration (3 stops for 3). - step into a user-defined keyword descends, step out returns to the caller scope. - pause during a long-running keyword (Sleep 5s) produces a stopped with reason=pause; continue lets it finish. - multiple breakpoints fire in source order (line 3 then line 5). - disconnect while RF is mid-execution (no breakpoint, in a Sleep) terminates cleanly with no orphaned subprocess. - paused_at line advances after each next — this is what the run-detail panel header renders. - control calls after terminated are benign no-ops, not crashes. Backend HTTP-router scenarios (TestComplexRouterScenarios, 3 tests): - full walk-through via HTTP: start → pause → next × 2 → continue → terminated, with state-cache line assertions per step. - 409-dedup response carries the existing session_id so the frontend can silently re-attach. - control hits AFTER disconnect return 404 (not crash) — guards against stale browser tabs racing the reap. Frontend store scenarios (10 tests): - terminated event arriving DURING a pending control resolves cleanly — the WS update wins, postControl doesn't reset state. - multiple state events use last-write-wins. - output buffer capped at 300 (verified by pumping 350 events). - events for a different session_id are silently ignored. - events arriving after reset() are ignored. 
- rapid sequential controls all dispatch (no implicit dedup — the disabled-button contract owns that). - control without an active session is a silent no-op. - stop after reset is a no-op (no double-disconnect). - sessionId/isActive lifecycle: null → active → null after stop. - terminated event flips isActive=false but keeps sessionId set so the badge renders. Test totals after this commit: - backend tests/debug/ default: 71 passed - backend tests/debug/ -m integration: 19 passed (11 + 8) - frontend full vitest: 566 passed (+10 new store edge cases) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the longest-standing rough edge in the Self-Healing opt-in contract: today the user has to leave the visual editor, open the Code tab, manually rename `Click` → `Heal Click` for each step, and also remember to add `Library RoboScopeHeal` to the Settings section. Two new toggles let the same opt-in happen in-place, preserving every existing safety invariant. HEAL-1 — Flow Editor detail panel checkbox: - Visible only on `keyword` / `assignment` steps whose keyword is one of the 13 supported names (bare or already `Heal *`). - Toggling rewrites THAT single step's `keyword` field via the same `onStepFieldChange` → `updateStepFromNode` → `rebuildAndReselect` path the rest of the panel uses. The unsaved-changes badge fires; no runtime mutation. - Library row is added/removed via the existing settings array (idempotent helpers — duplicate adds and accidental removals of user-configured rows are impossible). HEAL-2 — RobotEditor toolbar toggle: - Single button next to the `.robot` badge, label reflects state (`Self-Healing: On / Off`). Hidden when zero heal-able keywords exist (a Log-only file). - One click rewrites every heal-able step across all test cases AND user keywords, plus adds/removes the bare `Library RoboScopeHeal` row. Toast confirms the count. - Form mutation uses direct property assignment on the reactive (settings/testCases/keywords); the existing deep watchers emit `update:content` to the parent, marking the file unsaved. Shared utility — `frontend/src/utils/healToggle.ts`: - `HEAL_VARIANTS` map derived from `backend/src/recording/heal/ library.py`. Adding a new `@keyword("Heal …")` there means one line here. - `applyHealToForm(form, mode)` walks the parsed form (NOT raw text), so `Run Keyword Click selector` is structurally safe: `Click` is an argument, not a step keyword, and never gets rewritten. - Preserves array identity for unchanged sub-trees so Vue reactivity only flags the parts that actually changed. 
Design invariants honoured (per CLAUDE.md "SH-2 opt-in contract"): 1. Explicit per-keyword opt-in — no Browser-library monkey-patch. 2. Source rewrite, never runtime mutation — every Heal swap is a one-line `.robot` diff the user sees in git. 3. Custom-configured `Library RoboScopeHeal <args>` rows are preserved across both directions; bare auto-added rows are removed when the last Heal* keyword leaves the file. 4. User-defined `Heal Login` / `Heal Foo` keywords (anything not in HEAL_VARIANTS) are NEVER touched — disable doesn't rename them to bare and enable doesn't try to promote them. Tests: - 42 unit tests in `healToggle.spec.ts` cover: the 13-keyword map (pinned set + Object.freeze), trim/case-sensitivity of lookups, library-row add/remove idempotence (incl. preserve- configured), `applyHealToForm` enable/disable across test cases + user keywords, the keyword-as-argument edge case, immutability of input forms, array-identity preservation. - Frontend vitest: 608/608 (+42 new). vue-tsc + production build clean. Stories: - `_bmad-output/implementation-artifacts/heal-1-flow-editor-per-step-toggle.md` - `_bmad-output/implementation-artifacts/heal-2-explorer-suite-level-toggle.md` Out of scope: repo-wide bulk toggle, library-arg config UI, custom-Heal-keyword registry. The `no-heal` Robot tag remains the per-test runtime escape hatch layered on top of these source choices. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
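The structural rewrite and invariants 3 and 4 can be illustrated with a small model. Note the assumptions: the real utility is TypeScript (`healToggle.ts`), while this sketch is written in Python purely as pseudocode-that-runs; the form shape (`settings`, `testCases`, `keywords`, `steps`), the function name `apply_heal`, and the one-entry map are hypothetical (only the `Click` → `Heal Click` pair appears in the text; the real map has 13 entries). It also deep-copies the form, so it does not reproduce the array-identity preservation described above.

```python
import copy

# Only this pair is taken from the commit message; the real map has 13 entries.
HEAL_VARIANTS = {"Click": "Heal Click"}
BASE_KEYWORDS = {heal: base for base, heal in HEAL_VARIANTS.items()}
HEAL_LIBRARY = "RoboScopeHeal"


def _rewrite_steps(steps, mapping):
    """Rename step keywords found in `mapping`; arguments are never inspected."""
    count = 0
    for step in steps:
        name = step["keyword"].strip()
        if name in mapping:
            step["keyword"] = mapping[name]
            count += 1
    return count


def apply_heal(form, enable):
    """Return (new_form, changed_count); the input form is not mutated."""
    new_form = copy.deepcopy(form)
    mapping = HEAL_VARIANTS if enable else BASE_KEYWORDS
    changed = 0
    for block in new_form["testCases"] + new_form["keywords"]:
        changed += _rewrite_steps(block["steps"], mapping)
    rows = new_form["settings"]
    if enable:
        # Idempotent add: never create a second RoboScopeHeal row.
        if not any(r["name"] == HEAL_LIBRARY for r in rows):
            rows.append({"name": HEAL_LIBRARY, "args": []})
    else:
        # Remove only the bare auto-added row; rows configured with args stay.
        new_form["settings"] = [
            r for r in rows if not (r["name"] == HEAL_LIBRARY and not r["args"])
        ]
    return new_form, changed
```

Because the walk looks only at each step's `keyword` field, a step such as `Run Keyword` with `Click` as its first argument is left alone, and a user-defined `Heal Login` is untouched in both directions since it is not in either map.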

Closes the silent-recorder gap reported in user testing: after clicking *Start Recording*, the Live view sat on a `Connecting…` badge for the entire Chromium boot (sometimes seconds, sometimes forever) with no feedback about what was happening, and no recovery affordance when the spawn failed silently (missing $DISPLAY on a Linux server, Playwright wheel not initialised, blocked port).

Backend

The in-process command FIFO (`v2_command_queue`) now carries a heterogeneous `RecordedCommand | LifecycleEvent` stream. A new `LifecycleEvent` carries one of four phases plus a wall-clock timestamp captured at enqueue, and an optional human-readable message for the crash variants. `iterate_events()` yields both types in insertion order; `iterate_commands()` is preserved as a filter-only wrapper for any W.2-era caller that doesn't care about lifecycle.

`v2_recorder_task` emits at four well-defined boundaries:
- `browser_starting`: immediately before `pw.chromium.launch(...)`.
- `browser_ready`: after `context.new_page()` + the optional initial `goto`, i.e., the point the user can click and see events arrive.
- `browser_crashed` (in-loop): from `_on_disconnect` when the browser disconnects without a user-initiated stop. The crash message names the disconnect channel ("browser disconnected unexpectedly"); for outer wrapper crashes we surface the exception string instead so a $DISPLAY problem reaches the user.
- `browser_restarting`: emitted by the HTTP endpoint just before it signals the current task down. The fresh task then emits its own `browser_starting` → `browser_ready`.

The wrapper `run_v2_recorder_session` moved its `_mark_status(COMPLETED)` call OUT of the inner `_recorder_loop` and into its own `finally` block. That lets it discriminate three exit paths:
1. clean stop → mark COMPLETED + finalize_session + tear_down.
2. crash (exception bubbled out) → push `browser_crashed` lifecycle, mark FAILED, finalize_session + tear_down.
3. stop-for-restart (`_restart_pending` set) → SKIP all three; the new task reuses the same queue + DB row.

The new endpoint `POST /recordings/sessions/{id}/restart-browser` ties it together. 404 / 403 / 409 / 501 enforced (status not RECORDING returns 409, recorder-disabled env returns 501, owner check 403, missing session 404). On the happy path it pushes a `browser_restarting` lifecycle event onto the SSE channel so the pill flips immediately, signals `_restart_pending` + stop, polls up to 5s for the wrapper to vacate `_stop_signals`, then dispatches a fresh `run_v2_recorder_session` with the same target_url stashed on the session row. Two recovery branches: when no task is in `_stop_signals` (process restart leftover) we dispatch a fresh task directly; when `signal_restart_v2` races to False between checks we fall through to the same recovery path.

SSE generator updated to multiplex: `event: command` for `RecordedCommand`, `event: lifecycle` with `{ phase, ts, message }` JSON for `LifecycleEvent`, the existing `event: end` sentinel unchanged.

Frontend

`RecordingLiveView.vue` replaces the 4-state `streamState` (`connecting | live | done | error`) with a richer `phase` enum driven by the backend lifecycle events: `connecting → browser_starting → browser_ready → (browser_restarting → ...) → done | error | browser_crashed`. A new EventSource listener for `lifecycle` events routes payloads through the same `_transitionTo()` state machine the SSE transport events use.

Two new template areas:
- A *phase card* next to the heading carrying the pill, a live `mm:ss` uptime label that ticks each second once `browser_ready` fires (reset on `browser_restarting`), and a "Restart browser" button enabled in `browser_ready` / `browser_crashed` and disabled during the transient phases (where the backend would 409 anyway).
- A red crash banner with the backend's error message under `browser_crashed`.
A `command`-first fallback flips `connecting / browser_starting` to `browser_ready` if a command somehow arrives before a lifecycle event (late attach, restart-mid-stream). It deliberately does NOT promote `browser_crashed → browser_ready` — a stray late command from a buffered binding is recorded but doesn't lie about the phase. API client `restartV2Browser(sessionId)` added in `recording-v2.api.ts`. i18n complete in EN/DE/FR/ES under `recorder.live.lifecycle.*` + `restartBrowser` + `crashTitle`. Tests Backend (14 new in `test_v2_recorder_vis.py`): - Queue: `enqueue_lifecycle` returns False without `register`, heterogeneous `iterate_events` preserves insertion order across commands + lifecycle, `iterate_commands` filter-only backcompat, `LifecycleEvent.ts` defaults to wall-clock at construction. - Wrapper: a raise inside the inner loop pushes `browser_crashed` onto the queue and tears down (queue gone from the registry). - `signal_restart_v2` returns False without an active task, returns True + sets the event + marks `_restart_pending` when a task is registered. - Restart endpoint: 404 / 403 / 409 / 501 paths, two happy-path branches (dispatch when no task active, signal-and-dispatch when active). Both happy paths assert `dispatch_task` was invoked with `(run_v2_recorder_session, session_id, target_url)`. - SSE multiplex: a producer-thread that interleaves lifecycle/command/lifecycle/command and finalises drains through the streaming response, the response body contains exactly 2× `event: lifecycle\n` and 2× `event: command\n` followed by the `event: end` sentinel. Existing `TestHappyPathDoesNotMarkFailed` updated for the wrapper's new ownership of `_mark_status(COMPLETED)` — the test's intent (verify FAILED branch doesn't fire on a clean exit) is preserved by asserting NO FAILED entries among the observed calls instead of asserting zero calls overall. 
Frontend (17 new in `RecordingLiveView.lifecycle.spec.ts`): mirrors the state machine inline (SFC setup script isn't importable) and exercises every transition entry point: each of the four lifecycle phases, full restart cycle resets uptime timestamp, crash clears the message slot when ready arrives again, the command-first fallback for `connecting` and `browser_starting`, the no-regression assertion for already-ready / crashed / restarting, plus the `mm:ss` formatter edge cases (null → null, pad zeros, clamp negative deltas).

Frontend totals: 625/625 (was 608, +17). Backend `tests/recording/`: all four files green. `vue-tsc --noEmit` clean, `npm run build` clean.

Out of scope (deferred to follow-ups):
- Showing the Chromium PID: Playwright Python's stable API does not expose it; uptime + phase carry the user-visible signal.
- Auto-restart on crash: restart stays user-initiated so flaky-spawn root causes aren't masked.
- A repo-wide stuck-recordings sweep: the existing launcher reset button covers that.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
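The heterogeneous queue and the SSE multiplexing described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: the two-field `RecordedCommand`, the `to_sse` and `render_stream` helpers, and the empty JSON body on the `end` sentinel are all assumptions made for the example.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Union


@dataclass
class RecordedCommand:
    # Hypothetical minimal shape; the real model carries many more fields.
    keyword: str
    selector: str


@dataclass
class LifecycleEvent:
    phase: str  # browser_starting | browser_ready | browser_crashed | browser_restarting
    message: Optional[str] = None
    ts: float = field(default_factory=time.time)  # wall-clock at construction


Event = Union[RecordedCommand, LifecycleEvent]


def iterate_commands(events: List[Event]) -> Iterator[RecordedCommand]:
    """Filter-only view for callers that do not care about lifecycle events."""
    return (e for e in events if isinstance(e, RecordedCommand))


def to_sse(event: Event) -> str:
    """Frame one queue item as a Server-Sent Events message."""
    if isinstance(event, LifecycleEvent):
        payload = {"phase": event.phase, "ts": event.ts, "message": event.message}
        return f"event: lifecycle\ndata: {json.dumps(payload)}\n\n"
    payload = {"keyword": event.keyword, "selector": event.selector}
    return f"event: command\ndata: {json.dumps(payload)}\n\n"


def render_stream(events: List[Event]) -> str:
    """Emit every item in insertion order, then the end sentinel (body assumed)."""
    return "".join(to_sse(e) for e in events) + "event: end\ndata: {}\n\n"
```

Keeping the type dispatch in a single framing function means the insertion order of the queue is also the order on the wire, which is the property the SSE multiplex test asserts.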

…ontent

The HEAL-2 suite-level toggle mutates the reactive form and relies on the deep watchers on `form.settings/testCases/keywords` to emit `update:content` for the parent. Those watchers are guarded by `inFormEditingTab` (true on `visual` / `flow`, FALSE on `code`) so free typing in CodeMirror isn't echoed back during a code edit.

That guard turned the toolbar toggle into a no-op on the Code tab: the form was rewritten, but CodeMirror still showed the old text, the parent never saw the unsaved-content event, and the next switch to visual/flow would call `parseRobotToForm(internalCode)`, which overwrites the rewritten form with the stale code. The toggle's effect was lost.

Fix: in `onHealSuiteToggle`, when `activeTab === 'code'` we explicitly:
- serialize the form back to source (`serializeFormToRobot()`),
- update `internalCode.value` so a later tab switch sees the new code,
- dispatch a CodeMirror change so the user actually sees the keyword rename happen in the visible buffer,
- emit `update:content` so the parent marks the file dirty.

Visual + Flow paths are unchanged. They already work correctly: the deep watchers fire, emit `update:content`, the parent feeds the new content back via `props.content`, and the watcher on `props.content` calls `parseRobotToForm(newContent)` so the form is reloaded from the freshly serialized source. The Visual `v-for` and Flow's `robotFormToFlow` computed re-render against the new form references right away: keyword inputs already show `Heal Click` after the toggle, no extra fix needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

A step keyword named `Click` in a `.robot` file does NOT necessarily mean Browser-library's Click — it could just as easily be a custom user keyword the file defines under `*** Keywords ***` with the same name. Until now, both the HEAL-1 per-step checkbox and the HEAL-2 suite toolbar button surfaced any time those heal-able names appeared in a step. Toggling on such a file would rename the user's own keyword to `Heal Click`, leaving it unresolvable at runtime — breaking the test rather than healing it. Add a real library-import gate on top of the heal-able-keywords check: the toggle is now visible only when the file actually imports `Library Browser` (or one of the pip-name variants: `robotframework-browser`, `robotframework_browser`, `robotframework-browser-batteries`, `robotframework_browser_batteries`) OR has already opted in by importing `Library RoboScopeHeal`. The matcher is case-insensitive and tolerates the `Library Browser auto_closing_level=KEEP` form (args don't affect detection). New helpers in `frontend/src/utils/healToggle.ts`: - `hasBrowserLibraryImport(form)` — true when any settings row matches one of the canonical Browser names (regex matches the five spellings above; `key === 'Library'` enforced so a `Documentation` row that mentions "Browser" doesn't trip it). - `hasRoboScopeHealImport(form)` — true when any settings row imports `RoboScopeHeal` (bare or with config args). Once the user has explicitly opted into the heal contract the toggle stays available even without an explicit Browser library row. Plumbed into both: - `RobotEditor.vue::healSuiteState` returns `'hidden'` when neither import is present. - `FlowEditor.vue::selectedStepHealMode` returns `'hidden'` under the same condition. Means a clicked `Click` step in a file that doesn't import Browser will not even reveal the checkbox. 
Tests: 11 new unit tests in `healToggle.spec.ts` covering the canonical name, the four pip-name variants, case-insensitivity, the args-aware match, the no-Library-row negative path, the documentation-mentions-Browser negative path, the empty form. Plus the RoboScopeHeal counterpart with the bare + configured-args paths. Frontend totals: 636/636 (was 625, +11). Stories: both HEAL-1 and HEAL-2 specs updated with the new edge-case row. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
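The real helpers live in TypeScript (`healToggle.ts`); the Python sketch below only illustrates the matching rule described in this commit. The settings-row shape (`key` / `value` dictionaries), the function names, and the exact-case comparison for `RoboScopeHeal` are assumptions for the example.

```python
# The five Browser-library spellings named in the commit, lower-cased.
BROWSER_LIBRARY_NAMES = frozenset({
    "browser",
    "robotframework-browser",
    "robotframework_browser",
    "robotframework-browser-batteries",
    "robotframework_browser_batteries",
})


def library_name(row: dict) -> str:
    """Return the imported library name, or '' for non-Library rows."""
    if row.get("key") != "Library":
        return ""
    # Arguments follow the name, so only the first token matters.
    parts = row.get("value", "").split()
    return parts[0] if parts else ""


def has_browser_library_import(settings: list) -> bool:
    """Case-insensitive match against the canonical Browser names."""
    return any(library_name(r).lower() in BROWSER_LIBRARY_NAMES for r in settings)


def has_roboscopeheal_import(settings: list) -> bool:
    """True for a bare or configured RoboScopeHeal import (exact case assumed)."""
    return any(library_name(r) == "RoboScopeHeal" for r in settings)


def heal_toggle_visible(settings: list) -> bool:
    """The toggle shows only when either import is present."""
    return has_browser_library_import(settings) or has_roboscopeheal_import(settings)
```

Checking the row key before looking at the value is what keeps a `Documentation` row that merely mentions "Browser" from revealing the toggle.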

… dict, not an object

User-reported: a recording against heise.de's cookie banner saved `Click iframe[src*="cmp.heise.de"] >>> text="Zustimmen"` as the first selector (the active candidate at index 0) even though `text="Zustimmen"` matched THREE elements in the iframe (two buttons plus a paragraph). Run 72 reproduced strict-mode failure at replay: "strict mode violation: locator(...) resolved to 3 elements".

The Story S.3 selector verifier is supposed to catch exactly this: during the live recording it runs each candidate through Playwright's `.locator(value).evaluate_all(...)` and flips `verified_unique=True` on the ones that resolve to a single visible+actionable element. Single-actionable candidates sort to the front; multi-match ones are either disambiguated to `>> nth=0` or kept with a heavy penalty.

The sidecar told the story: every recorded candidate had `verified_unique: false` and no MatchInfo. The verifier wasn't running.

Root cause

Playwright's `BrowserContext.expose_binding` invokes its callback with `source = dict(context=ctx, page=page, frame=frame)`, a plain `dict` (see `playwright/_impl/_page.py:1539`). Our `on_capture` extracted the frame via attribute access:

    frame = getattr(source, "frame", None) or getattr(source, "page", None)

`getattr` on a `dict` always returns the default (dicts have keys, not attributes), so `frame` was ALWAYS None. The downstream `_verify_command_candidates` early-returns on `frame_or_page is None`, leaving the unsorted, unverified candidate list intact.

The existing unit tests for the verifier wire-up pass `_FakePage` directly as the second argument, bypassing the source-extraction path entirely. So this regression has been live since the wire-up landed (`d339584 fix(recorder): wire verify_candidates into the v2 capture handler`) without a single test catching it.

Fix

New helper `_resolve_frame_target(source)` does the right thing:
- `isinstance(source, dict)` → use `source.get("frame")` then fall back to `source.get("page")`. Covers the Playwright path.
- `source is None` → return `None`. The recorder's own `_on_new_page` emission calls `on_capture(None, payload)`; this keeps the synthetic switch-page event harmless.
- Otherwise → keep the `getattr(...)` fallback so older test stubs that expose `.frame` / `.page` as attributes still work without churn.

`on_capture` now calls `_resolve_frame_target(source)` in place of the buggy getattr chain.

Tests

Six new tests in `TestResolveFrameTarget`:
- dict with frame → returns frame
- dict without frame (top page) → falls back to page
- dict without either → None
- source=None → None (synthetic Switch Page event)
- object with .frame attribute → returns frame (legacy stubs)
- object with only .page attribute → returns page

All 13 wire-up tests green (was 7). Existing _FakePage-style tests unchanged.

Manual verification path: `_recorder_loop` will now pass a real `Frame` into the verifier on iframe captures; the next recording against heise.de will land with `verified_unique=True` on the unique candidates and `text="Zustimmen" >> nth=0` (the disambiguated form) at slot 0 instead of the multi-match raw `text="Zustimmen"`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…al-browser E2E coverage User reported: after the previous frame-resolution fix (`a06f296`), recording on heise.de's cookie banner produced `# RBSCOPE: dropped Click — no selector captured` — every candidate dropped. Worse than before the fix, because at least the unverified candidates used to leak through. The right diagnosis was: `a06f296` made `_resolve_frame_target` return the actual iframe Frame (not None) so the verifier finally got to RUN. But during a click that dismissed the iframe (or navigated the page), the verifier ran AGAINST a frame that detached mid-flight. Every `loc.evaluate_all(...)` call raised, the `except Exception` arm coerced the result to `MatchInfo(0, 0, 0)`, and `verify_candidates` dropped every 0-match candidate. Net effect: the recorder turned every click-that-affects-the-DOM into a no-selector capture. Plus the user pointedly asked: *do you have real-browser E2E tests for this scenario?* No. The unit tests for the verifier wire-up use `_FakePage` stubs that can't reproduce a navigation race. Until this commit, the regression class had zero E2E coverage. Verifier contract change `verify_candidates` (selector_verification.py) and the helper `_resolve` (v2_recorder_task.py) now distinguish three outcomes: 1. **`MatchInfo(t>0, ...)`** — verification ran cleanly, the selector matched something live. Classify gold / visible-only / hidden / multi-match and rank as before. 2. **`MatchInfo(total=0, ...)`** — verification ran, selector resolved to nothing. Drop (existing behavior — the selector is truly stale). 3. **`None` returned, OR locator_factory raised** — verification COULD NOT RUN (frame detached after a navigation-triggering click, page closed mid-flight, transient browser-side error). Preserve the candidate at the TAIL of the result list with `verified_unique=False` intact. Synthesis produced it for a reason and the user pointed at SOMETHING when the click was captured. 
Sorted within the tail by quality_score desc so the best static-heuristic candidate is the first unverified one. Concrete bound on the round-trip: `_resolve` now wraps `loc.evaluate_all(...)` in `asyncio.wait_for(..., timeout=1.0)`. Without that, a click on a page that's mid-navigation can leave the JS round-trip hanging until Playwright's default timeout — unacceptable for an interactive recorder where the user might click ten things in a second, and a hung verify pegs the entire loop. `TimeoutError` is also an `Exception`, so it routes through the same preserve-as-unverified branch as any other failure. Tests Unit tests in `test_selector_verification.py`: - `test_factory_exception_preserves_candidate_as_unverified` replaces `..._is_dropped_not_kept_unverified` — same setup (factory raises), new assertion (candidate preserved at tail, unverified). - `test_factory_none_return_preserves_candidate_as_unverified` pins the explicit-None contract. - `test_factory_unverifiable_tail_sorted_after_verified` drives a mixed list (two clean matches + two raise) and asserts the verified ones lead, the unverifiable tail follows, both halves sorted by quality_score desc. Unit tests in `test_v2_recorder_verify_wire.py`: - `test_locator_factory_raise_preserves_candidate_as_unverified` replaces the old "invalid syntax dropped" assertion — clean matches still lead, the boom candidate ends up at the tail unverified. - `test_locator_factory_raise_preserves_candidate_when_no_other_match` pins the heise.de case directly: all three candidates raise (iframe detached), all three preserved, sorted by qs desc. Real-browser E2E in `test_v2_recorder_e2e.py` (the part the user pointedly asked about): - `test_click_that_navigates_preserves_selector_candidates` drives the recorder against `recorder_multipage_a.html`, clicks the `[data-testid="goto-page-b"]` link (full-page navigation), waits for Page B's heading, then asserts the captured Click has AT LEAST ONE selector candidate. 
- `test_click_inside_iframe_that_removes_itself_preserves_selectors` uses the new `recorder_iframe_banner.html` (parent) + `recorder_iframe_inner.html` (loaded via `src=`, not srcdoc — srcdoc iframes don't expose a stable `frame_url` for RECORDER-FRAMES tagging). The inner button posts a message that the parent uses to `.remove()` the iframe — exact Sourcepoint shape. Asserts the captured Click has `frame_url` set AND at least one selector candidate. Both E2E tests reproduce the exact failure mode the user reported and pass after the fix (3/3 in series, 8.72 s). Non-integration recording suite: 59/59 green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
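The three-outcome verifier contract and the one-second bound can be sketched as follows. This is a simplified illustration under stated assumptions: the `Candidate` shape, the `probe` callable (standing in for the locator round-trip), and `count_matches` are hypothetical, and the real verifier additionally classifies visible, hidden and multi-match results, which this sketch reduces to a single count.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional


@dataclass
class Candidate:
    value: str
    quality_score: int
    verified_unique: bool = False


Probe = Callable[[str], Awaitable[int]]


async def count_matches(probe: Probe, value: str, timeout: float) -> Optional[int]:
    """Return the match count, or None when verification could not run."""
    try:
        return await asyncio.wait_for(probe(value), timeout=timeout)
    except Exception:  # detached frame, closed page, timeout, ...
        return None


async def verify_candidates(
    candidates: List[Candidate], probe: Probe, timeout: float = 1.0
) -> List[Candidate]:
    verified: List[Candidate] = []
    unverifiable: List[Candidate] = []
    for cand in candidates:
        total = await count_matches(probe, cand.value, timeout)
        if total is None:
            unverifiable.append(cand)  # could not run: keep, flag untouched
        elif total > 0:
            cand.verified_unique = total == 1
            verified.append(cand)
        # total == 0: ran cleanly and matched nothing, so the candidate is dropped
    by_quality = lambda c: -c.quality_score
    # Verified candidates lead; the unverifiable tail follows, each by score.
    return sorted(verified, key=by_quality) + sorted(unverifiable, key=by_quality)
```

Separating "matched nothing" from "could not ask" is the whole point: only the former is evidence that a selector is stale.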

Sidecar limitation user noticed: the legacy `frame_url: str | None` field stored the iframe's URL but nothing else. The emitter rebuilt the cross-frame locator at serialise time with ONE hardcoded strategy — `iframe[src*="<host>"]` — which: - broke whenever the host was not unique on the page (multiple CMP iframes from the same vendor — exactly the Sourcepoint multi-banner case); - gave the picker no alternative iframe selector to switch to (the user could only re-pick the INNER selector); - had no support for nested iframes. Schema (`backend/src/recording/selector_schema.py`) New `FrameDescriptor { url, selector_candidates }` model. New `frame_chain: list[FrameDescriptor]` field on `RecordedCommand`, default empty for top-frame events + backward compatible with pre-FRAMES-2 sidecars. Order: index 0 is outermost iframe, last entry is the iframe whose document the event came from. Recorder (`backend/src/recording/v2_recorder_task.py`) - `_capture_frame_chain(frame_or_page)` walks parent ancestry via Playwright's `frame.parent_frame` + `frame.frame_element()` (CDP-level call, works cross-origin), bounded by `asyncio.wait_for(..., timeout=1.0)` so a detaching iframe fails fast rather than hanging the recorder. - `_synthesise_iframe_candidates(element_handle, parent_frame, frame_url)` builds ranked candidates per rung using the iframe element's attributes: qs 95 — `iframe[data-testid="..."]` qs 90 — `iframe#<id>` qs 85 — `iframe[name="..."]` qs 75 — `iframe[src="<exact>"]` qs 65 — `iframe[src*="<host>"]` (legacy fallback strategy) qs 40 — `iframe.<first-class>` (last resort) Each candidate is verified by counting matches against the PARENT frame (where the iframe element lives, not where its content does). 0-match candidates dropped; 1-match candidates flagged `verified_unique=True`; multi-match preserved with the flag False. Output sorted (verified DESC, qs DESC). 
- Wire-up in `_verify_command_candidates`: alongside the inner-selector verification, capture the chain in the same async pass so the iframe still exists when we ask (best chance: the banner-removes-itself flow only takes effect after the user's original click handler completes).

Emitter (`backend/src/recording/robot_emit.py`)
- `_iframe_chain_locator(cmd)` composes `outer >>> inner` for cross-frame replay, picking each rung's `selector_candidates[0]` (pre-sorted, so the testid/id/name strategy wins when available). Rung with empty candidates falls back to the legacy `iframe[src*="<host>"]` derived from that rung's url, so partial chains still produce a valid composite locator.
- `_emit_command` prefers the chain when present, falls back to the legacy URL-only path otherwise. Old sidecars keep working without a re-record.

E2E coverage (the user explicitly asked us to pin in real-browser tests, including verifying the SIDECAR FILE contents)

`backend/tests/fixtures/recorder_iframe_stable.html` is a new fixture: parent page with a single iframe that has id, testid, name AND src; the iframe content button does NOT remove the iframe on click (vs. the Sourcepoint flow). Lets the recorder's chain synthesis run cleanly.

`test_iframe_click_records_frame_chain_with_id_candidate_in_sidecar`:
- Records a click in the stable iframe.
- Asserts the captured `RecordedCommand` carries a populated `frame_chain` (the structural fix).
- Asserts the first rung's best candidate is testid/id/name strategy (NOT the qs-65 src-host fallback).
- Asserts the FIRST candidate's `verified_unique=True`.
- Asserts the emitted .robot line uses the high-quality iframe locator (`iframe[data-testid="..."] >>>`) and NOT `iframe[src*="127.0.0.1"]`.
- Observed emit: `Click iframe[data-testid="consent-banner"] >>> [data-testid="agree-btn"]`

`test_iframe_click_when_iframe_detaches_falls_back_to_url_strategy`:
- Records a click in the self-removing iframe (Sourcepoint shape, `recorder_iframe_banner.html`).
- Asserts `frame_url` is preserved (the a06f296 regression chain).
- Asserts inner `selector_candidates` is non-empty (the 9db5c3b preserve-on-exception chain).
- Asserts the emitted line still has an `iframe ... >>> inner` wrapper, i.e., the legacy URL-host fallback fires when the chain capture didn't make it in time.
- Observed emit: `Click iframe[src*="127.0.0.1:58004"] >>> [data-testid="agree-btn"]`

Plus four unit tests in `test_robot_emit.py::TestFrameChainEmit`: chain-wins-over-legacy, empty-chain-fallback, rung-without-candidates-uses-url-fallback, nested-iframe-composition.

Totals: 62/62 non-integration + 5/5 E2E green.

Backward compat

Pre-FRAMES-2 sidecars (with `frame_url` only) keep working: the emitter checks `frame_chain` first, falls back to the URL-derived single-strategy path. No re-record needed for existing recordings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
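The chain composition with per-rung fallback can be sketched as below. It is a simplified illustration: candidates are reduced to plain selector strings (the real model stores candidate objects), and `legacy_iframe_selector` and `iframe_chain_locator` are names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlsplit


@dataclass
class FrameDescriptor:
    url: str
    # Pre-sorted best-first; plain strings here for brevity.
    selector_candidates: List[str] = field(default_factory=list)


def legacy_iframe_selector(url: str) -> str:
    """URL-host fallback used when a rung has no candidates."""
    return f'iframe[src*="{urlsplit(url).netloc}"]'


def iframe_chain_locator(frame_chain: List[FrameDescriptor], inner: str) -> str:
    """Compose outer >>> ... >>> inner, outermost iframe first."""
    rungs = [
        rung.selector_candidates[0]
        if rung.selector_candidates
        else legacy_iframe_selector(rung.url)
        for rung in frame_chain
    ]
    return " >>> ".join([*rungs, inner])
```

Falling back per rung rather than for the whole chain is what lets a partially captured chain still produce a usable composite locator.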

Two unrelated wins bundled because they share user feedback from the same heise.de + debug session.

(1) Recorder emit-time defensive disambiguation

User report: re-recording on heise.de still produced an unrunnable `.robot` for the cookie-banner "Zustimmen" click. Investigation showed the chain was right and the inner candidates were preserved (the 9db5c3b + 8311b32 fixes did their jobs), but the candidate landing in slot 0 was `text="Zustimmen"` with `verified_unique=False`. The verifier couldn't run because the iframe detached mid-flight, so the verifier's `_with_nth_match` disambiguation never fired. Browser library's strict mode rejects `text="Zustimmen"` at replay because three elements match (the button plus two paragraphs).

Fix in `_render_selector`: when the active candidate is `verified_unique=False` AND uses a multi-match-prone strategy (text, role, aria, generic css without an id), wrap the value with `>> nth=0` at emit time. The wrap is suppressed when the selector already carries `nth=`, `>>>`, or `>>` chains so we never double-wrap a verifier-disambiguated candidate or interfere with hand-edited chains.

Six unit tests in `test_robot_emit.py::TestDefensiveDisambiguation` pin the contract: unverified text → wrapped, verified text → bare, css with `#` → no wrap (id selectors are unique enough), xpath → never wrapped (synthesis writes explicit-enough xpath), already-disambiguated → not double-wrapped, unverified pure-class CSS → wrapped.

Net effect on the existing heise.de sidecar (cmd[1], the Zustimmen click, frame_chain emptied by the iframe detach):

Before: `Click iframe[src*="cmp.heise.de"] >>> text="Zustimmen"`
After: `Click iframe[src*="cmp.heise.de"] >>> text="Zustimmen" >> nth=0`

The `nth=0` is cosmetic noise but makes the recording RUNNABLE under Browser library strict mode without forcing the user to edit the file by hand. The 5 E2E recording tests stay green because their fixtures all produce testid-strategy candidates which are not in the risky set.

Existing test suite: 76/76 unit recording-tests green (defensive wrap doesn't fire on verified candidates which is what the tests assert against), 5/5 E2E green.

(2) DebugPanel layout polish

User report:
- "the stack-trace file paths are too long and bleed into the other panel"
- "make the current line number much, much more prominent"
- "show the textual content of the current line somewhere"

Fixes in `frontend/src/components/debug/DebugPanel.vue`:
- Grid layout: `grid-template-columns: 220px minmax(0, 1fr)` plus `min-width: 0` on both children. Without the explicit `minmax(0, ...)` on the second column, CSS grid lets content in the first column overflow rather than truncating: the classic "long file path pushes the call-stack column off-screen and overlaps Variables on the right" symptom.
- Stack-file: `overflow: hidden`, `text-overflow: ellipsis`, `white-space: nowrap`, plus the template now renders `basename(frame.file)` (filename only, no path) with the FULL path on the wrapping span's `title` attr, so hover surfaces the rest. Same `title` + truncation applied to the stack-name span.
- New prominent paused-line callout below the header: shows `LINE` label + line number at 32px / 700 weight in the primary brand colour, with filename basename + keyword name as secondary metadata. Rendered only when paused + line is set; takes the question "where am I in the test?" from "scan the header pill carefully" to "the giant number tells you immediately." Filename truncates on overflow same as the stack-file lines so a long path never breaks layout.

i18n: new `debug.panel.lineLabel` key in EN/DE/FR/ES.

The "textual content of the current line" piece is deferred. It would need either a backend pause-event extension to push the line text alongside the line number, or a frontend `/explorer/.../file` fetch sliced at the line. Both are real follow-ups, scope-managed out of this commit.

Frontend totals: 636/636 vitest green, vue-tsc clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
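The emit-time wrap rule from part (1) can be sketched as follows. This is an approximation of the described contract: the strategy labels, the `RISKY_UNVERIFIED_STRATEGIES` set contents, and the assumption that `value` already carries its strategy prefix are choices made for the example.

```python
# Strategies that commonly match more than one element (assumed labels).
RISKY_UNVERIFIED_STRATEGIES = frozenset({"text", "role", "aria", "css"})


def render_selector(strategy: str, value: str, verified_unique: bool) -> str:
    """Append '>> nth=0' to unverified, multi-match-prone selectors."""
    if verified_unique or strategy not in RISKY_UNVERIFIED_STRATEGIES:
        return value  # verified, or a strategy such as xpath that is never wrapped
    if strategy == "css" and "#" in value:
        return value  # id selectors are treated as unique enough
    if "nth=" in value or ">>" in value:
        return value  # already disambiguated or hand-chained ('>>' also covers '>>>')
    return f"{value} >> nth=0"
```

Because the rule only fires on unverified candidates, a selector the live verifier already confirmed as unique is emitted unchanged.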

User report: re-recorded heise.de "Zustimmen" cookie banner, replay still doesn't work. The selector `iframe#sp_message_iframe_…` they saw in an earlier sidecar doesn't land in the new one. They asked flat out: do I have a successful run? Honest answer: no — my E2E used synthetic fixtures with a stable iframe (`recorder_iframe_stable.html`). For the self-removing iframe shape (`recorder_iframe_banner.html`, modelled on Sourcepoint), the test asserted only that inner selectors were preserved and the emitted line had SOME iframe wrapper — it never asserted the chain itself was populated. So the regression class "iframe id-based locator falls back to URL host when iframe detaches before chain capture" was uncovered. Root cause `_capture_frame_chain` (8311b32) ran AFTER inner-selector verification inside `_verify_command_candidates`. By then the self-removing CMP banner had already detached its iframe; the follow-up `frame.frame_element()` call raised, the chain rung landed with 0 candidates, and the emitter fell back to the legacy `iframe[src*="<host>"]` URL-derivation. Fragile when the host is not unique on the page and impossible to override from the picker because there's only one candidate. Fix architecture — beat the race by capturing the iframe identity BEFORE the click happens JS (top-frame, `capture_script.py`) The top frame's capture script — and only the top frame — now enumerates `document.querySelectorAll("iframe")` on DOMContentLoaded and posts an `iframe_register` event per iframe. Each event carries the iframe's id / name / data-testid / src / classes PLUS a per-candidate uniqueness count computed synchronously via `document.querySelectorAll(candidate).length` in the same tick. Synthesis + verification both happen in JS, in one synchronous slice, before any user click can detach the iframe. 
(An earlier draft used a MutationObserver to catch late-loaded iframes but that broke iframe click capture: apparently the high-frequency mutation callbacks flooded the binding queue fast enough that subsequent click events were silently dropped. A `load` listener for `<iframe>` elements is the lighter alternative if late-load coverage is needed; this commit ships the initial-scan-only variant since it suffices for static Sourcepoint-style banners.)

Backend (`v2_recorder_task.py`)
- Per-session `iframe_inventory: dict[str, dict]` indexed by both `iframe_src` and `iframe_contentUrl` (when same-origin and JS can read it). Builds up as register events arrive.
- `on_capture` recognises `kind: "iframe_register"` and routes to the registry: returns early, no `RecordedCommand` produced, no slot in the user-visible command stream.
- `_capture_frame_chain` now consults the inventory FIRST, falling back to the live `frame_element()` only when no registry hit (cross-origin iframes whose contentUrl JS couldn't read, ad iframes that registered too late, etc.).
- New `_candidates_from_inventory(inventory, frame_url)`: exact-match key lookup first, then substring fallback against stored `iframe_src` (handles internal iframe navigations where `frame_url` no longer matches the initial `src` attribute). Maps the JS-side `count` field to `verified_unique = count == 1`, sorts by (verified DESC, qs DESC), drops 0-match candidates.
E2E proof The Sourcepoint-shape test now asserts the chain IS populated even though the iframe self-removes within milliseconds of the click: test_iframe_click_when_iframe_detaches_falls_back_to_url_strategy [frames-2-detach] frame_chain candidates (post-detach): ['iframe[data-testid="banner-frame"]', 'iframe#banner-frame', 'iframe[src="recorder_iframe_inner.html"]'] [frames-2-detach] emitted line: Click iframe[data-testid="banner-frame"] >>> [data-testid="agree-btn"] # rbs:5075fc5030b0 That's the same fixture that previously emitted `iframe[src*="127.0.0.1:58004"] >>> …` (host-only legacy fallback). The id/testid-based locator is the same shape that the user's heise.de sidecar HAD ONE recording prior, before the Zustimmen click detached the iframe — and now retains it through the detach. Test totals: 5/5 E2E + 68/68 non-integration recording suite green. Caveats / known limitations carried forward - Cross-origin iframes whose contentWindow.location.href JS can't read register only their initial `src` attribute as the lookup key. If the iframe navigates internally to a different URL before the click, the lookup may miss and we fall back to live `frame_element()` (which may still work if the iframe is still attached at that point). - No nested-iframe support beyond what Playwright's `frame.parent_frame` walk already covers — the JS inventory sees the IMMEDIATE iframes of the top frame; iframes inside iframes get their own register events when the inner top runs the script (recursively, via `add_init_script`'s per-document install), but the chain composition across multiple registers is not yet tested. Real-world consent banners are 1-deep, so this is acceptable for v1. - MutationObserver path deliberately omitted (see comment in capture_script.py). Late-injected iframes that load AFTER DOMContentLoaded won't be inventoried until a workaround lands. The `load` event on iframe elements (capture phase) is the planned next step. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
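
The inventory lookup described for `_candidates_from_inventory` can be sketched roughly as below. This is a minimal illustration, not the code in `v2_recorder_task.py`: the entry layout (`candidates`, `selector`) and the direction of the substring test are assumptions; only `count`, `qs`, `iframe_src` and `verified_unique` are named in the commit message.

```python
from typing import Any


def candidates_from_inventory(
    inventory: dict[str, dict[str, Any]], frame_url: str
) -> list[dict[str, Any]]:
    """Hypothetical re-creation of the registry lookup (names are assumed)."""
    # 1. Exact-match key lookup (keys are iframe_src / iframe_contentUrl).
    entry = inventory.get(frame_url)

    # 2. Substring fallback against the stored initial `src` attribute.
    #    Assumption: the stored src is tested as a substring of frame_url.
    if entry is None:
        for stored in inventory.values():
            src = stored.get("iframe_src") or ""
            if src and src in frame_url:
                entry = stored
                break
    if entry is None:
        return []

    ranked = []
    for cand in entry.get("candidates", []):
        count = cand.get("count", 0)
        if count == 0:
            continue  # drop candidates that matched nothing on the JS side
        ranked.append({**cand, "verified_unique": count == 1})

    # Sort by (verified DESC, qs DESC).
    ranked.sort(key=lambda c: (c["verified_unique"], c.get("qs", 0)), reverse=True)
    return ranked
```

With this ordering a uniquely matching id or testid candidate outranks a higher-scored but ambiguous one, which is consistent with the `frame_chain candidates` output quoted in the E2E proof.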

… live view Two user-reported bugs, both from the same heise.de Zustimmen recording session: (1) "klappt immer noch nicht" — late-loaded CMP iframes missed The Sourcepoint consent banner on heise.de injects its iframe via its async-loaded SDK ~600-1500ms AFTER `DOMContentLoaded`. The previous proactive-inventory commit (3004b27) scanned exactly once at `DOMContentLoaded`, which is BEFORE the iframe exists. The inventory was empty when the user clicked, the live-API fallback in `_capture_frame_chain` hit a detached frame (the click had already triggered the banner's removal), and `frame_chain[0]` landed with zero candidates → emitter fell back to the legacy `iframe[src*="<host>"]` shape. Retry-scan added: `setTimeout(_registerIframesOnce, …)` at 100 / 300 / 700 / 1500 / 3000 / 5000 ms after init, plus a capture- phase `load` listener on the document so each iframe's own `load` event re-triggers the scan. Dedupe via `seenIframeKeys` (NEW — was missing in the minimal version) keys on `src|id|name|testid`, so re-scans are no-ops for already- registered iframes — no binding-queue flood, no risk of dropped click events (which a MutationObserver-based attempt caused earlier in the same session). New E2E fixture `recorder_iframe_late_load.html` mirrors the Sourcepoint shape exactly: the parent's inline `<script>` does `setTimeout(() => parent.appendChild(<iframe ...>), 600)`. New E2E `test_iframe_loaded_after_DOMContentLoaded_still_registered` records a click inside that iframe and asserts: - `frame_chain[0].selector_candidates` is NON-EMPTY (the retry-scan caught the late-loaded iframe before the click); - the emitted line uses `iframe#sp_message_iframe_1234567` or `iframe[data-testid="cmp-banner"]` or `iframe[name="consent"]`, NOT the legacy URL-host fallback. 
Observed emit on the new fixture: Click iframe[data-testid="cmp-banner"] >>> [data-testid="agree-btn"] Same shape will work on heise.de — the Sourcepoint iframe id is `sp_message_iframe_<message_id>` and gets caught the same way. (2) "im recorder bildschirm nicht der richtige selektor angezeigt" The Live recorder view's SelectorPicker shows only the active INNER candidate (e.g. `text="Zustimmen"`). The .robot the user saves has the composite form `iframe#sp_message_iframe_… >>> text="Zustimmen" >> nth=0` — which is what'll actually run on replay. Two different mental models on the same screen made it impossible for the user to tell whether their recording was going to work. New util `frontend/src/utils/effectiveSelector.ts` mirrors the Python emitter's `_emit_command` selector-composition logic exactly: - `renderSelector(cand)` — handles strategy prefixes (`xpath=`, `text=`) and defensive `>> nth=0` for unverified risky- strategy candidates (text / generic css / role / aria — same `_RISKY_UNVERIFIED_STRATEGIES` set as the Python side). - `iframeChainPrefix(cmd)` — composes `outer >>> inner` from `cmd.frame_chain`, rung-fallback to URL-host when a rung has no candidates. - `effectiveSelector(cmd)` — the full composite line. Wired into RecordingLiveView as a new `.robot:` preview row under each step in the live list. The picker still shows candidate alternatives in its dropdown (selection lives there); the preview shows what the user is actually going to save. 
19 unit tests pin parity with the Python emitter, including: - all four strategy-prefix branches - the six defensive-disambiguation branches (mirrors Python's `TestDefensiveDisambiguation` 1:1) - the iframe-chain composition branches (mirrors `TestFrameChainEmit`) - the heise.de integration: chain + defensive disambiguation + multi-strategy candidate list → expected `iframe#sp_message_iframe_1454968 >>> text="Zustimmen" >> nth=0` - edge cases: no candidates → empty, active_candidate_index not always slot 0 i18n: new `recorder.live.effectiveTitle` tooltip key in EN/DE/FR/ES. Test totals - Backend recording E2E: 6 tests (added the late-load case to the existing 5) — all green. - Frontend vitest: 655/655 (added 19 in effectiveSelector.spec.ts on top of the 636 baseline). 1 pre-existing unhandled-rejection error from main is the DebugPanel.spec.ts noise fixed on release-0.10.0. - `vue-tsc --noEmit` clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
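
The selector composition that `effectiveSelector.ts` mirrors can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the emitter's `_emit_command` code: the `Candidate` / `FrameRung` shapes, the strategy names (`css` standing for "generic css"), and the choice of the first rung candidate are hypothetical.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Assumed strategy names; the commit only lists text / generic css / role / aria.
RISKY_UNVERIFIED = {"text", "css", "role", "aria"}
PREFIXES = {"xpath": "xpath=", "text": "text="}


@dataclass
class Candidate:
    strategy: str
    value: str
    verified_unique: bool = False


@dataclass
class FrameRung:
    url: str
    selector_candidates: list[Candidate] = field(default_factory=list)


def render_selector(cand: Candidate) -> str:
    """Add the strategy prefix and a defensive nth=0 for risky, unverified hits."""
    prefix = PREFIXES.get(cand.strategy, "")
    rendered = cand.value if cand.value.startswith(prefix) else prefix + cand.value
    if cand.strategy in RISKY_UNVERIFIED and not cand.verified_unique:
        rendered += " >> nth=0"
    return rendered


def iframe_chain_prefix(chain: list[FrameRung]) -> str:
    """Compose `outer >>> inner >>> `, falling back to the URL host per rung."""
    parts = []
    for rung in chain:
        if rung.selector_candidates:
            parts.append(rung.selector_candidates[0].value)
        else:
            parts.append(f'iframe[src*="{urlparse(rung.url).netloc}"]')
    return "".join(part + " >>> " for part in parts)


def effective_selector(chain: list[FrameRung], cand: Candidate) -> str:
    return iframe_chain_prefix(chain) + render_selector(cand)
```

Under these assumptions a text candidate `"Zustimmen"` inside a rung whose best candidate is `iframe#sp_message_iframe_1454968` composes to the same line the heise.de integration test expects.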

User-reported screenshot: the Flow Editor node label correctly shows the composite `iframe#sp_message_iframe_14549 >>> text="Zustimmen" >> nth=0`, but the detail-panel SelectorPicker on the right still displayed only the raw inner `text="Zustimmen"` and its dropdown alternatives were likewise inner-only — so the user couldn't tell which alternative would actually run cleanly under Browser library's strict mode after the iframe wrapper + defensive disambiguation are applied. Two-line fix in SelectorPicker.vue: import the helper added in the previous commit and bind both the active-value `<code>` and the per-row `<code>` in the dropdown to `effectiveSelectorForCandidate(cmd, c)` instead of `c.value`. The raw value lives on the `title=` attribute for hover (and the inline-edit input still operates on the raw — editing the verbatim value is the right contract; viewing the composite is the right default display). New util export: `effectiveSelectorForCandidate(cmd, cand)`. `effectiveSelector(cmd)` is now a thin wrapper that picks the active candidate from `cmd` and delegates — same behaviour for existing callers (RecordingLiveView), new behaviour for callers that want "what would happen if I picked THIS row". Test totals: 19 unit tests on effectiveSelector unchanged (parity with Python emitter pinned via the same scenarios). Full vitest: 655/655 green, vue-tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…scopeheal The heal-library lived twice: in-tree under `backend/src/recording/heal/` (~1300 LOC across 4 modules) AND as the already-extracted sibling repo `roboscope-rfheal/`. The two had drifted ~340 lines since extraction (April 28); the in-tree copy carried the more recent RECORDER-FRAMES iframe-wrap guard + type-narrowing fixes that the extracted v0.1.0 didn't have. Single source of truth resolved by moving everyone onto the sibling repo under its new PyPI distribution name `robotframework-roboscopeheal` (Robot Framework community convention — matches `robotframework-browser`, `robotframework-seleniumlibrary`, …). Python import unchanged (`from RoboScopeHeal import …`); only the `pip install` name moves. Sibling repo (`/Users/rat/git/mateo2/roboscope-rfheal`) work — not in this commit but coordinated with it: - pyproject `name` renamed to `robotframework-roboscopeheal` - version bumped to 0.2.0 - URLs repointed to `viadee/robotframework-roboscopeheal` - 4 source files + 7 test files re-synced from in-tree, imports rewritten `src.recording.heal.X` → `RoboScopeHeal.X` - `defusedxml>=0.7.1` added to runtime deps (was implicitly pulled in via the monorepo backend, now made explicit) - CHANGELOG with the RECORDER-FRAMES + rename entries - 100/100 tests green in the sibling venv Backend changes (this commit): - `backend/pyproject.toml`: declare `robotframework-roboscopeheal>=0.2` as a runtime dep. Until v0.2 lands on PyPI, devs install editable via `uv pip install -e ../roboscope-rfheal` (documented inline next to the dep line). The dep spec is forward-compatible — when PyPI ships, the same constraint resolves to the published distribution. - `backend/src/execution/router.py` + `backend/src/stats/ service.py`: import `parse_heal_audit` from `RoboScopeHeal.heal_report` instead of `src.recording.heal.heal_report`. 3 imports total. - `backend/tests/recording/test_iframe_locator_contract.py`: same rewrite (2 imports). 
- Delete `backend/src/recording/heal/` entirely: 4 source modules + __init__.py, ~1290 LOC removed.
- Delete `backend/tests/recording/heal/` entirely: 7 test files + __init__.py. The same tests now live in the sibling repo, where they're closer to the code under test AND get exercised by the rfheal CI rather than only when someone runs the RoboScope monorepo's pytest suite.

Migration verification: 164 tests green across the boundary:

- `tests/recording/test_robot_emit.py`: 36 ✓
- `tests/recording/test_v2_recorder_verify_wire.py`: 14 ✓
- `tests/recording/test_iframe_locator_contract.py`: 9 ✓ (new import path)
- `tests/execution/test_heal_report_endpoint.py`: 5 ✓ (the one consumer of `parse_heal_audit` outside the heal layer)
- `roboscope-rfheal/tests/`: 100 ✓ (entire heal test surface, in the sibling repo's venv)

Caveats while PyPI is not yet published:

- Fresh clones need `uv pip install -e ../roboscope-rfheal` before `make dev` works. README + CLAUDE.md updates are deferred until the PyPI flip; the inline comment in pyproject.toml is the immediate signpost.
- The 4 RoboScopeHeal modules no longer live in two places in the user's working tree (`roboscope/backend/` and `roboscope-rfheal/`). The in-tree copy is GONE; the sibling is the single source. Future hotfixes against the heal layer go to the sibling repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ut PyPI Story HEAL-VENDORED. The previous extraction (`refactor(heal): move heal library out` — `f15f9d1`) made the heal library a runtime dependency on `robotframework-roboscopeheal>=0.2`, which is fine for the author's local dev loop (sibling-repo + uv.sources) but fails for EVERYONE else: - Fresh clone of RoboScope without sibling rfheal repo → `uv sync` fails with "path not found". - Standalone offline ZIP (`roboscope_offline_<platform>.zip`) pre-downloads wheels via `pip download`; the heal dep isn't on PyPI yet so it gets silently skipped and the install bundle ships without the heal library. - Docker build hits the same wall. This commit ships the heal library WITH every RoboScope release, zero external dependencies. PyPI publication is now an optional optimization, not a blocker for shipping. Vendor directory `backend/vendor/robotframework-roboscopeheal/` carries the full source tree from the upstream `roboscope-rfheal` repo at v0.2.1: - 4 source modules (candidate_finder, fingerprint, heal_report, library) + __init__.py - pyproject.toml declaring name=`robotframework-roboscopeheal`, version=`0.2.1`, with `robotframework-browser` now an OPTIONAL `[browser]` extra (the previous hard dep forced pip to resolve Playwright + node + Chromium at install time, bloating the offline bundle by ~80 MB for an import-time check that never happens — the heal lib delegates to Browser via `BuiltIn().run_keyword()` at TEST-RUN time, not at module-import time, so install + import succeed without Browser present). - LICENSE + NOTICE (Apache-2.0 provenance preserved per license requirements). - CHANGELOG with the 0.2.0 + 0.2.1 history. Tests deliberately NOT vendored. They live in the upstream rfheal repo where they're tied to that repo's CI; vendoring would double the test surface in the RoboScope monorepo for zero new signal — RoboScope's own e2e Recorder tests + the heal-report-endpoint test already exercise the integration boundary. 
backend/pyproject.toml - `[tool.uv.sources]` re-pointed from `../roboscope-rfheal` (sibling, dev-only) to `vendor/robotframework-roboscopeheal` (committed, travels with every clone). - Inline comments rewritten to reflect the vendored model instead of the manual-sibling-install model. PyPI flip is now purely a decision about which install source to prefer. scripts/sync-roboscopeheal.sh - New helper. Copies sibling `../roboscope-rfheal/` into the vendor tree after showing the user a brief recursive diff + asking for confirmation. Excludes the sibling-only directories (tests/, uv.lock, .gitignore, dev caches). `ROBOSCOPE_SYNC_ASSUME_YES=1` bypasses the prompt for scripted release-prep use. scripts/build-mac-and-linux.sh - New step in the wheel-collection block: builds the `robotframework-roboscopeheal` wheel from `backend/vendor/...` via `python -m build --wheel`, drops it next to the pip-downloaded wheels in `$DIST/wheels`. `install.sh`'s `pip install --no-index --find-links wheels/` matches it by version automatically. - Transient `pip install --upgrade build` to bootstrap the build tool on hosts that don't have it (typical CI runners don't). - Falls through with a warning if the vendor directory is missing — better than failing the entire release script, since a hand-edited bundle could plausibly skip the heal library. backend/tests/test_vendored_rfheal_present.py — 11 tests - Asserts the vendor directory exists. - Asserts each of the 8 canonical files (4 modules + __init__ + pyproject + LICENSE + NOTICE) is present. - Asserts the vendored pyproject declares the canonical distribution name (catches an upstream rename without a matching vendor sync). - Asserts `RoboScopeHeal.__version__` (installed) matches the vendored `pyproject.toml` version (catches the "stale wheel in venv, fresh source on disk" scenario). Verification matrix - 11/11 vendor-presence tests green. 
- 70/70 heal-tangential tests green (test_robot_emit, test_v2_recorder_verify_wire, test_iframe_locator_contract, test_vendored_rfheal_present). - Sync script tested both ways: detects drift, no-ops on clean state. - Wheel build from vendor: `uv build --wheel` produces `robotframework_roboscopeheal-0.2.1-py3-none-any.whl` (32 KB). - Clean-venv install of the wheel pulls only `defusedxml` + `robotframework` + the package itself — `robotframework-browser` correctly stays out of the install closure. - Import works cleanly: `>>> from RoboScopeHeal import RoboScopeHeal, parse_heal_audit` PyPI flip path (when access lands) 1. Push the upstream `roboscope-rfheal` repo to `github.com/viadee/robotframework-roboscopeheal`. 2. `cd ../roboscope-rfheal && uv build && twine upload …`. 3. In THIS repo: remove `[tool.uv.sources]` from `backend/pyproject.toml` — the dep then resolves from PyPI. 4. The vendor directory CAN stay for offline-install fallback OR be removed; orthogonal decision. The build-script wheel- build step short-circuits with a warning if vendor is missing. Sibling repo (`roboscope-rfheal/`) sees a coordinated 0.2.1 commit (`8a6806c`) with the same `robotframework-browser` → extra move. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t venv

Story HEAL-VENDORED phase 2. The first phase (`436ea54`) made the heal library available to the RoboScope SERVER (dashboards, the heal-report endpoint) by vendoring the source. That solves perspective 1: RoboScope's own code can `import RoboScopeHeal`.

But the more important consumer is perspective 2: the user's test cases. A user writes

    *** Settings ***
    Library    Browser
    Library    RoboScopeHeal

    *** Test Cases ***
    Login
        Heal Click    [data-testid="submit"]

…and expects "Save & Run" to just work. Today it doesn't: the heal library lives in the BACKEND venv (vendor-editable-install), but the user's .robot test runs in a PROJECT venv that RoboScope created via `environments/tasks.py::create_venv`, and that venv only had `robotframework` pre-installed. `pip install robotframework-roboscopeheal` from the package-management UI would have failed (the name doesn't resolve on PyPI yet).

Fix

`environments/tasks.py::create_venv` now also installs the heal library from the vendored source tree as a sibling step to the default `robotframework` install. The user gets it for free on every fresh venv: no UI roundtrip, no PyPI dependency.

- New `_vendored_heal_path()` helper: resolves `backend/vendor/robotframework-roboscopeheal/` relative to `tasks.py`'s location, returns an absolute Path. `uv pip install` happily accepts a local source-tree path as a positional package spec.
- New `_install_vendored_heal_into_venv(venv_path, env_id)` helper: runs `pip_install_cmd(venv_path, vendor_path)` via subprocess, NON-FATAL on failure (heal is opt-in test ergonomics, not a hard requirement; a build error here shouldn't tank the entire venv creation). Three failure branches:
  * vendor dir missing → log WARNING, return. Watchdog test in `test_vendored_rfheal_present.py` catches this in CI.
  * pip exit non-zero → log WARNING with rc + stderr tail. User can install manually from the package-management UI later.
* subprocess raises (uv missing on PATH, OS error) → log WARNING + exc_info, return. - Wired into `create_venv` immediately after the `robotframework` install. Heal is now part of "the canonical starter set" for every project. Why install only at create-time (not refresh): users who explicitly remove heal from their venv via the package-management UI shouldn't have it silently re-added on every backend restart. The create-time-only contract makes the auto-install user-overridable: install heal once, then either keep it or uninstall it for good. Verification 5 unit tests in `test_vendored_heal_auto_install.py`: - `_vendored_heal_path()` resolves to an absolute path whose layout matches the vendor tree (`name == "robotframework- roboscopeheal"`, parent == `vendor`) AND the resolved pyproject.toml actually exists on disk. - missing-vendor-dir → no subprocess fired, WARNING logged. - happy path → vendor path string appears in pip argv, INFO log emitted. - pip exit non-zero → WARNING with rc=1 logged, no raise. - subprocess raises (FileNotFoundError("uv: not found")) → WARNING with "raised" logged, no exception escapes. Integration smoke (executed manually, not in CI to avoid the ~20s venv-build cost): $ python -c "<inline smoke harness>" venv: /var/folders/.../rs-heal-smoke-llzid0eg/.venv venv create rc=0 rf install rc=0 import test: v 0.2.1 stderr= End-to-end: fresh tempdir venv → robotframework installed → heal auto-installed from vendor → `python -c "import RoboScopeHeal; print(RoboScopeHeal.__version__)"` outputs "0.2.1" cleanly. Now closes perspective 2 — anyone who downloads RoboScope (source clone, offline ZIP, Docker) can use `Library RoboScopeHeal` + `Heal *` keywords in their .robot tests on day one, without PyPI access and without manually fishing the vendored wheel out of the bundle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
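
The non-fatal install step and its three failure branches could look roughly like this. It is a sketch only: the function name, the injected `run` parameter, and the exact `uv pip install --python …` argv are assumptions (the commit routes through `pip_install_cmd`, whose output is not shown), and the interpreter path assumes a POSIX venv layout.

```python
import logging
import subprocess
from pathlib import Path

logger = logging.getLogger("heal_install_sketch")


def install_vendored_heal(venv_path: Path, vendor_path: Path, run=subprocess.run) -> bool:
    """Best-effort install; returns True on success and never raises on failure."""
    # Branch 1: vendor directory missing, so no subprocess is started.
    if not vendor_path.is_dir():
        logger.warning("vendored heal library missing at %s; skipping", vendor_path)
        return False

    # Assumed argv; a local source-tree path is passed as the package spec.
    argv = [
        "uv", "pip", "install",
        "--python", str(venv_path / "bin" / "python"),
        str(vendor_path),
    ]
    try:
        proc = run(argv, capture_output=True, text=True)
    except OSError:
        # Branch 3: the subprocess itself raised (e.g. uv not on PATH).
        logger.warning("heal auto-install raised", exc_info=True)
        return False

    # Branch 2: pip exited non-zero; log rc plus the tail of stderr.
    if proc.returncode != 0:
        logger.warning(
            "heal auto-install failed rc=%s: %s",
            proc.returncode,
            (proc.stderr or "")[-500:],
        )
        return False

    logger.info("heal library installed from %s", vendor_path)
    return True
```

Passing `run` as a parameter is a choice made here so the branches can be exercised without building a real venv, in the same spirit as the unit tests listed above.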

…tection Three intertwined fixes that came out of a heise.de Sourcepoint banner debugging session — chasing why a recorded `Click iframe#x >>> text="Zustimmen" >> nth=0` was producing wrong results after swapping selectors in the FlowEditor's right detail panel. 1. Swap writes the composite, not the raw inner `applySelectorSwap` previously copied only `candidate.value` into `step.args[0]`. The SelectorPicker UI advertises `effectiveSelectorForCandidate(cmd, c)` — iframe-chain prefix + `renderSelector` inner + defensive `>> nth=0` — as "what gets saved", but the swap path dropped everything except the raw inner. Result: a swap on an iframe-recorded command silently stripped the iframe wrap, the .robot fired a top-frame click, and the user saw "the selector you picked from the menu isn't what landed in the file". Now `applySelectorSwap` (and the new sibling `composeEffectiveSelector` used by Edit / Add) compose via `effectiveSelectorForCandidate` — same function the picker display uses. What-you-see is now what-gets-saved by construction, not by parallel implementations that can drift. 2. isCustomSelectorValue: symmetric composite-match, not regex strip The "eigener Wert, nicht aus der Aufzeichnung" badge AND the `window.confirm` swap-overwrite dialog used a shape-specific strip regex `^iframe\[[^\]]+\] >>>` to peel decorations off `step.args[0]` before comparing against `candidate.value`. That regex only matched ATTRIBUTE-CSS iframe candidates (`iframe[src*="…"]`); the heise.de sidecar uses an id-based candidate (`iframe#sp_message_iframe_1454968`) which slipped through. Every legitimate swap fired the confirm prompt and badged itself as custom. Worse: after fixing the iframe regex to `lastIndexOf(' >>> ')`, the SAME asymmetry hit `xpath=` / `text=` prefixes that `renderSelector` adds on the write side. 
A swap to an xpath candidate landed `iframe#x >>> xpath=//button[…]` in args[0] but `candidate.value` is `//button[…]` (no `xpath=`) → strip still didn't match → false-custom again. New approach is a 3-step hybrid: 1. Raw exact match against `c.value` (legacy / bare values). 2. STRICT composite match: `effectiveSelectorForCandidate( cmd, c) === current` for every candidate. Symmetric with the write path — anything any swap COULD have written is recognised as non-custom by construction. Picks up the `xpath=` / `text=` prefixes, the iframe-chain (any strategy / nesting depth), and the defensive nth=0 uniformly because they ALL live behind that one function. 3. Loose fallback: strip `lastIndexOf(' >>> ')` + nth=N from current, compare against `c.value` and `renderSelector(c)`. Only catches legacy sidecars whose `frame_chain` was lost but the .robot still carries an iframe wrap — strict step 2 would miss those. 3. `effective_override` field — user-supplied verbatim emit form The composite is auto-built from synthesised iframe candidates + risky-strategy defensive nth. On heise.de the synthesised iframe rung was `iframe#sp_message_iframe_1454968` (session-specific message_id) — replay-stable only if Sourcepoint hands back the same id, which it doesn't always. The user had no way to substitute a hand-tuned chain locator (`iframe[src*="cmp.heise.de"]`, host-substring) without ditching the structured candidate. New field on `SelectorCandidate`: `effective_override: str | None`. When set non-empty, every layer — the FlowEditor composer, the Python emitter (web `_emit_command` + desktop `_emit_desktop_command`), the SelectorPicker's display, the custom-detection — short-circuits to the verbatim string. Strategy + value stay tied to quality classification (the coloured quality dot still reflects the locator's stability), they're just decoupled from the emit form. 
UX in the SelectorPicker's ✏ Edit form and the "+ Add custom" row: a third "Effektiv" input below strategy + value, prefilled with the auto-composed form, live-synced with value/strategy until the user types in it (then it decouples and becomes the override). Orange-tinted border + "Override aktiv" badge + ↺ reset button when the typed form differs from auto. On commit, if `effective === auto` the override is CLEARED (back to recompose); else stored verbatim. Pydantic schema gets the new field with default None; legacy JSON sidecars round-trip cleanly (test pinned). The four locale files (en/de/fr/es) get six new i18n keys for the input label, placeholder, tooltip, reset title, override-badge title, badge text. Verification Frontend: 70 / 70 unit tests green incl. - 3 new effectiveSelector override tests (verbatim short-circuit, null / empty / whitespace fallbacks) - 1 new applySelectorSwap override test (composite via override) - 2 new SelectorPicker override tests (Edit + Add emit verbatim override on `effective` payload field) - 3 new isCustomSelectorValue regressions (id-based iframe shape, xpath-after-swap, multi-level chain) - vue-tsc clean Backend: 56 / 56 recording tests green incl. - 4 new TestEffectiveOverride cases (skip wrap+nth, empty fallback, JSON round-trip, legacy-without-field load) - Desktop emitter mirrors the same override contract Pinned by test_robot_emit.py + test_selector_schema.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
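
The three-step custom detection and the `effective_override` short-circuit can be illustrated in Python (the real `isCustomSelectorValue` lives in the TypeScript frontend). Everything here is a hypothetical sketch: a plain dataclass stands in for the Pydantic `SelectorCandidate`, the frame chain is reduced to a precomputed prefix string, and stripping `nth=N` from the rendered form in step 3 is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

_NTH_SUFFIX = re.compile(r"\s*>>\s*nth=\d+$")


@dataclass
class Candidate:
    strategy: str
    value: str
    verified_unique: bool = True
    effective_override: Optional[str] = None


def render_selector(c: Candidate) -> str:
    prefix = {"xpath": "xpath=", "text": "text="}.get(c.strategy, "")
    out = c.value if c.value.startswith(prefix) else prefix + c.value
    if c.strategy in {"text", "css", "role", "aria"} and not c.verified_unique:
        out += " >> nth=0"
    return out


def effective_for(frame_prefix: str, c: Candidate) -> str:
    # A non-empty, non-whitespace override wins verbatim over the composite.
    if c.effective_override and c.effective_override.strip():
        return c.effective_override
    return frame_prefix + render_selector(c)


def is_custom(current: str, frame_prefix: str, candidates: list[Candidate]) -> bool:
    # Step 1: raw exact match (legacy / bare values).
    if any(current == c.value for c in candidates):
        return False
    # Step 2: strict composite match, symmetric with the write path.
    if any(current == effective_for(frame_prefix, c) for c in candidates):
        return False
    # Step 3: loose fallback for sidecars whose frame_chain was lost.
    cut = current.rfind(" >>> ")
    inner = current[cut + len(" >>> "):] if cut != -1 else current
    inner = _NTH_SUFFIX.sub("", inner)
    return not any(
        inner in (c.value, _NTH_SUFFIX.sub("", render_selector(c)))
        for c in candidates
    )
```

Because step 2 reuses the same function the write path uses, anything a swap could have written is recognised as non-custom without a shape-specific regex.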

…able Two unrelated stats-view bugs caught in the same session: 1. Success-rate chart x-axis labels were spread across the FULL container width via `justify-content: space-between`, but the bars themselves stop at `.chart-bar { max-width: 20px }` and don't fill the full width. So the bars sat left-stacked with a chunk of whitespace on the right while the labels ran 0 % → 100 % of the parent — first label aligned with bar 0 but the last label drifted right of the last bar by however many pixels of whitespace the bars left unfilled. Fix: render one x-axis slot per bar (same `display: flex; gap; min-width: 4px; max-width: 20px` as `.chart-bar`), only populate the slots whose bar index matches a chosen label position. `chartXLabels` now carries `{ idx, text }` instead of just `text`; `labelForBar(i)` looks up the text for slot `i`. Each visible label sits directly under its bar by construction. 2. The Pass/Fail Trend table showed dates ascending (oldest at the top) — opposite of every other "newest first" pattern in the app and what users expect from an at-a-glance view. Fix: `trendsDesc = computed(() => [...stats.trends].reverse())` and v-for iterates that. Backend still returns ascending, which keeps the SUCCESS-RATE chart's left-to-right chronological read correct — only the table render flips. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…kage UI

Implements the "Vendor-default + PyPI-upgrade" distribution model chosen for the heal library. Heal is now first-class in the Package Management UI: a "ships with RoboScope" badge marks the entry, a one-click install resolves to the bundled vendor copy today, and an explicit version pin will hit PyPI (once published) as the upgrade path past whatever version RoboScope shipped with.

Backend

- `POPULAR_RF_LIBRARIES` (router.py) gains a heal entry with a new `shipped_with_roboscope: True` flag. The frontend's existing popular-package loop picks it up automatically; the flag is the only data attribute that distinguishes shipped libraries from ordinary popular ones.
- `_SHIPPED_VENDOR_PACKAGES` registry in `tasks.py`: central map from PyPI distribution name to vendor directory under `backend/vendor/`. Currently has one entry (`robotframework-roboscopeheal`). Adding a library to this registry is all that's needed for the install-resolution logic to redirect "no-version install" requests to the on-disk source.
- `_shipped_vendor_path(name)`: case-insensitive lookup helper. Returns the absolute vendor path if the name is registered AND the directory exists on disk, else None. The "exists" check protects against a stripped-tree release accidentally trying to `uv pip install /path/that/does/not/exist`; the caller falls back to PyPI cleanly with a single WARN log.
- `install_package` (tasks.py) consults the registry before building the pip argv. The rule:

      request shape                      resolves to
      -------------------------------    -------------------------
      install("heal", version=None)      vendor source path
      install("heal", version="0.4")     "heal==0.4" → PyPI
      install("other", *)                "other"/"other==X" → PyPI

  Logs at INFO when the shipped-path resolution kicks in so the reason for an unexpected source is traceable.

Frontend

- `EnvironmentsView.vue` template adds a `variant-badge shipped` span next to the package name when `pkg.shipped_with_roboscope` is truthy.
Blue tint (`#dbeafe / #1e40af`) to distinguish it from the existing green "Recommended" and amber "Requires Node.js" badges. The `popularPackages` ref type was extended with the new optional field so vue-tsc accepts the prop.
- Hover tooltip on the badge explains the install behavior: "clicking Install uses the bundled copy; pin a version to fetch from PyPI". Translated EN/DE/FR/ES (two new keys each: `environments.shippedWithRoboscope` + `shippedWithRoboscopeTitle`).

Tests

- `test_tasks.py::TestInstallPackage`: two new cases:
  * `test_shipped_no_version_installs_from_vendor_path` pins that the pip argv carries the vendor path (containing "vendor/robotframework-roboscopeheal") and NOT the bare package name (which would trigger PyPI).
  * `test_shipped_with_version_goes_to_pypi`: an explicit version bypasses the vendor; argv has "robotframework-roboscopeheal==0.4.0" and zero vendor-path leakage.
- `test_vendored_heal_auto_install.py`: four new `_shipped_vendor_path` cases: real package resolves, case-insensitive lookup matches mixed/upper case, unknown packages return None (no spurious vendor redirect for ordinary PyPI installs), missing vendor dir → None + WARN.
- `test_tasks.py::TestCreateVenv::test_creates_venv_with_uv`: updated expected subprocess count 2 → 3 (heal auto-install is the third call), plus an assertion that the third argv carries the vendor path. Pre-existing test failure left over from commit 0ee1e23, fixed in passing as part of this work.

29 / 29 backend tests in the touched modules pass.

When PyPI happens

The flip is minimal: drop the entry from `_SHIPPED_VENDOR_PACKAGES` (heal stops being treated as shipped), drop the heal seed from `create_venv` (project venvs no longer get the auto-install), and remove `[tool.uv.sources]` from `backend/pyproject.toml` (backend resolves heal from PyPI). The "ships with RoboScope" badge disappears naturally because the flag is gone from the popular-libraries response.
CLAUDE.md / CHANGELOG documents the model so the flip is a single small PR rather than an archaeology exercise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
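
The install-resolution rule (no version → vendor path, explicit version → PyPI pin, unknown package → PyPI) can be sketched as below. This is a simplified stand-in for the `tasks.py` logic: `resolve_install_spec` and the `vendor_root` parameter are hypothetical names introduced for illustration, and the WARN/INFO logging is omitted.

```python
from pathlib import Path
from typing import Optional

# Assumed registry shape: lower-cased distribution name -> directory under vendor/.
SHIPPED_VENDOR_PACKAGES = {
    "robotframework-roboscopeheal": "robotframework-roboscopeheal",
}


def shipped_vendor_path(name: str, vendor_root: Path) -> Optional[Path]:
    """Case-insensitive lookup; None if unregistered or missing on disk."""
    rel = SHIPPED_VENDOR_PACKAGES.get(name.lower())
    if rel is None:
        return None
    path = (vendor_root / rel).resolve()
    return path if path.is_dir() else None


def resolve_install_spec(name: str, version: Optional[str], vendor_root: Path) -> str:
    """Return the package spec that would be handed to pip."""
    if version:
        # An explicit pin always bypasses the vendor copy and goes to PyPI.
        return f"{name}=={version}"
    vendored = shipped_vendor_path(name, vendor_root)
    # No version: prefer the bundled source tree, else fall back to the bare name.
    return str(vendored) if vendored else name
```

Keeping the registry lookup separate from the spec builder means a stripped-tree release degrades to the plain PyPI name rather than to a path that does not exist.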

…+ offer rfbrowser init When a subprocess-runner test fails with the classic Browser- library error "browserType.launch: Executable doesn't exist at .../chromium-NNNN/...", surface a yellow banner at the top of the ReportDetailView with a one-click "Run rfbrowser init" button. The button POSTs to the existing `/api/v1/environments/{env_id}/rfbrowser-init` endpoint and flips to a "started" state on success. No more "open terminal, activate venv, run rfbrowser init, deal with conflicts" workflow for the most common Browser library install pitfall. Story HEAL-DIAG-1. Architecture - `backend/src/execution/diagnostics.py` (new) — pure detection layer. `detect_report_diagnostic(run, results)` walks the run's error_message plus every test_result.error_message, pattern-matches against a registry of detectors, and returns the first match's payload or None. Today only one detector is registered (`_detect_playwright_browser_missing`); adding more is a 3-line change (function + entry in `_DETECTORS` + locale section + banner-renderer hook). Detection regex matches BOTH the literal `browserType.launch: Executable doesn't exist` line AND the trailing "Looks like Playwright was just installed" ASCII box — Browser library versions emit one, the other, or both depending on which Playwright minor version is bundled, and the detector shouldn't be silently brittle to phrasing tweaks. Gating: only fires for `runner_type == SUBPROCESS` AND non-null `environment_id`. The docker runner has its own browser provisioning baked into the image (rfbrowser init on the host wouldn't help the container); a subprocess run without an environment has no venv to init against. - `ReportDetailResponse.diagnostic: dict | None` — backend serialises the detector result alongside the existing report + test_results. Discriminated union by `code`; today's only value is `playwright_browser_missing`. - `reports.router.get_report_detail` calls the detector at response time. 
No new DB columns, no schema migration — diagnostic is derived from already-stored data. Frontend - `RunDiagnosticBanner.vue` (new) — a small banner component that takes a `RunDiagnostic` prop, renders the title / description / action label from i18n keys keyed on the diagnostic code (so a new code = locale section + done; no component change required), and on click POSTs to the EXACT endpoint the backend advertised. The frontend doesn't hard- code `/environments/N/rfbrowser-init` — keeps the door open for a future "out-of-disk-space" diagnostic that would POST to a totally different endpoint. Phase machine: idle → triggering → started OR failed. The "started" state shows a ✓ badge and hides the button (no auto-polling — the Environments view owns the install- progress UI; the banner just kicks the job off). The "failed" state surfaces the backend error detail so the user can self-diagnose without devtools, and keeps the button visible for a retry. Re-entry guard: rapid double-clicks during the triggering phase short-circuit to a single POST. Without it, two overlapping init runs would fight over the same venv. - `ReportDetailView.vue` — mounts the banner above the tabs so it's visible regardless of which view (Summary / HTML Report) the user is on. - Type: `RunDiagnostic` in `domain.types.ts` mirrors the backend payload. `ReportDetail.diagnostic?: RunDiagnostic | null`. - i18n: 6 new keys per locale (EN/DE/FR/ES) under `reports.diagnostic.*`: title + description + action label for `playwright_browser_missing`, plus shared startedBadge / startedMessage / failedMessage strings. 
Tests

- Backend: `tests/execution/test_diagnostics.py`, 11 cases covering the real heise.de error blob (pinned verbatim from a live run so Playwright wording drift trips CI), run-level vs test-level error placement, action payload shape (env_id + endpoint + method match what the banner expects), gating conditions (docker runner → None, missing env_id → None, None run → None), and regex robustness (case-insensitive, each OR branch matches alone).
- Frontend: `tests/components/RunDiagnosticBanner.spec.ts`, 6 cases covering i18n title/description rendering in EN + DE, endpoint-from-payload POST contract, started-phase badge swap, failed-phase error detail surface, re-entry guard (3 rapid clicks → exactly 1 POST).
- E2E: `e2e/tests/run-diagnostic-banner.spec.ts`, 2 cases. Both mock `GET /reports/{id}` (avoids a 30 s real subprocess run + a few hundred MB Playwright download) and intercept the action endpoint to assert the EXACT URL was posted. The positive case verifies banner visibility, label localisation, button click → started badge, and that the action endpoint was hit. The negative case verifies the banner stays absent when the report has no diagnostic on it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
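The detection and gating rules that the backend tests above exercise can be sketched roughly as follows. This is not the code from `diagnostics.py`: the `Run` and `ResultRow` dataclasses, the lowercase `"subprocess"` runner value, the exact regex, and the payload layout are assumptions made for illustration. Only the function names, the `_DETECTORS` registry, the two matched phrases, the `playwright_browser_missing` code, and the endpoint path come from the commit message.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed pattern: either phrase alone is enough, matched case-insensitively.
_PLAYWRIGHT_MISSING = re.compile(
    r"browserType\.launch: Executable doesn't exist"
    r"|Looks like Playwright was just installed",
    re.IGNORECASE,
)


@dataclass
class Run:  # hypothetical stand-in for the real run model
    runner_type: str
    environment_id: Optional[int]
    error_message: Optional[str] = None


@dataclass
class ResultRow:  # hypothetical stand-in for a test_result row
    error_message: Optional[str] = None


def _detect_playwright_browser_missing(run: Run, text: str) -> Optional[dict]:
    if not _PLAYWRIGHT_MISSING.search(text):
        return None
    env_id = run.environment_id
    return {
        "code": "playwright_browser_missing",
        "action": {  # assumed shape: env_id + endpoint + method
            "env_id": env_id,
            "method": "POST",
            "endpoint": f"/api/v1/environments/{env_id}/rfbrowser-init",
        },
    }


_DETECTORS = [_detect_playwright_browser_missing]


def detect_report_diagnostic(run: Optional[Run], results: list) -> Optional[dict]:
    # Gating: only subprocess runs that have an environment can be fixed this way.
    if run is None or run.runner_type != "subprocess" or run.environment_id is None:
        return None
    # Run-level message first, then every test-level message.
    for text in [run.error_message, *(r.error_message for r in results)]:
        if not text:
            continue
        for detector in _DETECTORS:
            payload = detector(run, text)
            if payload is not None:
                return payload  # first match wins
    return None
```

Keeping the function free of database or HTTP access is what lets the router call it at response time without any schema change.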

…t tests

47 new tests covering HEAL-1 per-step toggle mode classification and keyword rewrite roundtrip, and HEAL-2 suite-level state machine plus enable/disable/library-import wiring. Story files updated to `review`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

26 new Playwright tests covering:

- recorder-lifecycle.spec.ts: SSE auth guards (401/403/404), phase pill transitions (browser_starting → browser_ready → browser_crashed), restart-browser click, extension-transport non-stuck edge case.
- heal-toggle.spec.ts: HEAL-VENDORED out-of-box test (fresh venv auto-seeds robotframework-roboscopeheal without PyPI, importable at RF runtime); HEAL-1 per-step checkbox (hidden on Log, visible on Click, keyword rewrite); HEAL-2 suite toggle (enable/disable all, revert).
- debug-session.spec.ts: API guards (422/404), debug button visibility by run status, 424 prereq dialog cancel path, install+retry path, 409 dedup, no-output.xml fallback path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

raffelino and others added 30 commits May 8, 2026 11:16

raffelino and others added 10 commits May 11, 2026 18:00

feat(heal): Self-Healing toggles + vendored rfheal library #41

raffelino wants to merge 40 commits into main from feat/heal-toggle

raffelino commented May 15, 2026
