Skip to content

chore(cua): sync driver v0.1.4#1586

Merged
zerob13 merged 1 commit intodevfrom
codex/cua-sync
May 7, 2026
Merged

chore(cua): sync driver v0.1.4#1586
zerob13 merged 1 commit intodevfrom
codex/cua-sync

Conversation

@zerob13
Copy link
Copy Markdown
Collaborator

@zerob13 zerob13 commented May 6, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Claude Code computer-use compatibility mode with dynamic server naming
    • Introduced window-scoped screenshot capability with optional file output
    • Enhanced keyboard event delivery with optional authentication envelope
    • Unified typing functionality with configurable inter-character delay fallback
  • Improvements

    • Refined window and process management for better cross-app automation
    • Extended hotkey support for NSMenu keyboard shortcuts
  • Documentation

    • Added comprehensive sync planning and specification documents
    • Updated tool and capability documentation for v0.1.4 compatibility
  • Tests

    • Expanded integration test coverage across Safari, Chrome, Tauri, Electron, and Blender platforms

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 6, 2026

📝 Walkthrough

Walkthrough

This PR synchronizes the vendored CUA Driver from v0.0.15 to v0.1.4, removing deprecated tools (get_accessibility_tree, type_text_chars), unifying text input with delay-based fallbacks, adding Claude Code compatibility shims, and introducing comprehensive integration tests and documentation for the update process.

Changes

CUA Driver v0.1.4 Sync Planning & Coordination

Layer / File(s) Summary
Sync Strategy & Scope
docs/issues/cua-driver-v014-sync/spec.md, docs/issues/cua-driver-v014-sync/plan.md, docs/issues/cua-driver-v014-sync/tasks.md
New documentation files outline the sync goals (preserve DeepChat runtime behavior), acceptance criteria (remove certain upstream tools, maintain type_text with delay_ms fallback), and a prioritized 6-step task checklist including conflict resolution, metadata updates, skill patching, and validation steps.

Local Fork Compatibility Layer

Layer / File(s) Summary
Tool Policy Updates
plugins/cua/plugin.json, plugins/cua/policies/tool-policy.json
Removed get_accessibility_tree ("allow") and type_text_chars ("ask") tool entries; renamed type_text_chars to type_text in tool-policy.json to align with upstream v0.1.4.
Skill Documentation
plugins/cua/skills/cua-driver/SKILL.md
Updated MCP tool list to reference only type_text, removing obsolete type_text_chars reference.
Upstream Metadata
plugins/cua/vendor/cua-driver/upstream.json
Updated cherry-pick metadata: tag, commit, version, and appliedAt fields changed from v0.0.15 (commit 8a4c513...) to v0.1.4 (commit d422294...) applied 2026-05-06.

CUA Driver Core Implementation (v0.1.4 Upstream)

Layer / File(s) Summary
Version & Build Config
plugins/cua/vendor/cua-driver/source/.bumpversion.cfg, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/CuaDriverCore.swift
Version constant and bumpversion config updated from 0.0.15 to 0.1.4.
Documentation & Installation
plugins/cua/vendor/cua-driver/source/README.md, plugins/cua/vendor/cua-driver/source/scripts/install.sh, plugins/cua/vendor/cua-driver/source/scripts/install-local.sh, plugins/cua/vendor/cua-driver/source/scripts/uninstall.sh, plugins/cua/vendor/cua-driver/source/Skills/cua-driver/*
Added Claude Code computer-use compatibility section to README; inserted MCP registration instructions in install scripts; added cache and Claude config cleanup to uninstall; updated skill documentation to remove type_text_chars and add delay_ms guidance.
Data Shape & State
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/AppState/AppState.swift
AppStateSnapshot gains new optional screenshotFilePath: String? field with full encode/decode support for disk-based screenshot output.
App Lifecycle & Window Capture
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Apps/AppLauncher.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Capture/WindowCapture.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Focus/FocusGuard.swift
AppLauncher now unhides apps after background launch; WindowCapture docs updated to reference list_windows; FocusGuardError message updated to recommend type_text with delay_ms.
Input Handling & Focus
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/KeyboardInput.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/MouseInput.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/FocusWithoutRaise.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/SkyLightEventPost.swift
KeyboardInput adds attachAuthMessage parameter for flexible event delivery; MouseInput adds useFrontmostHIDPath to conditionally route through HID tap; FocusWithoutRaise now resolves PSN from window ID via modern SkyLight APIs; SkyLightEventPost adds new PSN/window-owner functions and setFrontProcessNoWindows for NSMenu shortcuts.
CLI Infrastructure
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/CuaDriverCommand.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/ConfigCommand.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/CallCommand.swift
Added DumpDocsCommand subcommand; updated MCPConfigCommand with claudeCodeComputerUseCompat flag and dynamic server-name generation; refactored MCPServersSnippet to accept serverName/args; renamed image-out to screenshot-out-file in CallCommand; added AppKit bootstrap helper; ConfigCommand now supports capture_mode and max_image_dimension config keys.
CLI Documentation Extraction
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/Docs/CLIDocExtractor.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/DumpDocsCommand.swift
New CLIDocExtractor module with CommandDoc, ArgumentDoc, OptionDoc, FlagDoc, CLIDocumentation types; DumpDocsCommand outputs CLI/MCP/combined docs as JSON with optional pretty-printing.
MCP Server Layer
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/CuaDriverMCPServer.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/ToolRegistry.swift, plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Docs/MCPDocExtractor.swift
MCP server factory now accepts optional serverName parameter; ToolRegistry removes GetAccessibilityTreeTool and TypeTextCharsTool from default handlers; MCPDocExtractor provides tool documentation extraction.
Tool Implementations
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/
GetAccessibilityTreeTool removed entirely; TypeTextCharsTool removed; TypeTextTool merged to support AX-based bulk insert with CGEvent delay_ms fallback; GetWindowStateTool adds screenshot_out_file argument for disk-based output; ScreenshotTool now requires window_id; PressKeyTool and HotkeyTool add window_id for NSMenu shortcuts; LaunchAppTool updated description for background launch; SetValueTool improved with animation and type_text fallback; ClickTool comment updated.
Claude Code Compatibility
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/ClaudeCodeComputerUseCompatTools.swift
New compatibility shim exposing window-scoped screenshot tool with base64 image and scale-factor tracking for Claude Code computer-use flows.

Integration Test Suite (v2 Harness)

Layer / File(s) Summary
Test Infrastructure & Fixtures
plugins/cua/vendor/cua-driver/source/Tests/integration/conftest.py, plugins/cua/vendor/cua-driver/source/Tests/integration/driver_client.py
New pytest conftest with fixtures for binary, driver, focus_monitor, html_server, tauri_app, electron_app; enhanced default_binary_path resolution to honor environment variable and scan multiple candidate paths.
Test Harness Modules
plugins/cua/vendor/cua-driver/source/Tests/integration/harness/driver.py, plugins/cua/vendor/cua-driver/source/Tests/integration/harness/tree.py, plugins/cua/vendor/cua-driver/source/Tests/integration/harness/cv.py, plugins/cua/vendor/cua-driver/source/Tests/integration/harness/monitor.py
New Driver class wrapping DriverClient with window discovery and action wrappers; tree_markdown parser (find, find_all, ax_value, ax_text, contains_text); computer-vision module (decode, crop, diff_ratio, find_template, save/load_reference); UXMonitor for polling cursor/frontmost/overlay violations.
Test Assets & Fixtures
plugins/cua/vendor/cua-driver/source/Tests/integration/assets/test_page.html, plugins/cua/vendor/cua-driver/source/Tests/integration/fixtures/form_all_inputs.html
New test HTML pages with interactive UI (button counter, text inputs, canvas, etc.) for cross-browser validation.
Browser & App Tests
plugins/cua/vendor/cua-driver/source/Tests/integration/test_safari.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_chrome.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_electron.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_tauri.py
New test modules validating button clicks, text input, canvas interaction, dropdown selection, checkbox toggling, and no-focus-steal behavior across Safari, Chrome, Electron, and Tauri test apps.
Background & Focus Tests
plugins/cua/vendor/cua-driver/source/Tests/integration/test_background_focus.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_background_menu_shortcut.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_launch_app_visible.py
Updated background focus test to use type_text with delay_ms; added new module for hotkey NSMenu activation with window_id; added launch_app visibility documentation.
Specialized & End-to-End Tests
plugins/cua/vendor/cua-driver/source/Tests/integration/test_chrome_minimized_nav.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_blender.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_chrome_form_fill.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_form_fill_qwen.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_launch_safari_bg.py
Updated Chrome minimized nav test to use type_text with delay_ms; added Blender background automation tests (window state, menu hotkeys, screenshot NCC); added Hermes-driven form-fill tests for Chrome and Qwen; added Hermes-driven Safari launch test.
Removed Tests
plugins/cua/vendor/cua-driver/source/Tests/integration/test_double_click_delivery.py, plugins/cua/vendor/cua-driver/source/Tests/integration/test_pixel_click_delivery.py
Deleted obsolete test modules for double-click and pixel-click delivery (replaced by newer test approach).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A driver hops from old to new,
Type-text with delay, tried and true;
Tools polished, screenshots now on disk,
Windows focused without a risk.
With harness tests that leap and play,
The CUA bounds into the day! 🌟

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/cua-sync

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/LaunchAppTool.swift (1)

10-27: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Tool description contradicts itself after the v0.1.4 behavior change.

The opening paragraph still claims Launch a macOS app hidden — ... the target's window is not drawn on screen, but the new paragraph at lines 23–27 (and the actual AppLauncher.launch implementation, which now calls unhide()) state that the window does appear on screen. LLM clients consume this description verbatim; the self-contradiction will produce inconsistent tool selection.

Since this file is vendored upstream and the PR scope is a sync, consider either an upstream report or a minimal local compatibility edit to reconcile the two paragraphs (e.g., drop "hidden" / "not drawn on screen" from the opener so it matches the unhide semantics).

Suggested wording reconciliation
-                Launch a macOS app hidden — the driver never brings the
-                target to the foreground, and the target's window is not
-                drawn on screen. This tool is strictly a background-launch
-                primitive; if a human operator wants to switch the
-                foreground app, they should do so through normal macOS UI
-                (Dock, Cmd-Tab, Spotlight), not through this driver.
+                Launch a macOS app in the background — the driver never
+                brings the target to the foreground. The window may appear
+                on screen but never steals focus. This tool is strictly a
+                background-launch primitive; if a human operator wants to
+                switch the foreground app, they should do so through
+                normal macOS UI (Dock, Cmd-Tab, Spotlight), not through
+                this driver.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/LaunchAppTool.swift`
around lines 10 - 27, The tool description in LaunchAppTool.swift contradicts
AppLauncher.launch behavior (which now calls unhide()) by saying the app is
launched "hidden" and its window "is not drawn on screen"; update the docstring
in LaunchAppTool (the multi-line description at the top of the tool) to reflect
the current behavior — e.g., remove "hidden" / "not drawn on screen" and state
that the app is launched in the background with its window unhidden but it does
not steal focus — or alternatively file an upstream report and add a brief
compatibility note; ensure the change references the LaunchAppTool description
and AppLauncher.launch semantics so LLM clients see consistent behavior.
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/ToolRegistry.swift (1)

34-44: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Confirm: type_text_chars removal breaks replay of old trajectory recordings.

type_text_chars is removed from actionToolNames (line 34-44) and its handler is removed from the default registry (lines 214-245), but ReplayTrajectoryTool (line 144-154) invokes tools by name directly through registry.call(parsed.toolName). When replaying old trajectory recordings containing "tool": "type_text_chars" actions, the call will fail with MCPError.invalidParams("Unknown tool: type_text_chars") (line 56) since no handler exists.

The type_text_chars reference in TrajectoryLoader.isClickOrTypeClassTool (line 273 of TrajectoryLoader.swift) is only used for rendering classification and does not provide a fallback for the dispatch path.

Confirm no rename shim or migration path exists before removing the tool handler entirely.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/ToolRegistry.swift`
around lines 34 - 44, Removing "type_text_chars" breaks replay because
ReplayTrajectoryTool invokes registry.call(parsed.toolName) and
MCPError.invalidParams is raised when no handler exists; either restore a
handler/entry or add a shim/alias mapping so old recordings are translated to
the new tool name. Update ToolRegistry.actionToolNames (or the default registry
setup where handlers are registered) to include a shim for "type_text_chars"
that forwards to the new "type_text" implementation (or map "type_text_chars" →
"type_text" inside the registry.call resolution path), and keep
TrajectoryLoader.isClickOrTypeClassTool unchanged for rendering; ensure the shim
uses the same parameter contract as "type_text" so replaying legacy trajectories
succeeds without altering stored recordings.
🧹 Nitpick comments (6)
plugins/cua/vendor/cua-driver/source/scripts/install.sh (1)

309-309: ⚡ Quick win

Clarify "tool name cue" phrasing.

The phrase "tool name cue" on line 309 is ambiguous. Consider rephrasing for clarity:

  • If you mean "tool name hint/signal": "does not expose the tool name hint for mcp__cua-computer-use__screenshot"
  • If you mean the tool name itself serves as a cue: "does not expose the mcp__cua-computer-use__screenshot tool name as a usage indicator"
📝 Suggested clarification
-          Use MCP for this path; CLI screenshots do not expose the
-          mcp__cua-computer-use__screenshot tool name cue.
+          Use MCP for this path; CLI screenshots do not expose the
+          mcp__cua-computer-use__screenshot tool name hint.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/cua/vendor/cua-driver/source/scripts/install.sh` at line 309, The
phrase "tool name cue" is ambiguous in the string mentioning
mcp__cua-computer-use__screenshot; update the wording to be explicit whether it
refers to a hint/signal or the actual tool name used as an indicator. Replace
the ambiguous phrase with one of the suggested alternatives (for example, "does
not expose the tool name hint for mcp__cua-computer-use__screenshot" or "does
not expose the mcp__cua-computer-use__screenshot tool name as a usage
indicator") wherever the token mcp__cua-computer-use__screenshot appears so
callers/readers clearly understand the intended meaning.
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/ConfigCommand.swift (1)

182-184: 💤 Low value

ConfigSetCommand doc comment omits the two new settable keys

The doc block still only enumerates agent_cursor.* paths. capture_mode and max_image_dimension should be listed so the inline documentation stays in sync with the actual keyset.

📝 Proposed update
 ///
 /// Accepts dotted snake_case leaf paths:
+///   - `capture_mode` (string enum, e.g. `vision`)
+///   - `max_image_dimension` (integer)
 ///   - `agent_cursor.enabled` (bool)
 ///   - `agent_cursor.motion.{start_handle|end_handle|arc_size|arc_flow|spring}` (number)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/ConfigCommand.swift`
around lines 182 - 184, Update the doc comment for ConfigSetCommand to include
the two newly supported keys: add `capture_mode` and `max_image_dimension` to
the enumerated dotted leaf paths (alongside the existing `agent_cursor.*`
entries) so the inline documentation matches the actual settable keyset; modify
the comment block immediately above the ConfigSetCommand declaration (where the
`agent_cursor.*` examples appear) to list `capture_mode` (string/enum) and
`max_image_dimension` (number) with the same dotted-snake style used for other
keys.
plugins/cua/vendor/cua-driver/source/Tests/integration/harness/monitor.py (1)

132-141: ⚡ Quick win

Consider logging swallowed exceptions in _tick.

The blanket except Exception: pass (Ruff S110/BLE001) is a reasonable design choice — the poller must never crash — but silently dropping every PyObjC error makes flaky test failures hard to triage. A throttled stderr write (or first-time-only log) preserves the no-crash invariant while surfacing real issues.

♻️ Proposed fix
     def _tick(self) -> None:
         try:
             if self.check_cursor:
                 self._check_cursor()
             if self.check_frontmost and self.sentinel_pid is not None:
                 self._check_frontmost()
             if self.check_overlay_z:
                 self._check_overlay_z()
-        except Exception:
-            pass  # never crash the monitor thread
+        except Exception as exc:  # noqa: BLE001 — never crash the monitor thread
+            if not getattr(self, "_warned_tick_error", False):
+                import sys
+                print(f"[ux-monitor] tick error suppressed: {exc!r}", file=sys.stderr)
+                self._warned_tick_error = True
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/cua/vendor/cua-driver/source/Tests/integration/harness/monitor.py`
around lines 132 - 141, The blanket except in _tick swallows all exceptions;
change it to catch Exception as e and log the exception details (including
traceback via traceback.format_exc()) without crashing the monitor thread;
implement a throttled or first-time-only emission (e.g. add a boolean flag like
self._logged_tick_exception or a timestamp check on the instance) so the
exception is printed to stderr or processLogger only once or at a controlled
rate while keeping the existing calls to _check_cursor, _check_frontmost and
_check_overlay_z intact.
plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/TypeTextTool.swift (2)

134-154: 💤 Low value

Silent-accept detection: edge case with appended writes.

The check axValue.isEmpty after a non-empty write reliably catches Chromium's silent-accept (DOM never updates from empty → empty). However, when the field already contains content and AXSelectedText only writes into a selection range, AXValue will be non-empty regardless of whether the new characters actually landed — the heuristic returns success. This is likely acceptable in practice since AXSelectedText writes typically replace the selection on apps that honor it; flagging in case you'd like a stricter check (e.g., compare pre/post AXValue or substring-match the inserted text).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/TypeTextTool.swift`
around lines 134 - 154, The silent-accept heuristic currently treats a non-empty
post-write AXValue as success, which misses cases where AXSelectedText
appended/inserted into an existing value but the new characters weren't actually
applied; update the logic in TypeTextTool (around
AXInput.stringAttribute("AXValue", of: element) and the block that returns
CallTool.Result) to perform a stricter verification: capture the pre-write
AXValue, then after the write compare pre/post values to ensure the inserted
substring or change occurred (e.g., check that the new value contains the
written text at the expected position or that the length increased
appropriately); if the comparison fails, treat it as a silent-accept and fall
through to the CGEvent fallback path.

1-225: 💤 Low value

Add a comment to TrajectoryLoader.swift explaining why type_text_chars is retained.

The retention of type_text_chars in the isClickOrTypeClassTool function (line 273) is intentional for backward compatibility with pre-v0.1.4 recorded trajectories. Since it no longer appears elsewhere in the active codebase and was removed from upstream, a brief comment such as:

case "click", "double_click", "right_click", "type_text", "type_text_chars": // type_text_chars kept for pre-v0.1.4 trajectory replay

would clarify the intent and prevent accidental removal during future maintenance.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/TypeTextTool.swift`
around lines 1 - 225, Add a brief inline comment in TrajectoryLoader.swift at
the isClickOrTypeClassTool function where the switch/case includes
"type_text_chars" to explain why it's retained for backward compatibility with
pre-v0.1.4 recorded trajectories; reference the case line that lists "click",
"double_click", "right_click", "type_text", "type_text_chars" and add a note
like "type_text_chars kept for pre-v0.1.4 trajectory replay" so future
maintainers won't remove it.
plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_launch_safari_bg.py (1)

57-60: 💤 Low value

Consider also skipping when the hardcoded _HERMES_BIN path is missing.

The test currently skips on missing ANTHROPIC_API_KEY but not on missing hermes binary. If ~/.hermes/hermes-agent/venv/bin/hermes doesn't exist on the runner, _run_hermes raises FileNotFoundError instead of a clean skip. Since this is vendored upstream, only worth a downstream patch if you run this in CI.

Also applies to: 127-138

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_launch_safari_bg.py`
around lines 57 - 60, The test should also skip when the hardcoded hermes binary
path is missing to avoid FileNotFoundError: add a check for
os.path.exists(_HERMES_BIN) and call pytest.skip with a clear message before
calling _run_hermes (or at the test module/setup start), similar to the existing
skip for missing ANTHROPIC_API_KEY; update the places referencing _HERMES_BIN
and the test that invokes _run_hermes (also at the later block around the same
logic at lines corresponding to 127-138) so tests cleanly skip when the binary
is not present.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/cua/vendor/cua-driver/source/scripts/uninstall.sh`:
- Around line 111-205: The Python heredoc can exit non‑zero on write failures
(e.g., during shutil.copy2/os.replace) which makes the shell's PY_OUTPUT
assignment fail under set -e and abort the uninstaller; modify the shell wrapper
so the python3 call never returns a non‑zero exit (e.g., catch exceptions inside
the heredoc around the backup/write/replace sequence and ensure it exits 0 even
on failure, or alternately run the python heredoc in a subshell and append ||
true to the PY_OUTPUT assignment) so that failures in shutil.copy2,
tempfile.mkstemp, or os.replace do not propagate as a non‑zero exit from python3
and the script continues to the CLI fallback and final messages.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/ConfigCommand.swift`:
- Around line 154-157: Update the doc comment for ConfigSetCommand to include
the two new config leaf keys: add `capture_mode` (string, allowed values
"vision", "ax", or "som") and `max_image_dimension` (non-negative integer)
alongside the existing `agent_cursor.*` entries so the command documentation
matches applyConfigKey in SetConfigTool.swift; ensure both keys are documented
with their types and allowed/validation constraints in the ConfigSetCommand
comment block.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/CuaDriverCommand.swift`:
- Line 252: The telemetryEntryEvent(for:) switch currently falls through to
guiLaunch for the new "dump-docs" subcommand, misclassifying its usage; add a
dedicated case for the "dump-docs" command in the telemetryEntryEvent(for:)
function (the same switch that handles other subcommands) and return the
appropriate telemetry event (e.g., a dumpDocs or equivalent enum/case) instead
of letting it hit guiLaunch so telemetry for dump-docs is recorded correctly.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/Docs/CLIDocExtractor.swift`:
- Around line 53-86: Update the CLIDocExtractor documentation and add a guard
test: change the enum docstring to state that CLIDocExtractor is a manually
maintained mirror of the CLI (not derived from ArgumentParser) and must be kept
in sync with real commands; then add a unit/CI test that iterates over each
ParsableCommand (e.g., CallCommand, MCPCommand, ServeCommand) and compares their
declared `@Flag/`@Option/@Argument metadata against the corresponding static
CommandDoc entries (mcpDoc, callDoc, listToolsDoc, etc.), failing the build if
any flag/option/argument names or defaults differ so drift is detected early.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/FocusWithoutRaise.swift`:
- Around line 183-201: The code currently computes activated but still calls
action() even when activation failed; change the flow so that after computing
activated (from targetPSN.withUnsafeBytes →
SkyLightEventPost.setFrontProcessNoWindows) you return early if activated is
false (e.g., if !activated { return false }) so the unauthenticated key is not
posted; keep the existing defer that restores the prior frontmost (using prevOk
and prevPSN with SkyLightEventPost.setFrontProcessNoWindows) so cleanup still
runs synchronously, then only call try action() when activated is true and
finally return activated.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/SkyLightEventPost.swift`:
- Around line 434-465: The fallback branch of getProcessPSN(forWindowId:into:)
is unimplemented even though isFocusWithoutRaiseAvailable may report
GetProcessForPID as present; implement a real windowId -> pid -> PSN path using
the existing getProcessForPIDFn: when getWindowOwnerFn/getConnectionPSNFn path
is unavailable but getProcessForPIDFn is non-nil, call
CGWindowListCopyWindowInfo (or CGWindowListCreateDescriptionFromArray) for the
single windowId, extract kCGWindowOwnerPID (as an Int32/Int), pass that pid into
getProcessForPIDFn to populate psnBuffer and return true on success (return
false on failures or missing pid); keep this logic inside getProcessPSN and
reference getProcessForPIDFn and getProcessPSN to locate the change.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/HotkeyTool.swift`:
- Around line 102-119: The withMenuShortcutActivation call currently ignores its
Bool result, causing failures in the window_id activation path to be treated as
success; change the invocation of FocusWithoutRaise.withMenuShortcutActivation
(the closure that calls KeyboardInput.hotkey) to capture its Bool return value
and, when it returns false, propagate a failure by throwing an appropriate Error
(or returning a failure) instead of proceeding to treat the hotkey as
successful, so that stale/unresolvable window_id or missing SPI support yields
an error rather than a "✅ Pressed ..." success.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/PressKeyTool.swift`:
- Around line 160-174: The NSMenu activation path currently ignores the Bool
returned by FocusWithoutRaise.withMenuShortcutActivation so failures still
appear as success; change the code around rawWindowId/withMenuShortcutActivation
to capture the Bool result (e.g. let activated = try
FocusWithoutRaise.withMenuShortcutActivation(...){ ... }) and if activated is
false then propagate failure (throw a descriptive error or return a failure
result) instead of falling through to the success path; ensure
KeyboardInput.press is only treated as successful when
withMenuShortcutActivation returned true (use the same function names
rawWindowId, FocusWithoutRaise.withMenuShortcutActivation, and
KeyboardInput.press to locate where to adjust control flow).

In `@plugins/cua/vendor/cua-driver/source/Tests/integration/conftest.py`:
- Around line 88-101: The code that starts the FocusMonitor uses
subprocess.Popen and then blocks on proc.stdout.readline() inside the for-loop,
so the loop can hang and never reach the RuntimeError; change the read logic in
the startup loop that looks for "FOCUS_PID=" to use a
non-blocking/read-with-timeout approach (e.g., select.select or poll on
proc.stdout.fileno()) and enforce the overall deadline (40*0.1s) per iteration
so the loop can break on timeout, terminate proc with proc.terminate() and then
raise the RuntimeError; update the block around subprocess.Popen/for loop that
reads proc.stdout to use the non-blocking check and explicit timeout enforcement
for FOCUS_PID.
- Around line 256-259: Wrap the launchctl setenv call in a try/finally around
the fixture's yield so CUA_LOAD_URL is always removed on teardown: call
subprocess.run(["launchctl", "setenv", "CUA_LOAD_URL", test_page_url],
check=False) before entering try, then yield to the rest of the fixture, and in
the finally block run subprocess.run(["launchctl", "unsetenv", "CUA_LOAD_URL"],
check=False) to ensure the env var is cleared even if setup fails before yield;
update the fixture in conftest.py (the Electron/GU I fixture containing the
yield) accordingly.

In `@plugins/cua/vendor/cua-driver/source/Tests/integration/harness/monitor.py`:
- Around line 93-100: The monitor's frontmost-check gate (self.check_frontmost)
is being disabled in UXMonitor._run/stop and never re-enabled, so reuse of the
same UXMonitor instance leaves frontmost checking permanently off; fix by
reinitializing the gate in UXMonitor.start() (set self.check_frontmost = True)
so start() re-arms frontmost checking before starting the thread, or
alternatively update the comment near where check_frontmost is set to False to
state that it only resets on new instance construction; reference
UXMonitor.start(), UXMonitor.stop()/_run, and the attribute check_frontmost when
making the change.

In `@plugins/cua/vendor/cua-driver/source/Tests/integration/test_blender.py`:
- Around line 187-195: The test is currently overwriting the golden
toolbar_ref.png before diffing; change the logic so you only call
save_reference(ref_crop, ref_path) when the reference file does not already
exist (use os.path.exists(ref_path)); ensure the directory _ASSETS_BLENDER is
created with os.makedirs(_ASSETS_BLENDER, exist_ok=True) before attempting to
save on the first run, otherwise call load_reference(ref_path) and compute ratio
= diff_ratio(ref_crop, loaded) and assert ratio < 0.001 as before; keep
references to ref_path, ref_crop, save_reference, load_reference, diff_ratio and
toolbar_ref.png to locate the code to modify.

In
`@plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_chrome_form_fill.py`:
- Around line 345-358: The _run_hermes function currently ignores subprocess
return codes; update it to check result.returncode after subprocess.run and
treat any non-zero exit as a failure by raising an exception (e.g.,
RuntimeError) or calling pytest.fail with a message that includes
result.returncode, result.stdout, and result.stderr; keep the combined output
for debugging but do not return it when returncode != 0. Locate the call in
_run_hermes and add the conditional check using the existing result variable to
surface errors before returning the output.

In
`@plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_form_fill_qwen.py`:
- Around line 113-123: The setUpClass preflight checks the Qwen endpoint but
doesn't verify the Hermes binary before _run_hermes is invoked, causing a
FileNotFoundError instead of skipping; update setUpClass to check for the
existence/executability of _HERMES_BIN (or use shutil.which on _HERMES_BIN) and
call unittest.SkipTest with a clear message if it's missing, ensuring the tests
are skipped cleanly; reference the setUpClass method, _HERMES_BIN symbol,
_run_hermes helper, and use unittest.SkipTest to perform the skip.

---

Outside diff comments:
In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/ToolRegistry.swift`:
- Around line 34-44: Removing "type_text_chars" breaks replay because
ReplayTrajectoryTool invokes registry.call(parsed.toolName) and
MCPError.invalidParams is raised when no handler exists; either restore a
handler/entry or add a shim/alias mapping so old recordings are translated to
the new tool name. Update ToolRegistry.actionToolNames (or the default registry
setup where handlers are registered) to include a shim for "type_text_chars"
that forwards to the new "type_text" implementation (or map "type_text_chars" →
"type_text" inside the registry.call resolution path), and keep
TrajectoryLoader.isClickOrTypeClassTool unchanged for rendering; ensure the shim
uses the same parameter contract as "type_text" so replaying legacy trajectories
succeeds without altering stored recordings.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/LaunchAppTool.swift`:
- Around line 10-27: The tool description in LaunchAppTool.swift contradicts
AppLauncher.launch behavior (which now calls unhide()) by saying the app is
launched "hidden" and its window "is not drawn on screen"; update the docstring
in LaunchAppTool (the multi-line description at the top of the tool) to reflect
the current behavior — e.g., remove "hidden" / "not drawn on screen" and state
that the app is launched in the background with its window unhidden but it does
not steal focus — or alternatively file an upstream report and add a brief
compatibility note; ensure the change references the LaunchAppTool description
and AppLauncher.launch semantics so LLM clients see consistent behavior.

---

Nitpick comments:
In `@plugins/cua/vendor/cua-driver/source/scripts/install.sh`:
- Line 309: The phrase "tool name cue" is ambiguous in the string mentioning
mcp__cua-computer-use__screenshot; update the wording to be explicit whether it
refers to a hint/signal or the actual tool name used as an indicator. Replace
the ambiguous phrase with one of the suggested alternatives (for example, "does
not expose the tool name hint for mcp__cua-computer-use__screenshot" or "does
not expose the mcp__cua-computer-use__screenshot tool name as a usage
indicator") wherever the token mcp__cua-computer-use__screenshot appears so
callers/readers clearly understand the intended meaning.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/ConfigCommand.swift`:
- Around line 182-184: Update the doc comment for ConfigSetCommand to include
the two newly supported keys: add `capture_mode` and `max_image_dimension` to
the enumerated dotted leaf paths (alongside the existing `agent_cursor.*`
entries) so the inline documentation matches the actual settable keyset; modify
the comment block immediately above the ConfigSetCommand declaration (where the
`agent_cursor.*` examples appear) to list `capture_mode` (string/enum) and
`max_image_dimension` (number) with the same dotted-snake style used for other
keys.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/TypeTextTool.swift`:
- Around line 134-154: The silent-accept heuristic currently treats a non-empty
post-write AXValue as success, which misses cases where AXSelectedText
appended/inserted into an existing value but the new characters weren't actually
applied; update the logic in TypeTextTool (around
AXInput.stringAttribute("AXValue", of: element) and the block that returns
CallTool.Result) to perform a stricter verification: capture the pre-write
AXValue, then after the write compare pre/post values to ensure the inserted
substring or change occurred (e.g., check that the new value contains the
written text at the expected position or that the length increased
appropriately); if the comparison fails, treat it as a silent-accept and fall
through to the CGEvent fallback path.
- Around line 1-225: Add a brief inline comment in TrajectoryLoader.swift at the
isClickOrTypeClassTool function where the switch/case includes "type_text_chars"
to explain why it's retained for backward compatibility with pre-v0.1.4 recorded
trajectories; reference the case line that lists "click", "double_click",
"right_click", "type_text", "type_text_chars" and add a note like
"type_text_chars kept for pre-v0.1.4 trajectory replay" so future maintainers
won't remove it.

In `@plugins/cua/vendor/cua-driver/source/Tests/integration/harness/monitor.py`:
- Around line 132-141: The blanket except in _tick swallows all exceptions;
change it to catch Exception as e and log the exception details (including
traceback via traceback.format_exc()) without crashing the monitor thread;
implement a throttled or first-time-only emission (e.g. add a boolean flag like
self._logged_tick_exception or a timestamp check on the instance) so the
exception is printed to stderr or processLogger only once or at a controlled
rate while keeping the existing calls to _check_cursor, _check_frontmost and
_check_overlay_z intact.

In
`@plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_launch_safari_bg.py`:
- Around line 57-60: The test should also skip when the hardcoded hermes binary
path is missing to avoid FileNotFoundError: add a check for
os.path.exists(_HERMES_BIN) and call pytest.skip with a clear message before
calling _run_hermes (or at the test module/setup start), similar to the existing
skip for missing ANTHROPIC_API_KEY; update the places referencing _HERMES_BIN
and the test that invokes _run_hermes (also at the later block around the same
logic at lines corresponding to 127-138) so tests cleanly skip when the binary
is not present.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5917169b-3b24-458a-9552-63d47a91ad1a

📥 Commits

Reviewing files that changed from the base of the PR and between 6628e77 and 5f35504.

📒 Files selected for processing (69)
  • docs/issues/cua-driver-v014-sync/plan.md
  • docs/issues/cua-driver-v014-sync/spec.md
  • docs/issues/cua-driver-v014-sync/tasks.md
  • plugins/cua/plugin.json
  • plugins/cua/policies/tool-policy.json
  • plugins/cua/skills/cua-driver/SKILL.md
  • plugins/cua/vendor/cua-driver/source/.bumpversion.cfg
  • plugins/cua/vendor/cua-driver/source/README.md
  • plugins/cua/vendor/cua-driver/source/Skills/cua-driver/RECORDING.md
  • plugins/cua/vendor/cua-driver/source/Skills/cua-driver/SKILL.md
  • plugins/cua/vendor/cua-driver/source/Skills/cua-driver/TESTS.md
  • plugins/cua/vendor/cua-driver/source/Skills/cua-driver/WEB_APPS.md
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/CallCommand.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/ConfigCommand.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/CuaDriverCommand.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/Docs/CLIDocExtractor.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/DumpDocsCommand.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/AppState/AppState.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Apps/AppLauncher.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Capture/WindowCapture.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/CuaDriverCore.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Focus/FocusGuard.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/FocusWithoutRaise.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/KeyboardInput.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/MouseInput.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/SkyLightEventPost.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/ClaudeCodeComputerUseCompatTools.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/CuaDriverMCPServer.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Docs/MCPDocExtractor.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/ToolRegistry.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/ClickTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/GetAccessibilityTreeTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/GetWindowStateTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/HotkeyTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/LaunchAppTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/PressKeyTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/ScreenshotTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/SetRecordingTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/SetValueTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/TypeTextCharsTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/TypeTextTool.swift
  • plugins/cua/vendor/cua-driver/source/Tests/integration/assets/test_page.html
  • plugins/cua/vendor/cua-driver/source/Tests/integration/conftest.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/driver_client.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/fixtures/form_all_inputs.html
  • plugins/cua/vendor/cua-driver/source/Tests/integration/harness/__init__.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/harness/cv.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/harness/driver.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/harness/monitor.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/harness/tree.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_background_focus.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_background_menu_shortcut.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_blender.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_blender_background.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_chrome.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_chrome_minimized_nav.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_double_click_delivery.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_electron.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_chrome_form_fill.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_form_fill_qwen.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_launch_safari_bg.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_launch_app_visible.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_pixel_click_delivery.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_safari.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_tauri.py
  • plugins/cua/vendor/cua-driver/source/scripts/install-local.sh
  • plugins/cua/vendor/cua-driver/source/scripts/install.sh
  • plugins/cua/vendor/cua-driver/source/scripts/uninstall.sh
  • plugins/cua/vendor/cua-driver/upstream.json
💤 Files with no reviewable changes (7)
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_double_click_delivery.py
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_blender_background.py
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/TypeTextCharsTool.swift
  • plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/GetAccessibilityTreeTool.swift
  • plugins/cua/vendor/cua-driver/source/Tests/integration/test_pixel_click_delivery.py
  • plugins/cua/policies/tool-policy.json
  • plugins/cua/plugin.json

Comment on lines +111 to +205
if [[ -f "$CLAUDE_JSON" ]] && command -v python3 >/dev/null 2>&1; then
PY_OUTPUT="$(
CLAUDE_JSON="$CLAUDE_JSON" python3 <<'PY'
import json
import os
import shutil
import sys
import tempfile
import time

path = os.environ["CLAUDE_JSON"]

try:
with open(path, "r", encoding="utf-8") as f:
data = json.load(f)
except Exception as exc:
print(f"could not read Claude config {path}: {exc}", file=sys.stderr)
raise SystemExit(0)

removed = []

def text_parts(value):
if isinstance(value, str):
return [value]
if isinstance(value, list):
return [item for item in value if isinstance(item, str)]
return []

def invokes_cua_driver(server):
if not isinstance(server, dict):
return False
parts = []
parts.extend(text_parts(server.get("command")))
parts.extend(text_parts(server.get("args")))
joined = " ".join(parts)
return "cua-driver" in joined or "CuaDriver.app" in joined

def should_remove(name, server):
return name in {"cua-driver", "cua-computer-use"} or invokes_cua_driver(server)

def scrub_servers(servers, scope):
if not isinstance(servers, dict):
return
for name in list(servers.keys()):
if should_remove(name, servers[name]):
del servers[name]
removed.append(f"{scope}:{name}")

scrub_servers(data.get("mcpServers"), "user")

projects = data.get("projects")
if isinstance(projects, dict):
for project in projects.values():
if isinstance(project, dict):
scrub_servers(project.get("mcpServers"), "project")

if not removed:
raise SystemExit(0)

backup = f"{path}.bak-cua-driver-uninstall-{int(time.time())}"
shutil.copy2(path, backup)

directory = os.path.dirname(path) or "."
fd, tmp_path = tempfile.mkstemp(
prefix=".claude.json.",
suffix=".tmp",
dir=directory,
text=True,
)
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2, ensure_ascii=False)
f.write("\n")
os.replace(tmp_path, path)
except Exception:
try:
os.unlink(tmp_path)
except OSError:
pass
raise

print(f"removed Claude MCP registration(s): {', '.join(removed)}")
print(f"backed up Claude config to {backup}")
PY
)"
if [[ -n "$PY_OUTPUT" ]]; then
while IFS= read -r line; do
log "$line"
done <<< "$PY_OUTPUT"
else
log "no Claude MCP registrations for cua-driver found in $CLAUDE_JSON"
fi
else
log "no Claude config cleanup via python3 (missing $CLAUDE_JSON or python3)"
fi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm `set -e` is active in this script and that PY_OUTPUT assignment is the only guard.
rg -nP -C2 '^\s*set\s+-' plugins/cua/vendor/cua-driver/source/scripts/uninstall.sh
rg -nP -C2 'PY_OUTPUT=' plugins/cua/vendor/cua-driver/source/scripts/uninstall.sh

Repository: ThinkInAIXYZ/deepchat

Length of output: 476


🏁 Script executed:

# Check the exact structure of the Python error handling and the fallback mechanism
sed -n '111,220p' plugins/cua/vendor/cua-driver/source/scripts/uninstall.sh | cat -n

Repository: ThinkInAIXYZ/deepchat

Length of output: 3989


Python write failures can abort the uninstaller under set -e.

With set -euo pipefail (line 14), PY_OUTPUT="$(... python3 ...)" propagates a non-zero exit from the Python heredoc. The script handles read errors via raise SystemExit(0) (line 128), but the write path on lines 185–190 re-raises the original exception on any failure (e.g., disk full, permissions on $HOME, ENOSPC during os.replace). That bubbles out as a non-zero python3 exit, which under set -e can terminate the uninstaller before the CLI fallback (lines 207–220) and the final user-facing message run — leaving a partial uninstall.

Since this whole block is intentionally best-effort (note the || true-style fallthroughs elsewhere and the comment at lines 207–209 calling the CLI cleanup "best-effort"), it would be safer to swallow non-zero exits from the Python step at the shell boundary, e.g.:

🛡️ Suggested guard at the shell boundary
-    PY_OUTPUT="$(
-        CLAUDE_JSON="$CLAUDE_JSON" python3 <<'PY'
+    PY_OUTPUT="$(
+        CLAUDE_JSON="$CLAUDE_JSON" python3 <<'PY' || true
@@
 PY
-    )"
+    )"

Alternatively, wrap the write block in Python with a try/except that prints the failure via log-equivalent stderr and exits 0, so the uninstaller continues to the CLI fallback.

Note this is vendored upstream code, so the fix likely belongs upstream in trycua/cua — worth flagging there as part of the v0.1.4 sync.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/cua/vendor/cua-driver/source/scripts/uninstall.sh` around lines 111 -
205, The Python heredoc can exit non‑zero on write failures (e.g., during
shutil.copy2/os.replace) which makes the shell's PY_OUTPUT assignment fail under
set -e and abort the uninstaller; modify the shell wrapper so the python3 call
never returns a non‑zero exit (e.g., catch exceptions inside the heredoc around
the backup/write/replace sequence and ensure it exits 0 even on failure, or
alternately run the python heredoc in a subshell and append || true to the
PY_OUTPUT assignment) so that failures in shutil.copy2, tempfile.mkstemp, or
os.replace do not propagate as a non‑zero exit from python3 and the script
continues to the CLI fallback and final messages.

Comment on lines +154 to +157
case "capture_mode":
print(config.captureMode.rawValue)
case "max_image_dimension":
print(config.maxImageDimension)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check that applyConfigKey dispatches capture_mode and max_image_dimension.
rg -n "applyConfigKey" --type swift -A 40 | grep -E "(capture_mode|max_image_dimension|func applyConfigKey)"

Repository: ThinkInAIXYZ/deepchat

Length of output: 1899


🏁 Script executed:

find . -name "*.swift" -type f | xargs grep -l "applyConfigKey" | head -20

Repository: ThinkInAIXYZ/deepchat

Length of output: 234


🏁 Script executed:

rg "func applyConfigKey" --type swift -B 2 -A 30

Repository: ThinkInAIXYZ/deepchat

Length of output: 4262


🏁 Script executed:

rg "func applyConfigKey" --type swift -A 100 plugins/cua/vendor/cua-driver/source/Sources/CuaDriverServer/Tools/SetConfigTool.swift

Repository: ThinkInAIXYZ/deepchat

Length of output: 4129


🏁 Script executed:

sed -n '180,190p' plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/ConfigCommand.swift

Repository: ThinkInAIXYZ/deepchat

Length of output: 503


Update doc comment on ConfigSetCommand to document capture_mode and max_image_dimension

The implementation correctly handles both new keys: applyConfigKey in SetConfigTool.swift validates and applies capture_mode and max_image_dimension without issue. However, the doc comment on ConfigSetCommand (lines 183–185) only lists agent_cursor.* leaf paths and omits the two new keys. Update it to include:

  • capture_mode (string: vision, ax, or som)
  • max_image_dimension (non-negative integer)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/ConfigCommand.swift`
around lines 154 - 157, Update the doc comment for ConfigSetCommand to include
the two new config leaf keys: add `capture_mode` (string, allowed values
"vision", "ax", or "som") and `max_image_dimension` (non-negative integer)
alongside the existing `agent_cursor.*` entries so the command documentation
matches applyConfigKey in SetConfigTool.swift; ensure both keys are documented
with their types and allowed/validation constraints in the ConfigSetCommand
comment block.

"update",
"diagnose",
"doctor",
"dump-docs",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add dump-docs to telemetryEntryEvent(for:) too.

Registering the new subcommand here is only half the change: telemetryEntryEvent(for:) still falls through to guiLaunch for dump-docs, so usage for this command will be misclassified until the switch gets a dedicated case.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/CuaDriverCommand.swift`
at line 252, The telemetryEntryEvent(for:) switch currently falls through to
guiLaunch for the new "dump-docs" subcommand, misclassifying its usage; add a
dedicated case for the "dump-docs" command in the telemetryEntryEvent(for:)
function (the same switch that handles other subcommands) and return the
appropriate telemetry event (e.g., a dumpDocs or equivalent enum/case) instead
of letting it hit guiLaunch so telemetry for dump-docs is recorded correctly.

Comment on lines +53 to +86
/// Extracts CLI documentation from command definitions.
/// This uses a combination of ArgumentParser's configuration metadata
/// and manually maintained help text to provide accurate documentation.
enum CLIDocExtractor {
/// Extract documentation from all registered commands
static func extractAll() -> CLIDocumentation {
return CLIDocumentation(
name: "cua-driver",
version: CuaDriverCore.version,
abstract: "macOS Accessibility-driven computer-use agent — MCP stdio server.",
commands: allCommandDocs
)
}

// MARK: - Command Documentation Definitions

private static var allCommandDocs: [CommandDoc] {
return [
mcpDoc,
callDoc,
listToolsDoc,
describeDoc,
serveDoc,
stopDoc,
statusDoc,
recordingDoc,
configDoc,
mcpConfigDoc,
updateDoc,
diagnoseDoc,
doctorDoc,
dumpDocsDoc,
]
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm the actual CLI commands referenced here (mcp, call, list-tools, …) exist
# and surface their declared flags so we can spot near-term drift.
fd -e swift '.*Command\.swift' Sources/CuaDriverCLI 2>/dev/null | head -50
ast-grep --pattern $'struct $_: ParsableCommand {
  $$$
  static let configuration = CommandConfiguration($$$)
  $$$
}'
rg -nP --type=swift '@(Flag|Option|Argument)\b' -g '**/CuaDriverCLI/**' -C1 | head -120

Repository: ThinkInAIXYZ/deepchat

Length of output: 11056


🏁 Script executed:

# First, get the full CLIDocExtractor.swift file to see if there's any ArgumentParser introspection
wc -l plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/Docs/CLIDocExtractor.swift

Repository: ThinkInAIXYZ/deepchat

Length of output: 154


🏁 Script executed:

# Read the complete CLIDocExtractor.swift file
cat plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/Docs/CLIDocExtractor.swift

Repository: ThinkInAIXYZ/deepchat

Length of output: 20454


🏁 Script executed:

# Read CallCommand.swift to see actual flags/options and compare with what CLIDocExtractor documents
cat plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/CallCommand.swift | head -150

Repository: ThinkInAIXYZ/deepchat

Length of output: 6310


Documentation will drift from the real CLI definitions; the docstring is factually incorrect.

The class docstring claims this enum uses "a combination of ArgumentParser's configuration metadata and manually maintained help text," but the implementation is entirely hand-maintained — there is no introspection of ParsableCommand types, CommandConfiguration, or ArgumentSet. All 14 commands and their flags, options, and arguments are hardcoded as static computed properties.

While the current documentation matches the actual CLI (verified against CallCommand), there is no mechanism to catch drift. When a real command gains a flag, gets renamed, or changes a default, this file silently goes stale and dump-docs output diverges from --help.

Two options, in increasing order of effort:

  1. Update the docstring to "Manually maintained mirror of the CLI definitions; keep in sync when commands change," and add a test or CI check that compares each command's declared @Flag/@Option/@Argument against the entries here. Low effort, catches drift fast.
  2. Drive CommandDoc from ArgumentParser — walk ParsableCommand.configuration and Arguments via reflection to synthesize entries, with manual overrides only for fields ArgumentParser can't model (like discussion text).

For this PR, option 1 is strongly recommended so the docstring matches reality.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCLI/Docs/CLIDocExtractor.swift`
around lines 53 - 86, Update the CLIDocExtractor documentation and add a guard
test: change the enum docstring to state that CLIDocExtractor is a manually
maintained mirror of the CLI (not derived from ArgumentParser) and must be kept
in sync with real commands; then add a unit/CI test that iterates over each
ParsableCommand (e.g., CallCommand, MCPCommand, ServeCommand) and compares their
declared `@Flag/`@Option/@Argument metadata against the corresponding static
CommandDoc entries (mcpDoc, callDoc, listToolsDoc, etc.), failing the build if
any flag/option/argument names or defaults differ so drift is detected early.

Comment on lines +183 to +201
// Make target WindowServer-frontmost (without raising windows).
let activated = targetPSN.withUnsafeBytes { psnRaw in
SkyLightEventPost.setFrontProcessNoWindows(
psn: psnRaw.baseAddress!, windowID: UInt32(targetWid))
}

// Run the action (key post) then restore prior frontmost, regardless of
// whether the action throws — defer guarantees synchronous restoration.
defer {
if prevOk {
_ = prevPSN.withUnsafeBytes { psnRaw in
SkyLightEventPost.setFrontProcessNoWindows(
psn: psnRaw.baseAddress!, windowID: 0)
}
}
}

try action()
return activated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't post the key when frontmost activation failed.

activated is computed here, but action() still runs even when it is false. In the window_id shortcut path that means we can still send the unauthenticated key while the sentinel app remains frontmost, which can misroute the keystroke instead of failing safely.

Suggested fix
         let activated = targetPSN.withUnsafeBytes { psnRaw in
             SkyLightEventPost.setFrontProcessNoWindows(
                 psn: psnRaw.baseAddress!, windowID: UInt32(targetWid))
         }
+        guard activated else { return false }

         // Run the action (key post) then restore prior frontmost, regardless of
         // whether the action throws — defer guarantees synchronous restoration.
         defer {
             if prevOk {
@@
         }

         try action()
-        return activated
+        return true
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Make target WindowServer-frontmost (without raising windows).
let activated = targetPSN.withUnsafeBytes { psnRaw in
SkyLightEventPost.setFrontProcessNoWindows(
psn: psnRaw.baseAddress!, windowID: UInt32(targetWid))
}
// Run the action (key post) then restore prior frontmost, regardless of
// whether the action throws — defer guarantees synchronous restoration.
defer {
if prevOk {
_ = prevPSN.withUnsafeBytes { psnRaw in
SkyLightEventPost.setFrontProcessNoWindows(
psn: psnRaw.baseAddress!, windowID: 0)
}
}
}
try action()
return activated
let activated = targetPSN.withUnsafeBytes { psnRaw in
SkyLightEventPost.setFrontProcessNoWindows(
psn: psnRaw.baseAddress!, windowID: UInt32(targetWid))
}
guard activated else { return false }
// Run the action (key post) then restore prior frontmost, regardless of
// whether the action throws — defer guarantees synchronous restoration.
defer {
if prevOk {
_ = prevPSN.withUnsafeBytes { psnRaw in
SkyLightEventPost.setFrontProcessNoWindows(
psn: psnRaw.baseAddress!, windowID: 0)
}
}
}
try action()
return true
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Sources/CuaDriverCore/Input/FocusWithoutRaise.swift`
around lines 183 - 201, The code currently computes activated but still calls
action() even when activation failed; change the flow so that after computing
activated (from targetPSN.withUnsafeBytes →
SkyLightEventPost.setFrontProcessNoWindows) you return early if activated is
false (e.g., if !activated { return false }) so the unauthenticated key is not
posted; keep the existing defer that restores the prior frontmost (using prevOk
and prevPSN with SkyLightEventPost.setFrontProcessNoWindows) so cleanup still
runs synchronously, then only call try action() when activated is true and
finally return activated.

Comment on lines +256 to +259
subprocess.run(
["launchctl", "setenv", "CUA_LOAD_URL", test_page_url],
check=False,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unset CUA_LOAD_URL during Electron fixture teardown.

launchctl setenv mutates the user's launchd environment, and this fixture never clears it. A stale CUA_LOAD_URL can bleed into later GUI launches, and it will also survive if setup fails before yield. Wrap the launch in try/finally and call launchctl unsetenv CUA_LOAD_URL on teardown.

🧰 Tools
🪛 Ruff (0.15.12)

[error] 256-256: subprocess call: check for execution of untrusted input

(S603)


[error] 257-257: Starting a process with a partial executable path

(S607)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/cua/vendor/cua-driver/source/Tests/integration/conftest.py` around
lines 256 - 259, Wrap the launchctl setenv call in a try/finally around the
fixture's yield so CUA_LOAD_URL is always removed on teardown: call
subprocess.run(["launchctl", "setenv", "CUA_LOAD_URL", test_page_url],
check=False) before entering try, then yield to the rest of the fixture, and in
the finally block run subprocess.run(["launchctl", "unsetenv", "CUA_LOAD_URL"],
check=False) to ensure the env var is cleared even if setup fails before yield;
update the fixture in conftest.py (the Electron/GU I fixture containing the
yield) accordingly.

Comment on lines +93 to +100
def start(self) -> "UXMonitor":
if not _PYOBJC_AVAILABLE:
return self
self._baseline_x, self._baseline_y = self._cursor_pos()
self._stop_event.clear()
self._thread = threading.Thread(target=self._run, daemon=True, name="ux-monitor")
self._thread.start()
return self
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

check_frontmost is permanently disabled after first violation.

Line 188 sets self.check_frontmost = False and the comment promises "pause checking until next start" — but start() (lines 93-100) never re-arms it. If a caller reuses the same UXMonitor instance across stop() / start() cycles, frontmost checking stays off after the first hit. Either re-initialize the gate in start() or update the comment to "until a new instance is constructed".

🛡️ Proposed fix
     def start(self) -> "UXMonitor":
         if not _PYOBJC_AVAILABLE:
             return self
         self._baseline_x, self._baseline_y = self._cursor_pos()
+        # Re-arm gates that may have been latched off by a previous run.
+        self._sentinel_seen = False
+        if self.sentinel_pid is not None:
+            self.check_frontmost = _PYOBJC_AVAILABLE
         self._stop_event.clear()

Also applies to: 187-188

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/cua/vendor/cua-driver/source/Tests/integration/harness/monitor.py`
around lines 93 - 100, The monitor's frontmost-check gate (self.check_frontmost)
is being disabled in UXMonitor._run/stop and never re-enabled, so reuse of the
same UXMonitor instance leaves frontmost checking permanently off; fix by
reinitializing the gate in UXMonitor.start() (set self.check_frontmost = True)
so start() re-arms frontmost checking before starting the thread, or
alternatively update the comment near where check_frontmost is set to False to
state that it only resets on new instance construction; reference
UXMonitor.start(), UXMonitor.stop()/_run, and the attribute check_frontmost when
making the change.

Comment on lines +187 to +195
ref_path = os.path.join(_ASSETS_BLENDER, "toolbar_ref.png")
os.makedirs(_ASSETS_BLENDER, exist_ok=True)
save_reference(ref_crop, ref_path)
assert os.path.exists(ref_path), "Reference image was not saved"

# Load it back and verify it's identical to the crop
loaded = load_reference(ref_path)
ratio = diff_ratio(ref_crop, loaded)
assert ratio < 0.001, f"Reference image roundtrip diff too high: {ratio:.4f}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't overwrite the golden before diffing it.

This rewrites toolbar_ref.png on every run, so the later load_reference() check only proves PNG round-tripping. It can never catch a Blender UI regression after the first run. Only create the reference when it is missing; otherwise compare the current crop against the stored file.

Suggested fix
         ref_path = os.path.join(_ASSETS_BLENDER, "toolbar_ref.png")
         os.makedirs(_ASSETS_BLENDER, exist_ok=True)
-        save_reference(ref_crop, ref_path)
-        assert os.path.exists(ref_path), "Reference image was not saved"
+        if not os.path.exists(ref_path):
+            save_reference(ref_crop, ref_path)
+            pytest.skip(f"Created reference image at {ref_path}; re-run to validate it")
 
-        # Load it back and verify it's identical to the crop
         loaded = load_reference(ref_path)
         ratio = diff_ratio(ref_crop, loaded)
         assert ratio < 0.001, f"Reference image roundtrip diff too high: {ratio:.4f}"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
ref_path = os.path.join(_ASSETS_BLENDER, "toolbar_ref.png")
os.makedirs(_ASSETS_BLENDER, exist_ok=True)
save_reference(ref_crop, ref_path)
assert os.path.exists(ref_path), "Reference image was not saved"
# Load it back and verify it's identical to the crop
loaded = load_reference(ref_path)
ratio = diff_ratio(ref_crop, loaded)
assert ratio < 0.001, f"Reference image roundtrip diff too high: {ratio:.4f}"
ref_path = os.path.join(_ASSETS_BLENDER, "toolbar_ref.png")
os.makedirs(_ASSETS_BLENDER, exist_ok=True)
if not os.path.exists(ref_path):
save_reference(ref_crop, ref_path)
pytest.skip(f"Created reference image at {ref_path}; re-run to validate it")
loaded = load_reference(ref_path)
ratio = diff_ratio(ref_crop, loaded)
assert ratio < 0.001, f"Reference image roundtrip diff too high: {ratio:.4f}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/cua/vendor/cua-driver/source/Tests/integration/test_blender.py`
around lines 187 - 195, The test is currently overwriting the golden
toolbar_ref.png before diffing; change the logic so you only call
save_reference(ref_crop, ref_path) when the reference file does not already
exist (use os.path.exists(ref_path)); ensure the directory _ASSETS_BLENDER is
created with os.makedirs(_ASSETS_BLENDER, exist_ok=True) before attempting to
save on the first run, otherwise call load_reference(ref_path) and compute ratio
= diff_ratio(ref_crop, loaded) and assert ratio < 0.001 as before; keep
references to ref_path, ref_crop, save_reference, load_reference, diff_ratio and
toolbar_ref.png to locate the code to modify.

Comment on lines +345 to +358
result = subprocess.run(
[
_HERMES_BIN, "chat",
"--provider", "anthropic",
"-m", _MODEL,
"--toolsets", "computer_use",
"--yolo",
"-Q",
"--max-turns", str(max_turns),
"-q", prompt,
],
env=env, capture_output=True, text=True, timeout=timeout,
)
return (result.stdout + "\n" + result.stderr).strip()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail the test when Hermes exits non-zero.

Right now _run_hermes() ignores result.returncode, so a run that mutates the field and then errors can still produce a green test. Treat a non-zero exit as a failure before returning the combined output.

Suggested fix
         result = subprocess.run(
             [
                 _HERMES_BIN, "chat",
@@
             ],
             env=env, capture_output=True, text=True, timeout=timeout,
         )
+        if result.returncode != 0:
+            raise AssertionError(
+                f"hermes exited {result.returncode}\n"
+                f"stdout:\n{result.stdout}\n"
+                f"stderr:\n{result.stderr}"
+            )
         return (result.stdout + "\n" + result.stderr).strip()
🧰 Tools
🪛 Ruff (0.15.12)

[error] 345-345: subprocess call: check for execution of untrusted input

(S603)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_chrome_form_fill.py`
around lines 345 - 358, The _run_hermes function currently ignores subprocess
return codes; update it to check result.returncode after subprocess.run and
treat any non-zero exit as a failure by raising an exception (e.g.,
RuntimeError) or calling pytest.fail with a message that includes
result.returncode, result.stdout, and result.stderr; keep the combined output
for debugging but do not return it when returncode != 0. Locate the call in
_run_hermes and add the conditional check using the existing result variable to
surface errors before returning the output.

Comment on lines +113 to +123
def setUpClass(cls) -> None:
# Verify the local server is reachable before spending time on setup.
import urllib.request, urllib.error
try:
urllib.request.urlopen(_ENDPOINT.rstrip("/") + "/models", timeout=3)
except urllib.error.URLError as e:
raise unittest.SkipTest(
f"Local mlx-vlm server not reachable at {_ENDPOINT}: {e}. "
"Start it with: mlx_vlm.server --model mlx-community/Qwen3.6-35B-A3B-4bit"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Skip cleanly when Hermes is not installed.

This preflight checks the Qwen endpoint, but it never verifies _HERMES_BIN exists before _run_hermes() uses it. On a machine without Hermes, the first test fails with FileNotFoundError instead of being skipped, unlike the Chrome Hermes suite in this PR.

Suggested fix
     def setUpClass(cls) -> None:
+        if not os.path.exists(_HERMES_BIN):
+            raise unittest.SkipTest(
+                f"hermes not found at {_HERMES_BIN} — install hermes first"
+            )
         # Verify the local server is reachable before spending time on setup.
         import urllib.request, urllib.error
         try:
             urllib.request.urlopen(_ENDPOINT.rstrip("/") + "/models", timeout=3)
🧰 Tools
🪛 Ruff (0.15.12)

[error] 117-117: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)


[warning] 119-122: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@plugins/cua/vendor/cua-driver/source/Tests/integration/test_hermes_form_fill_qwen.py`
around lines 113 - 123, The setUpClass preflight checks the Qwen endpoint but
doesn't verify the Hermes binary before _run_hermes is invoked, causing a
FileNotFoundError instead of skipping; update setUpClass to check for the
existence/executability of _HERMES_BIN (or use shutil.which on _HERMES_BIN) and
call unittest.SkipTest with a clear message if it's missing, ensuring the tests
are skipped cleanly; reference the setUpClass method, _HERMES_BIN symbol,
_run_hermes helper, and use unittest.SkipTest to perform the skip.

@zerob13 zerob13 merged commit 6a9d1db into dev May 7, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant