[not ready for merge] feat: chain batching, mouse easing, screenshot dedup, network idle #328

Open
petehunt wants to merge 1 commit into garrytan:main from petehunt:feat/browse-chain-batching

petehunt commented Mar 22, 2026

This PR is not ready to be merged yet. This is my first pass at bringing some of BangOnIt's performance enhancements over to gstack. If you like this direction, I can test it more thoroughly and clean up the PR so it is ready to merge.

The big idea is to push agents to favor batched interactions with the browser over single tool invocations, because every round trip to the model can be expensive and that latency quickly adds up. Most tools have therefore been combined into chain. chain now also auto-observes the state of the browser at the end of the batch (after waiting for the network to settle, with a timeout), which saves another round trip.
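A rough sketch of the batching idea, not the code in this PR: run the actions in order, wait for the network to settle with a cap, then append a single observation. The `Action` shape, the `BrowserLike` interface, and the `withCap` and `runChain` names are all hypothetical, and the 1500 ms default simply borrows the 1.5s cap that the summary lists for click.

```typescript
// Hypothetical shapes: the PR description does not show chain's real command format.
type Action = { cmd: string; args: string[] };

interface BrowserLike {
  run(action: Action): Promise<string>; // execute one command, return its output
  networkIdle(): Promise<void>;         // resolves once the network settles
  observe(): Promise<string>;           // snapshot plus page state
}

// Resolve when `p` settles or after `capMs`, whichever comes first.
function withCap(p: Promise<void>, capMs: number): Promise<void> {
  return new Promise<void>((resolve) => {
    const timer = setTimeout(resolve, capMs);
    const done = () => {
      clearTimeout(timer);
      resolve();
    };
    p.then(done, done);
  });
}

async function runChain(
  browser: BrowserLike,
  actions: Action[],
  idleCapMs = 1500, // assumed default, see the note above
): Promise<string[]> {
  const results: string[] = [];
  for (const action of actions) {
    results.push(await browser.run(action)); // strictly sequential
  }
  await withCap(browser.networkIdle(), idleCapMs);
  results.push(await browser.observe()); // one observation for the whole batch
  return results;
}
```

The point of the shape is that the model pays for one round trip per batch instead of one per action plus one more to look at the result.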

I have also implemented change detection for screenshots, which helps a lot when the agent misses its click targets while automating canvas-heavy apps.
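For illustration, the change detection can be as small as remembering the hash of the last capture. This is a minimal sketch assuming the SHA-256 comparison described in the summary; the `ScreenshotDedup` class and its method names are hypothetical, not the PR's actual API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper; the class and method names are illustrative only.
class ScreenshotDedup {
  private lastHash: string | null = null;

  // Returns "unchanged" when the bytes match the previous capture, else "changed".
  check(png: Uint8Array): "changed" | "unchanged" {
    const hash = createHash("sha256").update(png).digest("hex");
    if (hash === this.lastHash) return "unchanged";
    this.lastHash = hash;
    return "changed";
  }

  // Call on navigation so the first capture of a new page is never suppressed.
  reset(): void {
    this.lastHash = null;
  }
}
```

An "unchanged" result right after a click is a cheap signal that the click probably missed its target.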

Finally, the way Playwright automates user interactions like typing and clicking differs substantially from how humans actually use apps. I've brought over the BangOnIt functionality that tracks the current mouse position and animates mouse movement the way a human would.
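As a sketch of what eased movement could look like: plan a series of intermediate points from the tracked position to the target, spaced along a cubic easing curve. The 3-40 step and 30-300ms ranges come from the summary; the choice of ease-in-out (the PR only says "cubic-eased"), the distance-based scaling, and the `planMousePath` name are assumptions.

```typescript
type Point = { x: number; y: number };

// Standard cubic ease-in-out on t in [0, 1]. Assumed variant of "cubic-eased".
function easeInOutCubic(t: number): number {
  return t < 0.5 ? 4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2;
}

const clamp = (v: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, v));

// Plan intermediate mouse positions from `from` to `to`.
// Scaling steps and duration by distance is an assumption; only the
// clamping ranges (3-40 steps, 30-300 ms) are taken from the summary.
function planMousePath(
  from: Point,
  to: Point,
): { points: Point[]; durationMs: number } {
  const dist = Math.hypot(to.x - from.x, to.y - from.y);
  const steps = clamp(Math.round(dist / 20), 3, 40);
  const durationMs = clamp(Math.round(dist / 2), 30, 300);
  const points: Point[] = [];
  for (let i = 1; i <= steps; i++) {
    const e = easeInOutCubic(i / steps);
    points.push({
      x: from.x + (to.x - from.x) * e,
      y: from.y + (to.y - from.y) * e,
    });
  }
  return { points, durationMs };
}
```

Each point would then be passed to something like Playwright's `page.mouse.move(x, y)`, pausing roughly `durationMs / points.length` between moves, and the last point becomes the new tracked position.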

Raw summary below:

  • chain is now the primary interface for browser interactions: executes actions sequentially, waits for network idle, then auto-appends an observation block (snapshot -i + page state: URL, title, viewport, focus, dialog state)
  • Write commands (click, fill, goto, etc.) removed from standalone dispatch — only callable through chain. Server rejects standalone writes with guidance to use chain.
  • click/hover use cubic-eased mouse movement (3-40 steps, 30-300ms) from tracked position to element center, with locator fallback if bounding box unavailable
  • Screenshot SHA-256 dedup: returns "unchanged" if identical to previous capture. Hash clears on navigation.
  • Network idle auto-wait (1.5s cap) after click
  • Read commands deprioritized in docs ("rarely needed — chain auto-observes")
  • 13 new tests covering all four features
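To make the standalone-write rejection concrete, here is a minimal sketch of such a guard. The command set is only the partial list named above, and `dispatchStandalone`, `DispatchResult`, and the error wording are hypothetical rather than taken from the PR's code.

```typescript
// Partial, illustrative list: the summary names only click, fill, and goto.
const WRITE_COMMANDS = new Set<string>(["click", "fill", "goto"]);

type DispatchResult =
  | { ok: true; command: string }
  | { ok: false; error: string };

// Reject write commands sent outside of chain and tell the caller what to do instead.
function dispatchStandalone(command: string): DispatchResult {
  if (WRITE_COMMANDS.has(command)) {
    return {
      ok: false,
      error: `'${command}' is a write command; run it through chain instead.`,
    };
  }
  return { ok: true, command };
}
```

Putting the guidance in the error message lets an agent recover on its next call without consulting the docs.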

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
petehunt changed the title from "[not ready for review] feat: chain batching, mouse easing, screenshot dedup, network idle" to "[not ready for merge] feat: chain batching, mouse easing, screenshot dedup, network idle" Mar 22, 2026
petehunt marked this pull request as ready for review March 22, 2026 18:58