Skip to content

feat: CORS iframe parity + closed shadow DOM#32

Open
aryanku-dev wants to merge 17 commits into
mainfrom
feat/cors-iframes-and-shadow-dom
Open

feat: CORS iframe parity + closed shadow DOM#32
aryanku-dev wants to merge 17 commits into
mainfrom
feat/cors-iframes-and-shadow-dom

Conversation

@aryanku-dev
Copy link
Copy Markdown

Summary

Brings percy-selenium-ruby to parity with the canonical Percy CORS iframe + closed shadow DOM feature set.

Implemented

  • Nested cross-origin iframe capture (depth-capped, cycle-guarded)
  • data-percy-ignore attribute opt-out
  • ignoreIframeSelectors option
  • Post-switch URL re-check via is_unsupported_iframe_src?
  • PercyContextLost recovery merges partial_capture
  • Closed shadow DOM via CDP (expose_closed_shadow_roots)
  • Inlined Ruby helpers

Skipped

  • ElementInternals preflight (Feature 8): N/A — selenium-ruby has no before-page-load hook.
  • @percy/sdk-utils bump (Feature 9): not applicable to Ruby; helpers inlined.

Reference

Mirrored from percy/percy-capybara branch PER-7292-add-cross-origin-iframe-support and percy/percy-nightwatch#869.

Test plan

  • Full repo test suite passed locally
  • Manual smoke: cross-origin iframes
  • Manual smoke: closed shadow roots in Chromium

🤖 Generated with Claude Code via /percy-sdk-sync

aryanku-dev and others added 14 commits May 11, 2026 15:10
Mirrors the @percy/sdk-utils-to-inlined-helpers refactor from the
canonical JS SDKs. Adds DEFAULT_MAX_FRAME_DEPTH, UNSUPPORTED_IFRAME_SRCS,
the PercyContextLost error class, and the shared iframe helper surface
(is_unsupported_iframe_src?, clamp_frame_depth, normalize_ignore_selectors,
resolve_max_frame_depth, resolve_ignore_selectors, enumerate_iframes_script)
that the upcoming nested CORS / data-percy-ignore / ignoreIframeSelectors
/ context-loss-recovery features will build on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the single-level CORS iframe loop with process_frame_tree, a
bounded recursion mirroring percy-nightwatch's snapshot.js. After
serializing an iframe's document, the walker re-enumerates iframes from
inside that frame and recurses into any nested cross-origin children
that differ from the immediate parent's origin. Depth is clamped via
DEFAULT_MAX_FRAME_DEPTH (or options.maxIframeDepth / config) so
pathological pages can't blow the stack, and an ancestorUrls Set
short-circuits cyclic frame trees that link back to themselves.

Specs updated to drive the new tree API via execute_script-returned
iframe metadata (enumerate_iframes_script) instead of stubbing
frame_element.attribute, matching how the SDK now reads the DOM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surfaces dataPercyIgnore from the enumerate_iframes_script payload and
treats it as an explicit skip signal in should_skip_iframe?. Page
authors can drop a single iframe from CORS capture (e.g. a noisy widget,
a third-party embed) without configuring a project-wide selector.
Mirrors the same attribute check shipped in percy-nightwatch and
percy-webdriverio.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Threads options[:ignoreIframeSelectors] (and the snake_case alias, plus
config.snapshot.ignoreIframeSelectors) through resolve_ignore_selectors
into enumerate_iframes_script, which evaluates frame.matches(selector)
in the page context for each iframe. should_skip_iframe? honors the
resulting matchesIgnoreSelector flag, so a single config entry can drop
classes of iframes (ads, chat widgets) from CORS capture without
touching every page or call site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After switching into an iframe, read document.URL and re-run the
unsupported-src filter. Cross-origin navigations can fail in subtle
ways — DNS errors, X-Frame-Options blocks, or programmatic
about:blank redirects — that leave the static src attribute pointing
at the original https:// URL while the actual document is an error
page or about:blank. Without this check we'd serialize and ship that
error page as "captured" iframe content. Mirrors the same check from
percy-webdriverio's processFrameTree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a nested frame restoration fails — switch_to.parent_frame raises
and we can no longer trust the driver's frame context — the deep
process_frame_tree call now packages the iframes it captured before
the loss into a PercyContextLost error and propagates upward. Each
intermediate level merges its own collected frames into
err.partial_capture before re-raising. capture_cors_iframes catches
that error at the top, appends the partial capture into corsIframes,
and aborts further sibling iteration. The result: a parent-frame
restore failure in a 4-deep tree still ships everything we managed to
serialize, instead of dropping the whole snapshot's CORS data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds expose_closed_shadow_roots(driver), which uses three CDP calls —
DOM.getDocument (deep + pierce), DOM.resolveNode, and
Runtime.callFunctionOn — to surface closed shadow roots that are
otherwise inaccessible to JavaScript (element.shadowRoot returns null).
Each closed root is keyed into a host-keyed WeakMap on the page
(window.__percyClosedShadowRoots) that clone-dom.js consults during
PercyDOM.serialize. Wired into the snapshot pipeline before the top-
level serialize and re-primed after page reloads inside
capture_responsive_dom (refresh clears the WeakMap). No-op when CDP is
unavailable or the driver isn't Chromium. Mirrors percy-playwright's
exposeClosedShadowRoots; contentDocument subtrees are skipped since
cross-frame closed roots aren't supported yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace rescue modifiers with explicit begin/rescue blocks, swap em-dash
comments for ASCII, rename rescue variable, fix alignment/indentation,
split overlong spec line, and add scoped rubocop disables for the
intentional predicate method name and one block-nesting hot spot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three CE:review fixes on the CORS iframe path:

- Index alignment race: enumerate_iframes (JS querySelectorAll) and
  driver.find_elements(:tag_name, 'iframe') (Ruby) were two separate DOM
  reads. Any mutation between them (PercyDOM.serialize, dynamic insertion,
  etc.) silently shuffled the meta-to-element mapping, so one iframe's
  content could ship under another's percyElementId. Replace positional
  alignment with `find_element(css: 'iframe[data-percy-element-id="…"]')`
  via a new find_iframe_by_percy_id helper. percyElementId is already
  stamped by PercyDOM and we already require it via should_skip_iframe.

- "blank" sentinel in UNSUPPORTED_IFRAME_SRCS matched any src starting
  with the literal text "blank" (e.g. a route at /blank/foo). The
  about:blank case is already covered by the about:blank entry; drop the
  bare token.

- PercyContextLost.set_backtrace(captured_error.backtrace) overwrote the
  new exception's own trace with the inner one, hiding where the
  parent-frame restoration actually failed. Drop the override and let
  the inner error remain accessible via Exception#cause if needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Update existing get_serialized_dom CORS specs to stub
  driver.find_element(css: 'iframe[data-percy-element-id="..."]') instead
  of find_elements(:tag_name, 'iframe'), matching the new percyElementId
  lookup path.
- Add unit tests for find_iframe_by_percy_id covering nil/empty input,
  the attribute selector wire format, and the NoSuchElementError swallow.
- Regression test that "blank" no longer matches arbitrary src prefixes
  (e.g. blanket.com, blank-canvas://); about:blank stays covered.
- Test that capture_cors_iframes does not call find_elements on the
  iframe collection and routes through find_element by percyElementId.
- Confirm PercyContextLost keeps its own backtrace pointing at the
  restoration failure rather than the inner error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier removal of the Metrics/BlockNesting disable markers left the
parent_frame call and the ensuring `end` at column 20 / 18 rather than
inside the proper begin/rescue block. Functionally correct (rubocop's
inherited percy-style config didn't flag it), but reads as broken.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CSS-escape backslashes and double-quotes in the percyElementId
before interpolating into the iframe[data-percy-element-id="..."]
selector. Anything weirder than that still falls through to the
existing rescue and returns nil. Adds two specs covering the
escape and the graceful-nil path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two regression specs:

  * process_frame_tree resolves nested child iframes via
    find_element(css: 'iframe[data-percy-element-id="..."]'), not via
    a positional index into find_elements(:tag_name, 'iframe'). This
    pins the index-alignment fix at the child level.
  * Percy::PercyContextLost raised then rescued exposes a backtrace
    pointing at the raise site (non-nil, non-empty, top frame matches
    the raise line) so deeper diagnostics aren't lost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread lib/percy.rb Fixed
Chained gsubs ('\\' first, then '"') were semantically correct but
matched CodeQL's incomplete-string-escaping pattern (the second
gsub on its own doesn't escape backslashes). Collapse to one
gsub(/[\\"]/) { |c| "\\#{c}" } so the escape is provably complete
in a single pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aryanku-dev aryanku-dev marked this pull request as ready for review May 24, 2026 11:45
@aryanku-dev aryanku-dev requested a review from a team as a code owner May 24, 2026 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants