feat: nested CORS iframes, ignore controls, and closed shadow DOM #312

Draft
aryanku-dev wants to merge 7 commits into master from feat/cors-iframes-and-shadow-dom

Summary

Brings percy-selenium-java to parity with the canonical Percy CORS iframe + closed shadow DOM feature set.

Implemented

  • Nested cross-origin iframe capture (depth-capped, cycle-guarded)
  • data-percy-ignore attribute opt-out
  • ignoreIframeSelectors option
  • Post-switch URL re-check via isUnsupportedIframeSrc
  • PercyContextLostException recovery merges partialCapture
  • Closed shadow DOM via CDP (exposeClosedShadowRoots)
  • Inlined Java helpers (clampFrameDepth, normalizeIgnoreSelectors, resolveMaxFrameDepth, resolveIgnoreSelectors)

Skipped

  • ElementInternals preflight (Feature 8): N/A — selenium-java has no before-page-load hook.
  • @percy/sdk-utils bump (Feature 9): not applicable to Java; helpers inlined.

Reference

Mirrored from percy/percy-nightwatch#869 (PER-7292-add-cors-iframe-support); CDP from percy/percy-playwright#609.

Test plan

  • New IframeFeatureTest (11 tests) and existing CacheTest (3 tests) pass under mvn test.
  • SdkTest's integration tests require a local Firefox binary (@BeforeAll instantiates FirefoxDriver). They were not exercised in the sync environment because Firefox is not installed; this matches the pre-existing baseline on master and is not a regression introduced by this PR.
  • Manual smoke: cross-origin iframes
  • Manual smoke: closed shadow roots in Chrome

🤖 Generated with Claude Code via /percy-sdk-sync

aryanku-dev and others added 7 commits May 11, 2026 15:17
Replace the flat top-level iframe loop with a recursive `processFrameTree`
that switches into each cross-origin iframe, captures its DOM, and
descends into any further cross-origin iframes nested inside it (up to a
configurable depth). Cycles are detected by tracking the chain of
ancestor frame URLs and skipping any frame whose `src` already appears in
the chain. Without this guard, pages that embed each other could
produce up to `maxIframeDepth` duplicate `corsIframes` entries.
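The depth cap and cycle guard described above can be illustrated with a minimal sketch. It is not the PR's `processFrameTree`: the `Frame` class is a hypothetical in-memory stand-in for a live iframe, every child is treated as cross-origin, and seeding the ancestor chain with the top page URL is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class FrameTreeSketch {
    /** Hypothetical in-memory stand-in for an iframe and the iframes nested inside it. */
    static final class Frame {
        final String url;
        final List<Frame> children = new ArrayList<>();

        Frame(String url) {
            this.url = url;
        }
    }

    /**
     * Visits the iframes under parent and records each visited URL in captured.
     * depth is the nesting level of the children being visited (1 = top-level iframes).
     */
    static void processFrameTree(Frame parent, List<String> ancestorUrls,
                                 int depth, int maxDepth, List<String> captured) {
        if (depth > maxDepth) {
            return; // depth cap reached
        }
        for (Frame child : parent.children) {
            if (ancestorUrls.contains(child.url)) {
                continue; // cycle guard: this URL is already an ancestor
            }
            captured.add(child.url);
            // Each branch gets its own copy of the chain, so siblings do not block each other.
            List<String> chain = new ArrayList<>(ancestorUrls);
            chain.add(child.url);
            processFrameTree(child, chain, depth + 1, maxDepth, captured);
        }
    }
}
```

With two frames that embed each other, the walk records each URL once and then stops, instead of alternating between them until the depth cap is hit.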

The depth cap defaults to 5 (matching the canonical Percy SDK behaviour)
and is configurable per-snapshot via `maxIframeDepth` or via
`cliConfig.snapshot.maxIframeDepth`. Inputs are clamped to a 1..10 range
through `clampFrameDepth`.
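A rough sketch of how the clamp and the option precedence might fit together. The method names and the 5 / 1..10 values come from the description above; the parameter shapes and the fallback to the default for a missing or non-numeric value are assumptions.

```java
import java.util.Map;

final class FrameDepthSketch {
    static final int DEFAULT_DEPTH = 5;
    static final int MIN_DEPTH = 1;
    static final int MAX_DEPTH = 10;

    /** Clamps into [1, 10]; a missing or non-numeric value yields the default (assumption). */
    static int clampFrameDepth(Object raw) {
        if (!(raw instanceof Number)) {
            return DEFAULT_DEPTH;
        }
        int value = ((Number) raw).intValue();
        return Math.max(MIN_DEPTH, Math.min(MAX_DEPTH, value));
    }

    /** Precedence: per-snapshot option, then cliConfig.snapshot, then the default. */
    static int resolveMaxFrameDepth(Map<String, ?> options, Map<String, ?> cliSnapshotConfig) {
        if (options != null && options.get("maxIframeDepth") != null) {
            return clampFrameDepth(options.get("maxIframeDepth"));
        }
        if (cliSnapshotConfig != null && cliSnapshotConfig.get("maxIframeDepth") != null) {
            return clampFrameDepth(cliSnapshotConfig.get("maxIframeDepth"));
        }
        return DEFAULT_DEPTH;
    }
}
```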

Nested-frame origin is compared against the IMMEDIATE PARENT origin (not
the top page origin) so a same-origin grandchild inside a cross-origin
parent is correctly inlined by PercyDOM and a cross-origin grandchild
inside a same-origin parent is still captured.
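The parent-relative comparison can be sketched with `java.net.URI`. The helper names `originOf` and `isCrossOrigin` are hypothetical, and treating an unparseable URL as cross-origin is an assumption, not something the PR states.

```java
import java.net.URI;
import java.util.Locale;

final class OriginSketch {
    /** Returns "scheme://host[:port]", or null when the URL has no scheme or host. */
    static String originOf(String url) {
        if (url == null) {
            return null;
        }
        try {
            URI uri = URI.create(url.trim());
            if (uri.getScheme() == null || uri.getHost() == null) {
                return null;
            }
            String scheme = uri.getScheme().toLowerCase(Locale.ROOT);
            String host = uri.getHost().toLowerCase(Locale.ROOT);
            int port = uri.getPort();
            // Drop the port when it is absent or is the scheme's default.
            boolean defaultPort = port == -1
                    || ("https".equals(scheme) && port == 443)
                    || ("http".equals(scheme) && port == 80);
            return scheme + "://" + host + (defaultPort ? "" : ":" + port);
        } catch (IllegalArgumentException e) {
            return null; // not parseable as a URI
        }
    }

    /** Compares the child against its immediate parent, never against the top page. */
    static boolean isCrossOrigin(String parentUrl, String childUrl) {
        String parentOrigin = originOf(parentUrl);
        String childOrigin = originOf(childUrl);
        return parentOrigin == null || childOrigin == null || !parentOrigin.equals(childOrigin);
    }
}
```

For a top page on `shop.example` embedding `ads.example/slot`, which in turn embeds `ads.example/inner`, the grandchild is same-origin relative to its parent even though it is cross-origin relative to the top page.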

Mirrors percy/percy-nightwatch#869 and percy/percy-playwright#609.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Skip iframes that carry the `data-percy-ignore` boolean attribute when
enumerating both top-level and nested cross-origin iframes. Customers
add this attribute to opt out of CORS iframe capture for a specific
frame without having to maintain a selector list — useful for ad slots
or analytics iframes whose contents are noisy.

Selenium's `getAttribute` returns an empty string when the attribute is
present without a value and `null` when it is absent, so any non-null
result is treated as a positive hit.
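The null check can be sketched without a live driver. Here `attributeLookup` is a hypothetical stand-in for `WebElement::getAttribute`, and `hasPercyIgnore` is an illustrative name rather than one taken from the PR.

```java
import java.util.function.Function;

final class PercyIgnoreSketch {
    /**
     * Returns true when the attribute is present at all, including the empty-string
     * value reported for a valueless attribute.
     */
    static boolean hasPercyIgnore(Function<String, String> attributeLookup) {
        return attributeLookup.apply("data-percy-ignore") != null;
    }
}
```

Testing for an empty string instead would miss markup such as `data-percy-ignore="true"`, which is why the sketch checks only for presence.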

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Customers can now pass an `ignoreIframeSelectors` list (either in the
per-snapshot options Map or via `cliConfig.snapshot.ignoreIframeSelectors`)
to skip any cross-origin iframe whose element matches one of the supplied
CSS selectors. Matching is performed in-browser via `Element.matches` so
any selector the browser accepts is valid; invalid selectors are tolerated
without aborting the snapshot.

Inputs go through `normalizeIgnoreSelectors`, which accepts a
`List<String>`, a single `String`, or `null` and yields a sanitised
`List<String>` with empty and whitespace-only entries removed.
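A minimal sketch of the normalisation described above. Trimming surviving entries and silently dropping non-String list items are assumptions; the PR only states that empty and whitespace-only entries are removed.

```java
import java.util.ArrayList;
import java.util.List;

final class IgnoreSelectorsSketch {
    /** Accepts a List, a single String, or null and always returns a mutable List<String>. */
    static List<String> normalizeIgnoreSelectors(Object raw) {
        List<String> out = new ArrayList<>();
        if (raw instanceof String) {
            addIfPresent(out, (String) raw);
        } else if (raw instanceof List) {
            for (Object item : (List<?>) raw) {
                if (item instanceof String) {
                    addIfPresent(out, (String) item);
                }
            }
        }
        return out; // null and unsupported types fall through to an empty list
    }

    private static void addIfPresent(List<String> out, String selector) {
        String trimmed = selector.trim();
        if (!trimmed.isEmpty()) {
            out.add(trimmed);
        }
    }
}
```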

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

After switching into a cross-origin iframe, read `document.URL` and run
the unsupported-src check again. The parent-side `src` attribute can be
stale or misleading — the frame may have failed to load (leaving an
about:blank document), or been navigated by script after attach to a
data:/javascript: URL. Skipping these post-switch avoids attempting to
serialize a placeholder document.

When a post-switch URL is available it is also reported as the captured
`frameUrl` and used as the parent context for any nested CORS iframe
enumeration. Falls back to the parent-side `src` when the executor
returns a non-String value (e.g. under mocking).
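The re-check and the fallback can be sketched as below. Only the schemes named in this message (`about:blank`, `data:`, `javascript:`) are covered; the real `isUnsupportedIframeSrc` may reject more. `effectiveFrameUrl` and `shouldCapture` are hypothetical helper names, and treating null or empty URLs as unsupported is an assumption.

```java
import java.util.Locale;

final class PostSwitchUrlSketch {
    /** Rejects placeholder and script URLs; null and empty are also rejected (assumption). */
    static boolean isUnsupportedIframeSrc(String url) {
        if (url == null) {
            return true;
        }
        String normalized = url.trim().toLowerCase(Locale.ROOT);
        return normalized.isEmpty()
                || normalized.equals("about:blank")
                || normalized.startsWith("data:")
                || normalized.startsWith("javascript:");
    }

    /** Prefers the post-switch document.URL; falls back to the parent-side src otherwise. */
    static String effectiveFrameUrl(Object postSwitchResult, String parentSideSrc) {
        return (postSwitchResult instanceof String) ? (String) postSwitchResult : parentSideSrc;
    }

    /** Decides whether the frame should still be serialized after the switch. */
    static boolean shouldCapture(Object postSwitchResult, String parentSideSrc) {
        return !isUnsupportedIframeSrc(effectiveFrameUrl(postSwitchResult, parentSideSrc));
    }
}
```

A frame whose `src` attribute points at a real page but whose document turned out to be `about:blank` is skipped, while a non-String executor result leaves the decision to the parent-side `src`.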

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ostException

When the driver fails to step back to a parent frame after recursing into
a nested cross-origin iframe, we previously lost everything captured so
far (a flaky network call inside a depth-3 frame would forfeit even the
depth-1 snapshot). Introduce `PercyContextLostException`, which carries a
`partialCapture` list of every iframe snapshot collected before the
failure. Each recursion layer appends its own captures to the carried
list and re-throws. The top-level loop in `getSerializedDOM` then merges
the recovered captures into the snapshot, returns to the default
content, and stops enumerating further siblings.
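The append-and-re-throw pattern can be sketched with a simulated recursion. The exception name and its `partialCapture` field come from this message; the String captures, the `failDepth` knob that simulates a failed step back to the parent, and the `snapshot` wrapper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ContextLostSketch {
    static class PercyContextLostException extends RuntimeException {
        private final List<String> partialCapture;

        PercyContextLostException(String message, List<String> partialCapture) {
            super(message);
            this.partialCapture = new ArrayList<>(partialCapture);
        }

        List<String> getPartialCapture() {
            return partialCapture;
        }
    }

    /** Captures one frame per level; fails when stepping back out of the frame at failDepth. */
    static List<String> captureFrame(int depth, int maxDepth, int failDepth) {
        List<String> captured = new ArrayList<>();
        captured.add("frame-depth-" + depth);
        if (depth >= maxDepth) {
            return captured;
        }
        try {
            captured.addAll(captureFrame(depth + 1, maxDepth, failDepth));
        } catch (PercyContextLostException e) {
            // This layer puts its own captures in front of the carried list and re-throws.
            captured.addAll(e.getPartialCapture());
            throw new PercyContextLostException(e.getMessage(), captured);
        }
        if (depth + 1 == failDepth) {
            // Simulated failure while returning from the child frame to this one.
            throw new PercyContextLostException("could not return to parent frame", captured);
        }
        return captured;
    }

    /** Top-level caller: keeps whatever was captured even when the context is lost. */
    static List<String> snapshot(int maxDepth, int failDepth) {
        try {
            return captureFrame(1, maxDepth, failDepth);
        } catch (PercyContextLostException e) {
            return e.getPartialCapture();
        }
    }
}
```

In this sketch a failure while leaving the depth-3 frame still yields all three captures at the top level, rather than none.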

Mirrors the `percyContextLost` flag in percy/percy-nightwatch#869 and
percy/percy-webdriverio#... so the wire-format output stays consistent
across SDKs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closed shadow roots (`{mode: 'closed'}`) are invisible to JavaScript —
`element.shadowRoot` is `null` and there is no API that returns the
underlying ShadowRoot object. The PercyDOM serializer can pierce them
through a window-bound `__percyClosedShadowRoots` WeakMap (host element
→ shadow root) populated before serialization, but Selenium has no way
to obtain the closed shadow root from page script.

Use Chrome DevTools Protocol to discover and resolve them:
  1. `DOM.getDocument {depth: -1, pierce: true}` to walk the entire DOM
     tree including closed shadow subtrees.
  2. For each closed shadow root, `DOM.resolveNode` on the host and the
     shadow root to obtain JS object handles.
  3. `Runtime.callFunctionOn` to write the pair into the WeakMap.

`contentDocument` nodes are skipped because their execution context is
distinct and has no WeakMap. Drivers other than ChromeDriver are detected
with a single `instanceof ChromeDriver` check and silently fall through,
so the SDK keeps working with Firefox/WebKit without changes.
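The tree walk in step 1 can be sketched over an already-decoded `DOM.getDocument` result. The field names (`children`, `shadowRoots`, `shadowRootType`, `contentDocument`, `backendNodeId`) are those of the CDP `DOM.Node` type. Representing nodes as nested Maps, and pairing hosts with roots by `backendNodeId` rather than `nodeId`, are assumptions about how the PR handles the response.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class ClosedShadowWalkSketch {
    /** Returns {hostBackendNodeId, shadowRootBackendNodeId} for every closed shadow root. */
    static List<long[]> collectClosedShadowPairs(Map<String, Object> root) {
        List<long[]> pairs = new ArrayList<>();
        walk(root, pairs);
        return pairs;
    }

    private static void walk(Map<String, Object> node, List<long[]> pairs) {
        for (Map<String, Object> shadowRoot : nodeList(node.get("shadowRoots"))) {
            if ("closed".equals(shadowRoot.get("shadowRootType"))) {
                pairs.add(new long[] {backendId(node), backendId(shadowRoot)});
            }
            walk(shadowRoot, pairs); // shadow trees can contain further hosts
        }
        for (Map<String, Object> child : nodeList(node.get("children"))) {
            walk(child, pairs);
        }
        // "contentDocument" is deliberately not visited: iframe documents run in another context.
    }

    @SuppressWarnings("unchecked")
    private static List<Map<String, Object>> nodeList(Object raw) {
        return raw instanceof List
                ? (List<Map<String, Object>>) raw
                : Collections.<Map<String, Object>>emptyList();
    }

    private static long backendId(Map<String, Object> node) {
        return ((Number) node.get("backendNodeId")).longValue();
    }
}
```

Each returned pair would then feed steps 2 and 3: resolve both ids to object handles and write them into the WeakMap.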

Mirrors percy/percy-playwright#609.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add JUnit + Mockito unit tests for the new helper methods and the
nested cross-origin iframe capture flow:

- `clampFrameDepth` bounds + defaults
- `normalizeIgnoreSelectors` accepts List<String> / String / null
- `resolveMaxFrameDepth` precedence (option > cliConfig > default)
- `resolveIgnoreSelectors` precedence
- `data-percy-ignore` iframes are skipped without `switchTo`
- `ignoreIframeSelectors` matches are skipped without `switchTo`
- `processFrame` bails after switch when document.URL is unsupported
- `PercyContextLostException.partialCapture` round-trips
- `getSerializedDOM` recovers partial captures on context loss
- `exposeClosedShadowRoots` is a no-op for non-Chrome drivers
- `collectClosedShadowPairs` walks the CDP tree and skips iframes

Tests live in a separate `IframeFeatureTest` class to avoid being
blocked by `SdkTest`'s `@BeforeAll` Firefox initialisation in
environments without a Firefox binary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>