Problem Statement
When the Page Scanner (Geo or A11y check) is opened on a traditional page, the URL the editor sends to the scanner is built from the dotCMS admin's origin (window.location.origin) plus the page path, rather than the page's actual host.
In dotCMS the same path can exist on multiple hosts and represent different pages — e.g. /about-us on siteA.example.com is a different page from /about-us on siteB.example.com. The current behavior collapses every traditional page's URL to "admin origin + path", so the external scanner fetches and analyzes the wrong URL.
The bug is silent: the request succeeds, the scanner returns a real-looking report, but it's describing a different page than the one the user is editing. There's no error or warning in the UI.
Where it happens
core-web/libs/portlets/edit-ema/portlet/src/lib/dot-ema-shell/dot-ema-shell.component.ts — line 325, URL assembled as ${requestHostName}${currentUrl} before being passed to pageScanner.open.
core-web/libs/portlets/edit-ema/portlet/src/lib/utils/index.ts — line 797, getRequestHostName(params) returns params.clientHost || window.location.origin. For headless pages clientHost is set, so the URL is correct. For traditional pages clientHost is undefined and the function falls back to window.location.origin.
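The fallback described above can be reproduced in isolation. The following is a minimal sketch, not the actual dotCMS source: the ScannerUrlParams shape and the buildScannerUrl helper are hypothetical, and adminOrigin is passed in as an argument in place of window.location.origin so the snippet can run outside a browser.

```typescript
// Assumed parameter shape; only clientHost is taken from the issue text.
interface ScannerUrlParams {
    clientHost?: string; // set for headless pages, undefined for traditional pages
}

// Mirrors the described logic: params.clientHost || window.location.origin.
// adminOrigin stands in for window.location.origin (an assumption of this sketch).
function getRequestHostName(params: ScannerUrlParams, adminOrigin: string): string {
    return params.clientHost || adminOrigin;
}

// Hypothetical helper standing in for the `${requestHostName}${currentUrl}` assembly.
function buildScannerUrl(
    params: ScannerUrlParams,
    currentUrl: string,
    adminOrigin: string
): string {
    return `${getRequestHostName(params, adminOrigin)}${currentUrl}`;
}

const adminOrigin = 'https://admin.example.com';

// Headless page: clientHost is present, so the page's own host is used.
const headlessUrl = buildScannerUrl(
    { clientHost: 'https://siteB.example.com' },
    '/about-us',
    adminOrigin
);

// Traditional page: clientHost is undefined, so the admin origin leaks into the URL.
const traditionalUrl = buildScannerUrl({}, '/about-us', adminOrigin);
```

Under these assumptions, headlessUrl points at siteB while traditionalUrl points at the admin host, which matches the behavior reported for the two page types.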
Scope
- Affects traditional pages only. Headless pages send clientHost and are not affected.
- Affects every multi-host dotCMS install where pages share paths across hosts (very common).
- Affects both the Geo Check and A11y Check tools, since both go through the same URL-construction path.
Browser / OS: Any.
Steps to Reproduce
- Use a dotCMS instance with at least two hosts (siteA and siteB) accessed via a single admin host (e.g. admin.example.com).
- Create a traditional page at /about-us on siteB.
- Open it in the UVE editor and trigger Page Scanner → Geo Check (or A11y Check).
- Inspect the network request to /api/v1/page-scanner/geo/check. The url field in the body will be https://admin.example.com/about-us instead of https://siteB.example.com/about-us.
- Observe that the scan response describes the wrong page (whatever lives at the admin host's /about-us, or a 404).
Expected: The URL submitted to the scanner identifies the actual page being edited, including its real host — so the external scanner fetches the same page the editor is showing.
Actual: For traditional pages the URL is built from the dotCMS admin's origin, which has no relation to the page's host. Two pages on different hosts that share a path become indistinguishable to the scanner.
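To make the expected behavior concrete, here is one possible client-side shape for the URL resolution. This is a sketch under stated assumptions, not a proposed patch: the PageContext type, its siteHostname field, the resolveScannerUrl function, and the https default scheme are all hypothetical, and the issue does not say where the page's host would actually come from.

```typescript
// Hypothetical view of the data available when the scanner is opened.
interface PageContext {
    clientHost?: string;   // present for headless pages (from the issue text)
    siteHostname?: string; // hypothetical: hostname of the site that owns the page
    path: string;          // page path, e.g. '/about-us'
}

// Prefer clientHost (headless), then the page's own site hostname (traditional).
// Throwing when neither is known is one option for avoiding a silent wrong-page scan.
function resolveScannerUrl(page: PageContext, scheme: string = 'https'): string {
    if (page.clientHost) {
        return `${page.clientHost}${page.path}`;
    }
    if (page.siteHostname) {
        return `${scheme}://${page.siteHostname}${page.path}`;
    }
    throw new Error(`Cannot resolve a host for page ${page.path}`);
}

// Two traditional pages that share a path but live on different hosts.
const siteAUrl = resolveScannerUrl({ siteHostname: 'siteA.example.com', path: '/about-us' });
const siteBUrl = resolveScannerUrl({ siteHostname: 'siteB.example.com', path: '/about-us' });
```

With this shape, the two pages yield distinct URLs, and a page with no known host fails loudly rather than falling back to the admin origin. Whether this logic belongs in the frontend at all is part of the open question raised in the triage notes.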
Acceptance Criteria
- The url sent in the body of /api/v1/page-scanner/geo/check and /api/v1/page-scanner/a11y/check resolves to the page's actual host (not the dotCMS admin's origin).
- clientHost-based URL construction continues to work for headless pages, and the URL sent matches the page being edited.
dotCMS Version
Latest from main branch (reproduced on issue-35514-uve-iframe-sizing-phase-1; same code path is on main).
Severity
Medium - Some functionality impacted
Links
NA
Notes for triage
- Backend endpoint (PageScannerResource.geoCheck / a11yCheck) is a proxy: it forwards whatever URL the FE provides to the upstream scanner SaaS without modification. The current host is read from the request, but only to look up the Page Scanner app's secrets; it does not influence the URL that gets scanned.
- Open question for the fix discussion (not part of the bug itself): whether the right URL should be constructed client-side from page-asset data, or whether the BE proxy should resolve the host server-side from a host identifier in the request body. The two options carry different trade-offs around trust boundary and validation.