Fix xss matcher catastrophic backtracking#30

Open
andymac4182 wants to merge 2 commits into
vercel-labs:mainfrom
andymac4182:fix/xss-matcher-backtracking

@andymac4182

What changed

Replaced the XSS matcher’s template-literal HTML regex with a small linear scanner, while keeping the existing direct DOM sink regexes. Added coverage for template interpolation mixed with HTML and for a long generated HTML/report line that previously triggered catastrophic backtracking.

Why

The previous matcher used a greedy regex:

/\$\{.*\}.*<\/?\w+>|<\w+[^>]*\$\{/

That works on ordinary short source files, but it can hang on long generated lines that contain many ${...}-shaped strings without a matching HTML tag. We hit this when scanning generated report-style HTML: the regex engine spent seconds backtracking through a no-match case. This change preserves the intended XSS candidate detection while making the template-literal heuristic run in predictable linear time.

Verification

  • pnpm test passes
  • pnpm lint passes
  • pnpm knip passes
  • If this adds a matcher: ran it against at least one real repo and confirmed the candidate count is sane
    • Ran the XSS matcher against /Users/amcclenaghan/atlassian/nfs-restricted
    • Scanned 2 matching web-template files
    • Found 1 candidate in tools/9p-browser-demo/index.html
    • Candidate pattern: innerHTML assignment

Notes for reviewer

The failure is input-shape dependent, which is why the old regex can appear fine on normal source files but hang on generated report artifacts. The pathological case is a long single line with many ${...}-looking strings and no eventual HTML tag match.
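
To make the input shape concrete, here is a minimal TypeScript sketch that builds such a line and times the old regex against it. The regex is the one quoted in the Why section; `buildPathologicalLine`, `timeNoMatch`, and the `${value} ` fragment are hypothetical names and data chosen for illustration, not taken from the repository or from the report that triggered the hang. The sizes are deliberately small so the script finishes quickly.

```typescript
// The regex quoted in this PR description.
const oldTemplateHtmlRegex = /\$\{.*\}.*<\/?\w+>|<\w+[^>]*\$\{/;

// Hypothetical helper: one long line of `${...}`-shaped fragments and no HTML tag.
function buildPathologicalLine(fragments: number): string {
  return "${value} ".repeat(fragments);
}

// Hypothetical helper: run the old regex once and report whether it matched
// and how long the attempt took (wall-clock milliseconds).
function timeNoMatch(fragments: number): { matched: boolean; ms: number } {
  const line = buildPathologicalLine(fragments);
  const start = Date.now();
  const matched = oldTemplateHtmlRegex.test(line);
  return { matched, ms: Date.now() - start };
}

// Small sizes only; larger counts are where the slowdown becomes painful.
for (const n of [50, 100, 200]) {
  const { matched, ms } = timeNoMatch(n);
  console.log(`${n} fragments: matched=${matched}, ${ms} ms`);
}
```

On a backtracking engine, the time should grow much faster than the line length as the fragment count rises, because each `${` start is retried against every later `}` and every position after it. Actual timings depend on the engine and machine, so none are given here.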

For the template-literal heuristic, this PR drops the regex with the two overlapping .* wildcards entirely. The replacement scans each line with indexOf and per-character checks, so that path cannot trigger regex backtracking. Existing sink detections like dangerouslySetInnerHTML, .innerHTML =, document.write, Vue v-html, and Angular [innerHTML] remain regex-based and unchanged.
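
One way such a scanner could look is sketched below in TypeScript. This illustrates the approach described above and is not the code in this PR's diff: the function names (`isWordChar`, `hasInterpolationThenTag`, `hasTagThenInterpolation`, `looksLikeTemplateHtml`) are hypothetical, and it assumes each input is a single line with no line terminators. It aims to accept the same lines as the two branches of the old regex while only moving forward through the line.

```typescript
// Hypothetical sketch; names are illustrative and not taken from the PR diff.

// ASCII equivalent of the regex class \w: [A-Za-z0-9_].
function isWordChar(code: number): boolean {
  return (
    (code >= 48 && code <= 57) || // 0-9
    (code >= 65 && code <= 90) || // A-Z
    (code >= 97 && code <= 122) || // a-z
    code === 95 // _
  );
}

// Branch 1 of the old regex: `${`, a later `}`, then a bare <tag> or </tag>.
function hasInterpolationThenTag(line: string): boolean {
  const open = line.indexOf("${");
  if (open === -1) return false;
  const close = line.indexOf("}", open + 2);
  if (close === -1) return false;
  let lt = line.indexOf("<", close + 1);
  while (lt !== -1) {
    let i = lt + 1;
    if (line.charCodeAt(i) === 47) i++; // optional '/'
    const nameStart = i;
    while (i < line.length && isWordChar(line.charCodeAt(i))) i++;
    if (i > nameStart && line.charCodeAt(i) === 62) return true; // '>'
    // Skipped characters were '/' or word characters, so no '<' is missed.
    lt = line.indexOf("<", i);
  }
  return false;
}

// Branch 2 of the old regex: `<` plus a word character, then `${` before any `>`.
function hasTagThenInterpolation(line: string): boolean {
  let insideTag = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charCodeAt(i);
    if (ch === 62) {
      insideTag = false; // '>' ends the current tag
    } else if (ch === 60 && isWordChar(line.charCodeAt(i + 1))) {
      insideTag = true; // '<' followed by a tag-name character
    } else if (insideTag && ch === 36 && line.charCodeAt(i + 1) === 123) {
      return true; // '${' inside an unclosed tag
    }
  }
  return false;
}

// Combined heuristic for one line of source text.
function looksLikeTemplateHtml(line: string): boolean {
  return hasInterpolationThenTag(line) || hasTagThenInterpolation(line);
}
```

Because the first branch only needs the earliest `${` and the earliest `}` after it, there is nothing to retry: any bare tag after that point is enough, and the search for it never moves backwards. The second branch is a single pass with one boolean of state.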

@vercel

vercel Bot commented May 5, 2026

@andymac4182 is attempting to deploy a commit to the Vercel Labs Team on Vercel.

A member of the Team first needs to authorize it.

@cramforce
Contributor

Please repush with a signed commit
