@harlan-zw (Collaborator) commented Jan 20, 2026

🔗 Linked issue

Resolves #335

❓ Type of change

  • 📖 Documentation (updates to the documentation or readme)
  • 🐞 Bug fix (a non-breaking change that fixes an issue)
  • 👌 Enhancement (improving an existing functionality)
  • ✨ New feature (a non-breaking change that adds functionality)
  • 🧹 Chore (updates to the build process or auxiliary tools and libraries)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)

📚 Description

Third-party embed scripts (Twitter widgets, Instagram embeds) hurt performance and leak user data. Following the Cloudflare Zaraz approach, we now fetch embed data server-side and proxy all assets through your domain.

Added two headless components with scoped slots for full styling control:

<ScriptXEmbed tweet-id="1754336034228171055">
  <template #default="{ userName, text, likesFormatted, photos }">
    <!-- Style however you want -->
  </template>
</ScriptXEmbed>

<ScriptInstagramEmbed post-url="https://instagram.com/p/ABC123/">
  <template #default="{ html, shortcode }">
    <div v-html="html" />
  </template>
</ScriptInstagramEmbed>

What's included:

  • ScriptXEmbed - fetches from X syndication API, exposes tweet data via slots
  • ScriptInstagramEmbed - fetches embed HTML, rewrites asset URLs to proxy
  • Server routes for data fetching + image/asset proxying
  • 10-minute caching at server level
  • Docs, playground examples, and e2e tests
  • Showcase on docs home page
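
To illustrate the asset proxying listed above, here is a minimal sketch of how a third-party asset URL might be rewritten to a same-origin proxy URL. The helper names `toProxiedUrl` and `rewriteAssetUrls`, the default endpoint path, and the CDN host pattern are assumptions for illustration, not necessarily what the module ships.

```ts
// Hypothetical helper: wrap a third-party asset URL in a same-origin proxy URL.
// The default endpoint path is an assumed value for illustration.
function toProxiedUrl(assetUrl: string, proxyEndpoint = '/api/_scripts/x-embed-image'): string {
  // Use '&' when the endpoint already carries a query string.
  const separator = proxyEndpoint.includes('?') ? '&' : '?'
  return `${proxyEndpoint}${separator}url=${encodeURIComponent(assetUrl)}`
}

// Hypothetical helper: rewrite Instagram CDN URLs found in embed HTML.
// The host pattern is an assumption; HTML-escaped URLs (&amp;) are not handled here.
function rewriteAssetUrls(html: string, proxyEndpoint: string): string {
  return html.replace(
    /https:\/\/scontent\.cdninstagram\.com[^"'\s)]*/g,
    match => toProxiedUrl(match, proxyEndpoint),
  )
}
```

Because the browser only ever requests the proxied URL, the third-party host sees the server's IP rather than the visitor's.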

Privacy benefits:

  • Zero third-party JavaScript loaded
  • No cookies set by X/Instagram
  • User IPs not shared with third parties
  • All content served from your domain

Add privacy-first social media embed components that fetch data
server-side and proxy all assets through your domain. Zero
client-side API calls to third-party services.

- ScriptXEmbed: Headless component with scoped slots for X/Twitter
- ScriptInstagramEmbed: Renders proxied Instagram embed HTML
- Server routes for data fetching and image/asset proxying
- 10-minute server-side caching
- Full documentation and e2e tests

Resolves #335

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
vercel bot (Contributor) commented Jan 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| scripts-docs | Error | | Jan 24, 2026 2:35pm |
| scripts-playground | Ready | Preview, Comment | Jan 24, 2026 2:35pm |

pkg-pr-new bot commented Jan 20, 2026

npm i https://pkg.pr.new/nuxt/scripts/@nuxt/scripts@590

commit: f8fa05c

harlan-zw and others added 2 commits January 20, 2026 19:59
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Comment on lines 31 to 35
const shortcode = extractInstagramShortcode(props.postUrl)
const { data: html, status, error } = useAsyncData<string>(
`instagram-embed-${shortcode}`,
() => $fetch(`${props.apiEndpoint}?url=${encodeURIComponent(props.postUrl)}&captions=${props.captions}`),

The shortcode is computed once at component initialization but never updated if the postUrl prop changes. Additionally, if an invalid Instagram URL is passed (one that doesn't match the extraction regex), the shortcode becomes undefined, causing multiple invalid URLs to share the same cache key instagram-embed-undefined, which will incorrectly return cached results from one invalid URL when fetching another.

📝 Patch Details
diff --git a/src/runtime/components/ScriptInstagramEmbed.vue b/src/runtime/components/ScriptInstagramEmbed.vue
index 0d25caa..7765864 100644
--- a/src/runtime/components/ScriptInstagramEmbed.vue
+++ b/src/runtime/components/ScriptInstagramEmbed.vue
@@ -1,5 +1,6 @@
 <script setup lang="ts">
 import type { HTMLAttributes } from 'vue'
+import { computed } from 'vue'
 import { useAsyncData } from 'nuxt/app'
 import { extractInstagramShortcode } from '../registry/instagram-embed'
 
@@ -28,10 +29,16 @@ const props = withDefaults(defineProps<{
   apiEndpoint: '/api/_scripts/instagram-embed',
 })
 
-const shortcode = extractInstagramShortcode(props.postUrl)
+const shortcode = computed(() => extractInstagramShortcode(props.postUrl))
+
+const cacheKey = computed(() => {
+  const code = shortcode.value
+  // Use shortcode if available, otherwise use a hash of the URL to avoid collisions
+  return `instagram-embed-${code || btoa(props.postUrl).substring(0, 16)}`
+})
 
 const { data: html, status, error } = useAsyncData<string>(
-  `instagram-embed-${shortcode}`,
+  cacheKey,
   () => $fetch(`${props.apiEndpoint}?url=${encodeURIComponent(props.postUrl)}&captions=${props.captions}`),
 )
 

Analysis

Non-reactive cache key and cache collision in ScriptInstagramEmbed

What fails: ScriptInstagramEmbed component has two related caching bugs:

  1. Non-reactive cache key: When the postUrl prop changes, the component fails to refetch data because the cache key is static. The shortcode variable is computed once at component initialization and never updates, causing useAsyncData to reuse the old cached result instead of fetching new data for a different Instagram post.

  2. Cache collision for invalid URLs: When an invalid Instagram URL is passed (one that doesn't match the extraction regex), extractInstagramShortcode() returns undefined. Multiple invalid URLs all receive the cache key instagram-embed-undefined, causing the first invalid URL's error response to be cached and incorrectly returned for all subsequent invalid URLs.

How to reproduce:

  1. Problem 1 - Non-reactive cache key:

    • Create a component instance with postUrl="https://www.instagram.com/p/ABC123/"
    • Change the prop to postUrl="https://www.instagram.com/p/XYZ789/"
    • Expected: Component refetches the new post
    • Actual: Component returns the cached result from the first post
  2. Problem 2 - Cache collision:

    • Create two component instances with different invalid URLs:
      • Instance 1: postUrl="https://example.com/invalid"
      • Instance 2: postUrl="https://another-invalid.com/url"
    • Both receive cache key instagram-embed-undefined
    • First instance's error response is cached
    • Second instance incorrectly returns the first instance's cached error

Root cause:

At line 31, shortcode was computed once from props.postUrl using a regular variable assignment:

const shortcode = extractInstagramShortcode(props.postUrl)

This is not a reactive computed property. According to Nuxt 4.x documentation, the cache key for useAsyncData can be reactive (using computed refs, refs, or getter functions). When a cache key doesn't change, useAsyncData doesn't refetch data. Additionally, when extractInstagramShortcode() returns undefined for invalid URLs, all invalid URLs collapse into a single cache entry.

Solution: Made shortcode a reactive computed property and created a reactive cacheKey computed property that includes a URL hash when the shortcode is undefined, ensuring:

  1. Cache key updates when postUrl prop changes, triggering refetch
  2. Each unique URL gets a unique cache key, eliminating collision for invalid URLs
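
The key derivation described above can be sketched as a pure function, which makes it easy to test outside a component. This is an illustration under stated assumptions: `extractShortcode` is a stand-in for the module's `extractInstagramShortcode`, and `buildEmbedCacheKey` is a hypothetical name. The sketch falls back to the full encoded URL instead of a truncated `btoa` prefix, because 16 base64 characters cover only the first 12 bytes of the URL, so two invalid URLs that share that prefix would still collide.

```ts
// Stand-in for the module's extractInstagramShortcode helper (assumed behavior).
function extractShortcode(url: string): string | undefined {
  return url.match(/instagram\.com\/(?:p|reel|tv)\/([^/?]+)/)?.[1]
}

// Hypothetical helper: derive a cache key that is unique per URL and per option set.
function buildEmbedCacheKey(postUrl: string, captions: boolean): string {
  // Fall back to the whole encoded URL so distinct invalid URLs never share a key.
  const id = extractShortcode(postUrl) ?? encodeURIComponent(postUrl)
  return `instagram-embed-${id}-${captions}`
}
```

Inside the component, the function would be wrapped in a `computed` that reads `props.postUrl` and `props.captions`, so the key changes whenever either prop does.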

Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
coderabbitai bot commented Jan 24, 2026

📝 Walkthrough

Adds server-rendered social embed support for X (Twitter) and Instagram. Introduces ScriptXEmbed and ScriptInstagramEmbed Vue components that fetch embed data server-side and expose tweet/post data, HTML, status, and errors. Adds server handlers and proxies under /api/_scripts/* for embed payloads, images, and assets with domain whitelisting, header shaping, URL rewriting, and caching. Adds runtime registry modules (options, types, helpers), module wiring to register handlers, playground/example pages, documentation pages, test fixtures, and E2E tests. Some documentation and example blocks are duplicated.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely describes the main feature added: SSR social media embeds for X and Instagram, which is the primary focus of the PR. |
| Description check | ✅ Passed | The description follows the template structure, includes linked issue (#335), specifies the type of change (New feature), and provides detailed explanation of the implementation and privacy benefits. |
| Linked Issues check | ✅ Passed | The PR fully implements the requirements of issue #335: server-side rendering of social media embeds for X and Instagram using a Cloudflare Zaraz-like approach with asset proxying and privacy-first design. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the stated objectives: new components (ScriptXEmbed, ScriptInstagramEmbed), server routes for proxying, documentation, tests, and playground examples. No unrelated changes detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

coderabbitai bot left a comment

Actionable comments posted: 12

🤖 Fix all issues with AI agents
In `@docs/app/pages/index.vue`:
- Around line 485-487: The external icon-only anchor using tweetUrl should
include rel="noopener noreferrer" to prevent tabnabbing and an accessible label
(e.g., aria-label="Share on X" or add sr-only text) so screen readers can
describe the link; update the <a :href="tweetUrl" ...> element to include
rel="noopener noreferrer" and either aria-label with a clear description or add
visually-hidden text inside the anchor while keeping the SVG.

In `@docs/content/docs/1.guides/6.cors.md`:
- Around line 36-38: The fenced code block containing the message "Cross-Origin
Request Blocked: The Same Origin Policy disallows reading the remote resource"
is missing a language tag; update that fence in
docs/content/docs/1.guides/6.cors.md by changing the opening backticks from ```
to ```text so the block is explicitly marked as plain text (i.e., replace the
unnamed fence around that string with a ```text fence).

In `@playground/pages/third-parties/x-embed/default.vue`:
- Around line 31-35: The anchor element rendering the external tweet link (the
<a :href="tweetUrl" target="_blank" class="text-gray-400 hover:text-gray-600">
element) needs a rel attribute to prevent tabnabbing; add rel="noopener
noreferrer" to that <a> tag so the external page cannot access window.opener and
to improve security when using target="_blank".

In `@src/runtime/components/ScriptInstagramEmbed.vue`:
- Around line 18-29: Update the apiEndpoint docstring in
ScriptInstagramEmbed.vue so the documented default matches the actual default
value; change the comment for the apiEndpoint property (the JSDoc above
apiEndpoint?: string) from '/_scripts/instagram-embed' to
'/api/_scripts/instagram-embed' to match the default defined in the component
options (the object where captions and apiEndpoint are set).

In `@src/runtime/registry/x-embed.ts`:
- Around line 68-69: proxyXImageUrl currently concatenates the query string
blindly which breaks when proxyEndpoint already contains query parameters;
update the proxyXImageUrl function to detect whether proxyEndpoint includes a
'?' and append the encoded url using '&url=' when it does and '?url=' when it
doesn't (ensure encodeURIComponent(url) is still used). Reference:
proxyXImageUrl(url: string, proxyEndpoint = '/api/_scripts/x-embed-image').

In `@src/runtime/server/instagram-embed-asset.ts`:
- Around line 16-17: new URL(url) in instagram-embed-asset can throw for
malformed input, causing a 500; wrap the URL parsing (the new URL(url) call that
produces parsedUrl) in a try/catch, validate the input, and when parsing fails
return an HTTP 400 with a clear message (e.g., "invalid asset URL"); update the
same request handler function in this file to catch the error, log/record the
bad input, and short-circuit with a 400 response instead of letting the
exception propagate.

In `@src/runtime/server/instagram-embed-image.ts`:
- Around line 15-34: Wrap the URL parsing in a try/catch and explicitly validate
scheme (only allow http/https) before any network call to avoid throwing on
invalid input; keep the existing hostname check on parsedUrl.hostname
(endsWith('.cdninstagram.com') or equals 'scontent.cdninstagram.com') and reject
otherwise with createError; call $fetch.raw with redirect: 'manual' (use
FetchOptions support in ofetch) so redirects are not followed, and after the
fetch check for 3xx response.status or presence of a Location header and throw
createError if a redirect was returned (to prevent redirect-based allowlist
bypass/SSRF).

In `@src/runtime/server/instagram-embed.ts`:
- Around line 44-51: The two inline arrow functions used as callbacks to the
.replace calls (the first replacing /https:\/\/scontent\.cdninstagram\.com.../
and the second replacing /https:\/\/static\.cdninstagram\.com.../) use
unnecessary parentheses around the single parameter `match`; remove the
parentheses so the callbacks become `match => ...` to satisfy the
`@stylistic/arrow-parens` lint rule in the instagram-embed replacement logic.
- Around line 16-18: The code calls new URL(postUrl) (creating parsedUrl) which
throws for malformed input and currently bubbles as a 500; wrap the URL parsing
in a try/catch around the new URL(postUrl) call inside the handler in
instagram-embed.ts, and on catch return/res.status(400) with a clear message
(e.g., "invalid postUrl") instead of allowing the exception to propagate; keep
the existing hostname check against parsedUrl.hostname after successful parsing.

In `@src/runtime/server/x-embed-image.ts`:
- Around line 5-28: Wrap the URL parsing in a try/catch around new URL(url)
(used where parsedUrl is created) and return a 400 error if parsing fails; after
successful parsing enforce the scheme is either http or https (e.g., check
parsedUrl.protocol === 'http:' || 'https:') before checking allowedDomains, and
treat any non-HTTP(S) or malformed URLs as bad requests; update the existing
error thrown (createError with statusCode 400) for these cases so domain checks
(allowedDomains array and includes(parsedUrl.hostname)) only run on valid,
HTTP(S) URLs.
- Around line 30-35: The image fetch using $fetch.raw (producing const response)
lacks a timeout and can hang SSR workers; add the ofetch timeout option
(milliseconds) to the options passed to $fetch.raw so the request is
automatically aborted after the specified duration (e.g., 5000ms), keeping the
existing headers and error handling intact; update the $fetch.raw call site in
x-embed-image.ts to include timeout and adjust any catch logic if needed to
handle the abort error.

In `@src/runtime/server/x-embed.ts`:
- Around line 47-64: The code currently interpolates the raw query value tweetId
(from getQuery(event)) directly into the fetch URL; validate that tweetId is
present and is a numeric string (or matches the expected id pattern) and return
a 400 if it fails, then build the request URL using URLSearchParams (or
encodeURIComponent) to safely encode id and the generated randomToken before
calling $fetch<TweetData>, referencing the existing tweetId, randomToken and
$fetch call so you only replace the direct template string with a
constructed/encoded query string and the added validation check.
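
Several of the items above ask for the same pattern: parse the URL defensively, restrict the scheme, then check the host allowlist, and reject bad input with a 400 rather than a 500. A minimal sketch of that pattern follows. The names `validateProxyTarget` and `isValidTweetId`, the result shape, and the example host are assumptions for illustration, not the module's actual code.

```ts
// Hypothetical result type: either a parsed URL or a 400-style rejection.
type UrlCheck
  = { ok: true, url: URL }
    | { ok: false, status: 400, message: string }

// Hypothetical helper: validate a user-supplied proxy target before fetching it.
function validateProxyTarget(raw: unknown, allowedHosts: string[]): UrlCheck {
  if (typeof raw !== 'string' || raw.length === 0)
    return { ok: false, status: 400, message: 'Missing url parameter' }

  let parsed: URL
  try {
    parsed = new URL(raw) // throws TypeError on malformed input
  }
  catch {
    return { ok: false, status: 400, message: 'Invalid url parameter' }
  }

  // Only HTTP(S) targets are acceptable; this rejects file:, data:, ftp:, etc.
  if (parsed.protocol !== 'https:' && parsed.protocol !== 'http:')
    return { ok: false, status: 400, message: 'Unsupported URL scheme' }

  // The domain check only runs on well-formed HTTP(S) URLs.
  if (!allowedHosts.includes(parsed.hostname))
    return { ok: false, status: 400, message: 'Domain not allowed' }

  return { ok: true, url: parsed }
}

// Hypothetical helper: accept only purely numeric tweet IDs.
function isValidTweetId(id: unknown): id is string {
  return typeof id === 'string' && /^\d+$/.test(id)
}
```

In a server handler, a failed check would be turned into `createError` with `statusCode: 400`, and the validated tweet ID would be passed through `URLSearchParams` when building the upstream request.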
♻️ Duplicate comments (1)
src/runtime/components/ScriptInstagramEmbed.vue (1)

31-36: Make the useAsyncData key reactive and unique

Line 31 computes shortcode once and Line 34 uses a static cache key, so changes to postUrl, captions, or apiEndpoint won't refetch, and invalid URLs collide on instagram-embed-undefined. This matches earlier feedback and still applies.

Suggested fix
-import type { HTMLAttributes } from 'vue'
+import type { HTMLAttributes } from 'vue'
+import { computed } from 'vue'
@@
-const shortcode = extractInstagramShortcode(props.postUrl)
+const shortcode = computed(() => extractInstagramShortcode(props.postUrl))
+const cacheKey = computed(() =>
+  `instagram-embed-${shortcode.value || encodeURIComponent(props.postUrl)}-${props.captions}-${props.apiEndpoint}`,
+)
 
 const { data: html, status, error } = useAsyncData<string>(
-  `instagram-embed-${shortcode}`,
+  cacheKey,
   () => $fetch(`${props.apiEndpoint}?url=${encodeURIComponent(props.postUrl)}&captions=${props.captions}`),
 )
🧹 Nitpick comments (2)
src/runtime/registry/instagram-embed.ts (1)

27-29: Consider URL parsing to avoid false positives in shortcode extraction.
The regex can match instagram.com/... inside unrelated URLs. Parsing with URL and validating hostname makes it more robust.

♻️ Suggested refactor
 export function extractInstagramShortcode(url: string): string | undefined {
-  const match = url.match(/instagram\.com\/(?:p|reel|tv)\/([^/?]+)/)
-  return match?.[1]
+  try {
+    const parsed = new URL(url)
+    if (!/(^|\.)instagram\.com$/.test(parsed.hostname))
+      return
+    const match = parsed.pathname.match(/^\/(p|reel|tv)\/([^/?#]+)/)
+    return match?.[2]
+  }
+  catch {
+    return
+  }
 }
playground/pages/third-parties/x-embed/default.vue (1)

44-46: Add an alt attribute for tweet photos (a11y).

Even decorative images should include alt="" to avoid screen reader noise.

✅ Proposed fix
-              <img v-for="photo in photos" :key="photo.url" :src="photo.proxiedUrl" class="w-full">
+              <img v-for="photo in photos" :key="photo.url" :src="photo.proxiedUrl" alt="" class="w-full">

coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/runtime/registry/x-embed.ts`:
- Around line 76-85: formatTweetDate uses the runtime's local timezone via
toLocaleString which can differ between SSR and client and cause hydration
mismatches; update formatTweetDate to produce a deterministic timezone (e.g.,
use a fixed zone like 'UTC' or a chosen IANA zone) by passing a timeZone option
to all Intl calls (or by constructing an Intl.DateTimeFormat with timeZone and
reusing it) so both the time and month parts use the same fixed timezone and
options (hour12, hour/minute) to ensure identical SSR and client output.
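
A minimal sketch of the deterministic formatting suggested above, assuming a fixed `en-US` locale and the `UTC` zone. The function name `formatTweetDateUtc` and the separator between the two parts are illustrative; the real `formatTweetDate` may use a different layout.

```ts
// Formatters are created once and reused; both pin locale and time zone
// so the server and the client render the same string.
const timeFormatter = new Intl.DateTimeFormat('en-US', {
  hour: 'numeric',
  minute: '2-digit',
  hour12: true,
  timeZone: 'UTC',
})
const dayFormatter = new Intl.DateTimeFormat('en-US', {
  month: 'short',
  day: 'numeric',
  year: 'numeric',
  timeZone: 'UTC',
})

// Hypothetical replacement for formatTweetDate with a fixed time zone.
function formatTweetDateUtc(input: string | Date): string {
  const date = typeof input === 'string' ? new Date(input) : input
  return `${timeFormatter.format(date)} · ${dayFormatter.format(date)}`
}
```

Pinning the locale matters as much as pinning the zone: a client whose default locale differs from the server's would otherwise still produce a different string.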
♻️ Duplicate comments (1)
src/runtime/server/instagram-embed.ts (1)

52-64: Harden script-tag stripping to avoid leaking third-party JS.
Line 63-64 uses a brittle regex that misses variants like </script >, so some scripts may slip through and violate the "no third-party JS" claim.

🔧 Suggested fix (more tolerant regex)
-    .replace(/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi, '')
+    .replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '')

coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/runtime/server/instagram-embed.ts`:
- Around line 89-104: The rewrittenHtml pipeline is missing removal of <script>
tags so Instagram's embed.js (and any inline/remote scripts) still remain;
update the chain that builds rewrittenHtml in instagram-embed.ts to also strip
all script tags by adding a replace for scripts (e.g., a global,
case-insensitive regex that matches <script ...>...</script> including optional
whitespace in the closing tag) after the noscript removal so both inline and
external scripts are removed and encoded URLs rewrites remain unchanged; target
the rewrittenHtml variable in this file when making the change.
🧹 Nitpick comments (2)
src/runtime/server/instagram-embed-asset.ts (1)

36-46: Consider adding a fetch timeout for resilience.

The fetch has no explicit timeout, which could cause the request to hang if the upstream CDN is slow or unresponsive. While Instagram's CDN is generally reliable, adding a timeout improves resilience.

♻️ Suggested improvement
   const response = await $fetch.raw(url, {
+    timeout: 10000, // 10 second timeout
     headers: {
       'Accept': '*/*',
       'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36',
     },
   }).catch((error: any) => {
src/runtime/server/instagram-embed.ts (1)

52-72: CSS link extraction regex is fragile.

Using two separate regex patterns to handle different attribute orderings is brittle. A single regex with a more flexible pattern would be more robust. Also, silently swallowing CSS fetch errors may mask issues in production.

♻️ Suggested improvement
-  // Extract CSS URLs from link tags
-  const cssUrls: string[] = []
-  const linkRegex = /<link[^>]+rel=["']stylesheet["'][^>]+href=["']([^"']+)["'][^>]*>/gi
-  let match
-  while ((match = linkRegex.exec(html)) !== null) {
-    cssUrls.push(match[1])
-  }
-  // Also check href before rel
-  const linkRegex2 = /<link[^>]+href=["']([^"']+)["'][^>]+rel=["']stylesheet["'][^>]*>/gi
-  while ((match = linkRegex2.exec(html)) !== null) {
-    cssUrls.push(match[1])
-  }
+  // Extract CSS URLs from link tags (handles any attribute order)
+  const cssUrls: string[] = []
+  const linkRegex = /<link[^>]*>/gi
+  let match
+  while ((match = linkRegex.exec(html)) !== null) {
+    const tag = match[0]
+    if (/rel=["']stylesheet["']/i.test(tag)) {
+      const hrefMatch = tag.match(/href=["']([^"']+)["']/i)
+      if (hrefMatch)
+        cssUrls.push(hrefMatch[1])
+    }
+  }

coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/runtime/server/instagram-embed.ts`:
- Around line 113-117: The current CSS injection into rewrittenHtml using
rewrittenHtml.replace(... `<style>${baseStyles}\n${combinedCss}</style>`) can be
broken if combinedCss contains a literal `</style>`; sanitize the CSS before
injection by escaping or neutralizing any `</style>` occurrences (e.g., replace
`</style>` with `<\/style>` or an equivalent safe token) so the injected string
remains a single safe style block; update the code that builds the injected
string (the place using baseStyles and combinedCss and the rewrittenHtml replace
call) to perform this replacement on combinedCss (and baseStyles if dynamic)
prior to concatenation.
🧹 Nitpick comments (2)
src/runtime/server/instagram-embed.ts (2)

52-65: Consider deduplicating CSS URLs.

The two regex patterns may capture the same URL if a link tag appears in both formats, resulting in duplicate fetches. A simple deduplication would improve efficiency:

Suggested fix
  // Extract CSS URLs from link tags
- const cssUrls: string[] = []
+ const cssUrlsSet = new Set<string>()
  const linkRegex = /<link[^>]+rel=["']stylesheet["'][^>]+href=["']([^"']+)["'][^>]*>/gi
  let match
  while ((match = linkRegex.exec(html)) !== null) {
    if (match[1])
-     cssUrls.push(match[1])
+     cssUrlsSet.add(match[1])
  }
  // Also check href before rel
  const linkRegex2 = /<link[^>]+href=["']([^"']+)["'][^>]+rel=["']stylesheet["'][^>]*>/gi
  while ((match = linkRegex2.exec(html)) !== null) {
    if (match[1])
-     cssUrls.push(match[1])
+     cssUrlsSet.add(match[1])
  }
+ const cssUrls = [...cssUrlsSet]

28-33: Consider supporting mobile Instagram URLs.

The hostname check allows only instagram.com and www.instagram.com, but users might paste mobile URLs from m.instagram.com. Consider adding it to the allowlist if this is a valid use case.

Comment on lines +113 to +117
// Inject inlined CSS into head
rewrittenHtml = rewrittenHtml.replace(
'</head>',
`<style>${baseStyles}\n${combinedCss}</style></head>`,
)

⚠️ Potential issue | 🟡 Minor

The inlined CSS is open to HTML injection if it contains </style>.

If the fetched CSS contains a </style> string (e.g., in a comment or malformed content), it would break the style block and potentially allow HTML injection. Consider escaping or using a safer injection method.

Suggested fix
+ // Escape any </style> sequences in CSS to prevent injection
+ const safeCss = combinedCss.replace(/<\/style/gi, '<\\/style')
  // Inject inlined CSS into head
  rewrittenHtml = rewrittenHtml.replace(
    '</head>',
-   `<style>${baseStyles}\n${combinedCss}</style></head>`,
+   `<style>${baseStyles}\n${safeCss}</style></head>`,
  )

Comment on lines +91 to +93
let rewrittenHtml = html
// Remove all scripts - embed works without JS via Googlebot UA
.replace(/<script[\s\S]*?<\/script>/gi, '')

Check failure

Code scanning / CodeQL

Incomplete multi-character sanitization High

This string may still contain <script, which may cause an HTML element injection vulnerability.

Copilot Autofix

AI 1 day ago

In general, the problem is that a single pass of a multi-character regex to remove <script>...</script> blocks can, in contrived cases, leave behind a new <script sequence after removal. To fix this, we should repeatedly apply the removal until no more <script>...</script> blocks remain, as recommended in the background, thereby ensuring that no script block can re-form from the remnants of a previous match.

The best targeted fix here is to wrap the script-removal replacement in a small loop: keep calling .replace(/<script[\s\S]*?<\/script>/gi, '') on the HTML until the call no longer changes the string (i.e., a fixpoint). This preserves all existing behavior (we're still using the same regex and still ultimately stripping the same things) while closing off the multi-character sanitization gap. To keep the surrounding chain of .replace calls readable, we can first normalize html into a local mutable variable (which is already done as let rewrittenHtml = html), and then replace the single .replace(/<script... call with a small block that repeatedly strips scripts before continuing with the rest of the chained replacements.

Concretely, within src/runtime/server/instagram-embed.ts around lines 91–98, we will:

  • Initialize let rewrittenHtml = html as before.
  • Add a do { ... } while loop or while (true) loop that:
    • Stores the previous value.
    • Applies .replace(/<script[\s\S]*?<\/script>/gi, '').
    • Breaks when the value no longer changes.
  • Then continue the existing chain of .replace calls (for <link>, <noscript>, URL rewriting, etc.) starting from this fully script-stripped rewrittenHtml.
    No imports or external libraries are required for this change.
Suggested changeset 1
src/runtime/server/instagram-embed.ts

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/src/runtime/server/instagram-embed.ts b/src/runtime/server/instagram-embed.ts
--- a/src/runtime/server/instagram-embed.ts
+++ b/src/runtime/server/instagram-embed.ts
@@ -89,8 +89,18 @@
   `
 
   let rewrittenHtml = html
-    // Remove all scripts - embed works without JS via Googlebot UA
-    .replace(/<script[\s\S]*?<\/script>/gi, '')
+
+  // Remove all scripts - embed works without JS via Googlebot UA
+  // Apply repeatedly to avoid incomplete multi-character sanitization
+  while (true) {
+    const previous = rewrittenHtml
+    rewrittenHtml = rewrittenHtml.replace(/<script[\s\S]*?<\/script>/gi, '')
+    if (rewrittenHtml === previous) {
+      break
+    }
+  }
+
+  rewrittenHtml = rewrittenHtml
     // Remove link tags (we're inlining CSS)
     .replace(/<link[^>]+rel=["']stylesheet["'][^>]*>/gi, '')
     .replace(/<link[^>]+href=["'][^"']+\.css[^"']*["'][^>]*>/gi, '')
EOF

let rewrittenHtml = html
// Remove all scripts - embed works without JS via Googlebot UA
.replace(/<script[\s\S]*?<\/script>/gi, '')

Check failure

Code scanning / CodeQL

Bad HTML filtering regexp High

This regular expression does not match script end tags like </script >.

Copilot Autofix

AI 1 day ago

In general, to fix this issue you must either (a) stop trying to strip scripts with hand-written regexes and instead use a proper HTML parser/sanitizer, or (b) if you must stay with regex, at least make the pattern robust against the common browser-tolerated variants of script end tags. The goal is to ensure that all <script> blocks are removed, including those whose closing tags contain extra whitespace or attributes.

The least invasive fix here, without changing overall behavior, is to strengthen the existing regex so that it matches script end tags that have optional whitespace before the name, optional additional junk or attributes after script, and optional whitespace before the closing >. For example: </script>, </script >, </script foo="bar">, etc. We can do this by replacing </script> with a more permissive pattern such as </\s*script[^>]*>. The rest of the expression (<script[\s\S]*?) remains as-is, still doing a non-greedy match across any content.

Concretely, in src/runtime/server/instagram-embed.ts, update the .replace call on line 93 to use a more robust expression: change /<script[\s\S]*?<\/script>/gi to /<script[\s\S]*?<\/\s*script[^>]*>/gi. No additional imports or helper methods are required.

Suggested changeset 1
src/runtime/server/instagram-embed.ts

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/src/runtime/server/instagram-embed.ts b/src/runtime/server/instagram-embed.ts
--- a/src/runtime/server/instagram-embed.ts
+++ b/src/runtime/server/instagram-embed.ts
@@ -90,7 +90,7 @@
 
   let rewrittenHtml = html
     // Remove all scripts - embed works without JS via Googlebot UA
-    .replace(/<script[\s\S]*?<\/script>/gi, '')
+    .replace(/<script[\s\S]*?<\/\s*script[^>]*>/gi, '')
     // Remove link tags (we're inlining CSS)
     .replace(/<link[^>]+rel=["']stylesheet["'][^>]*>/gi, '')
     .replace(/<link[^>]+href=["'][^"']+\.css[^"']*["'][^>]*>/gi, '')
EOF

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/runtime/server/instagram-embed.ts`:
- Around line 91-98: The current script-stripping chain that builds
rewrittenHtml using a single .replace(/<script[\s\S]*?<\/script>/gi, '') is
unsafe; update the logic that constructs rewrittenHtml so script tags are
removed defensively: either run the script-removal regex in a loop until no
change (using a flexible closing-tag pattern that tolerates whitespace like
</script\s*> and case/attribute trickery) to handle nested/obfuscated fragments,
or better yet replace the ad-hoc removals with a proper HTML sanitizer (e.g.,
DOMPurify or sanitize-html) applied to the original html before the
link/noscript removals; keep the existing link and noscript removals but perform
them after sanitization/iterative script-removal to ensure all script nodes are
eliminated.
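
The "loop until no change" option mentioned above could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's implementation: `stripScriptsIteratively`, `SCRIPT_BLOCK`, and the `maxPasses` cap are hypothetical names, and the pattern is the one from the Autofix suggestion.

```typescript
// Script-block pattern borrowed from the Autofix suggestion.
const SCRIPT_BLOCK = /<script[\s\S]*?<\/\s*script[^>]*>/gi

// Hypothetical helper: repeatedly strips script blocks until the string
// stops changing, so fragments that reassemble into a new <script> tag
// after one pass are caught by the next pass.
function stripScriptsIteratively(html: string, maxPasses = 10): string {
  let current = html
  for (let pass = 0; pass < maxPasses; pass++) {
    const next = current.replace(SCRIPT_BLOCK, '')
    if (next === current)
      return next // fixed point reached, nothing left to strip
    current = next
  }
  // The pass cap was hit; the result may still contain script fragments.
  return current
}

// Removing the inner block leaves "<script>alert(1)</script>", which only
// a second pass removes.
const nested = '<scr<script>x</script>ipt>alert(1)</script><p>ok</p>'
const cleaned = stripScriptsIteratively(nested)
```

This still does not handle an unclosed `<script>` tag, which is one reason the review also suggests a dedicated sanitizer such as DOMPurify or sanitize-html.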
♻️ Duplicate comments (1)
src/runtime/server/instagram-embed.ts (1)

115-119: Inlined CSS allows HTML injection if the fetched CSS contains </style>.

If Instagram's CSS response contains a </style> string (in comments or malformed content), it breaks out of the style block and could enable HTML injection. This concern was previously raised but appears unaddressed.

🔒 Proposed fix to escape closing style tags
+ // Escape any </style> sequences in CSS to prevent injection
+ const safeCss = combinedCss.replace(/<\/style/gi, '<\\/style')
  // Inject inlined CSS into head
  rewrittenHtml = rewrittenHtml.replace(
    '</head>',
-   `<style>${baseStyles}\n${combinedCss}</style></head>`,
+   `<style>${baseStyles}\n${safeCss}</style></head>`,
  )
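
The escaping step in the proposed fix can be exercised on its own. The sketch below assumes two hypothetical helpers, `escapeStyleCloseTags` and `buildInlineStyle`, that mirror the diff above; `baseStyles` is treated as trusted, and only the fetched CSS is escaped.

```typescript
// Hypothetical helper mirroring the proposed fix: rewrites "</style" as
// "<\/style" so fetched CSS cannot terminate the surrounding style block.
// Inside CSS, "\/" is an escaped "/", so the stylesheet's meaning is
// expected to stay the same.
function escapeStyleCloseTags(css: string): string {
  return css.replace(/<\/style/gi, '<\\/style')
}

// Hypothetical helper: builds the inline style element from trusted base
// styles and untrusted fetched CSS.
function buildInlineStyle(baseStyles: string, combinedCss: string): string {
  return `<style>${baseStyles}\n${escapeStyleCloseTags(combinedCss)}</style>`
}

// A CSS comment that tries to break out of the style block.
const hostileCss = '/* </STYLE><img src=x> */ a{color:red}'
const styleBlock = buildInlineStyle('body{margin:0}', hostileCss)
```

After escaping, the only closing style tag left in `styleBlock` is the one the template itself appends.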
🧹 Nitpick comments (2)
src/runtime/server/instagram-embed.ts (2)

52-65: Minor: CSS link extraction may miss edge cases.

The two regex patterns cover rel before/after href, but could miss links with additional attributes between them (e.g., <link type="text/css" href="..." rel="stylesheet">). This is low-risk since missing CSS only affects styling, not security.

♻️ More robust alternative using a single flexible pattern
- const linkRegex = /<link[^>]+rel=["']stylesheet["'][^>]+href=["']([^"']+)["'][^>]*>/gi
- let match
- while ((match = linkRegex.exec(html)) !== null) {
-   if (match[1])
-     cssUrls.push(match[1])
- }
- // Also check href before rel
- const linkRegex2 = /<link[^>]+href=["']([^"']+)["'][^>]+rel=["']stylesheet["'][^>]*>/gi
- while ((match = linkRegex2.exec(html)) !== null) {
-   if (match[1])
-     cssUrls.push(match[1])
- }
+ const linkRegex = /<link[^>]+>/gi
+ let match
+ while ((match = linkRegex.exec(html)) !== null) {
+   const tag = match[0]
+   if (/rel=["']stylesheet["']/i.test(tag)) {
+     const hrefMatch = /href=["']([^"']+)["']/i.exec(tag)
+     if (hrefMatch?.[1])
+       cssUrls.push(hrefMatch[1])
+   }
+ }
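
Wrapped in a function, the single-pattern variant from the suggestion above can be tried against sample markup. This is a sketch only: `extractStylesheetUrls` and the sample HTML are hypothetical, and like the suggestion it assumes quoted attribute values and a `rel` value of exactly `stylesheet`.

```typescript
// Hypothetical wrapper around the suggested approach: find every <link>
// tag first, then inspect rel and href independently of their order.
function extractStylesheetUrls(html: string): string[] {
  const cssUrls: string[] = []
  const linkRegex = /<link[^>]+>/gi
  let match: RegExpExecArray | null
  while ((match = linkRegex.exec(html)) !== null) {
    const tag = match[0]
    // Only stylesheet links are of interest; icons, preloads, etc. are skipped.
    if (/rel=["']stylesheet["']/i.test(tag)) {
      const hrefMatch = /href=["']([^"']+)["']/i.exec(tag)
      if (hrefMatch?.[1])
        cssUrls.push(hrefMatch[1])
    }
  }
  return cssUrls
}

const sampleHead = [
  '<link rel="stylesheet" href="/a.css">',
  '<link type="text/css" href="/b.css" rel="stylesheet">',
  '<link rel="icon" href="/favicon.ico">',
].join('')
const foundUrls = extractStylesheetUrls(sampleHead)
```

Checking `rel` and `href` separately per tag keeps the logic in one loop, at the cost of two small regex tests per link.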

121-125: Consider adding security headers for the HTML response.

Since this endpoint returns HTML that will be rendered, consider adding security headers to mitigate potential risks from any unintended content:

πŸ›‘οΈ Optional security headers
  // Cache for 10 minutes
  setHeader(event, 'Content-Type', 'text/html')
  setHeader(event, 'Cache-Control', 'public, max-age=600, s-maxage=600')
+ setHeader(event, 'X-Content-Type-Options', 'nosniff')

@harlan-zw harlan-zw merged commit aa542c0 into main Jan 24, 2026
9 of 11 checks passed
@harlan-zw harlan-zw deleted the feat/social-media-embeds branch January 24, 2026 14:44


Development

Successfully merging this pull request may close these issues.

Support Social Media Embeds

2 participants