fix(mdxish): <HTMLBlocks> inside <Table> not rendering#1484

Open
eaglethrost wants to merge 15 commits into next from dimas/rm-16726-htmlblock-not-rendering-in-tables

Conversation

@eaglethrost
Contributor

@eaglethrost eaglethrost commented May 25, 2026

🎫 Resolve RM-16726

🎯 What does this PR do?

This PR fixes <HTMLBlock> not rendering inside a JSX <Table>. To do so, it substantially changes how we parse HTMLBlock syntax: instead of the string-level content protection we've been doing, it reuses the existing MDX tokenizer.

Root cause of rendering issue: We have a preprocessing step in the pipeline, preprocessJSXExpressions, where HTMLBlock bodies are encoded into an HTML-comment marker (<!--RDMX_HTMLBLOCK:…-->) and then decoded further down the pipeline to be transformed into HTMLBlock nodes. When the <HTMLBlock> is inside a <Table>, the table transformer still sees the encoded marker and fails to parse it, because it uses remarkMdx, which rejects HTML comments, so the table is never parsed. The blocks were encoded because we didn't want their content to be modified by other preprocessing steps, and because their use of curly braces could cause expression-parsing issues.
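
For context, here is a minimal sketch of the kind of marker round trip described above. It is not the removed code: `protect` and `restore` are hypothetical names, the regex ignores attributes, and the payload uses `encodeURIComponent` only to keep the sketch dependency-free (the real path used base64).

```typescript
// Hypothetical marker helpers; the real marker payload format is not shown in this PR.
const MARKER_PREFIX = "<!--RDMX_HTMLBLOCK:";
const MARKER_SUFFIX = "-->";

// Matches <HTMLBlock ...>{`body`}</HTMLBlock>; attributes are dropped in this sketch.
const HTML_BLOCK_RE = /<HTMLBlock[^>]*>\s*\{`([\s\S]*?)`\}\s*<\/HTMLBlock>/g;

// Swap each HTMLBlock for an HTML comment so later string-level steps cannot touch the body.
function protect(source: string): string {
  return source.replace(
    HTML_BLOCK_RE,
    (_match: string, body: string) =>
      MARKER_PREFIX + encodeURIComponent(body) + MARKER_SUFFIX
  );
}

// Turn each marker back into an HTMLBlock with its original body.
function restore(source: string): string {
  const markerRe = /<!--RDMX_HTMLBLOCK:([\s\S]*?)-->/g;
  return source.replace(
    markerRe,
    (_match: string, payload: string) =>
      "<HTMLBlock>{`" + decodeURIComponent(payload) + "`}</HTMLBlock>"
  );
}

const input = "<td><HTMLBlock>{`<div>Hello</div>`}</HTMLBlock></td>";
const protectedSource = protect(input);
```

Between `protect` and `restore`, the cell contains only an HTML comment, which is the form the table transformer's remarkMdx re-parse cannot accept.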

Approach: We can now stop protecting and decoding altogether. The mdxComponent tokenizer can capture component bodies, including multiline {} template literals, thanks to the brace-aware body states added in #1455, so we let the tokenizer claim <HTMLBlock> and read its body straight from the parsed template-literal expression. No marker round-trip, no comment for remarkMdx to choke on. (This is the same direction as @maximilianfalco's HTMLBlock-tokenizer work in #1439.)
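
A rough sketch of what "reading the body straight from the parsed template-literal expression" can look like. The node shapes are loosely modelled on mdast-util-mdx (expression nodes carry the raw expression source in `value`); `readHTMLBlockBody` is a hypothetical helper, not the transformer in this PR, and it does not handle escaped backticks or `${}` interpolation.

```typescript
// Minimal node shapes; only the fields this sketch needs.
interface ExpressionNode {
  type: "mdxFlowExpression" | "mdxTextExpression";
  value: string; // raw expression source, without the surrounding braces
}
interface TextNode {
  type: "text";
  value: string;
}
interface JsxElement {
  type: "mdxJsxFlowElement" | "mdxJsxTextElement";
  name: string | null;
  children: Child[];
}
type Child = ExpressionNode | TextNode | JsxElement;

function isExpression(node: Child): node is ExpressionNode {
  return node.type === "mdxFlowExpression" || node.type === "mdxTextExpression";
}

// Hypothetical helper: returns the raw HTML inside <HTMLBlock>{`...`}</HTMLBlock>, or null.
function readHTMLBlockBody(node: JsxElement): string | null {
  if (node.name !== "HTMLBlock") return null;
  for (const child of node.children) {
    if (!isExpression(child)) continue;
    const source = child.value.trim();
    const first = source.charAt(0);
    const last = source.charAt(source.length - 1);
    // Only a single template literal is accepted; anything else is left alone.
    if (source.length >= 2 && first === "`" && last === "`") {
      return source.slice(1, -1);
    }
  }
  return null;
}
```

Because the body comes from the expression node rather than from a re-parsed string, braces inside the HTML never reach the expression parser a second time.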

What changed:

  • Tokenizer claims <HTMLBlock>. Split the exclusion set so the micromark mdxComponent construct captures <HTMLBlock> (new TOKENIZER_MDX_COMPONENT_EXCLUDED_TAGS), while the remark string-reparse transforms still leave it alone — re-parsing it there is what would mangle bodies containing unbalanced-looking braces.
  • Adjusted the HTML block transformer (mdxish-html-blocks.ts). The transformer now extracts HTMLBlocks from three different input shapes:
    1. JSX element (mdxJsxFlowElement/mdxJsxTextElement) — block context (e.g. <Callout>) and table cells (after their remarkMdx re-parse);
    2. Raw HTML blob — single-line top-level, or nested in raw HTML like an inline <div> (CommonMark slurps these whole, so we split them back out);
    3. Inline-in-paragraph <HTMLBlock>, with open/close tags arriving as separate siblings around the expression.
  • mdxish-tables keeps a table as a JSX <Table> when a cell contains an <HTMLBlock> (block-level content a GFM cell can't represent).
  • Removed the marker machinery entirely: protectHTMLBlockContent + the RDMX_HTMLBLOCK markers, the base64 encode/decode paths, and the table-specific comment-neutralization workaround. HTMLBlock handling collapses from four locations down to one.
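
To illustrate the exclusion-set split and the table rule from the list above, a small sketch under stated assumptions: only `TOKENIZER_MDX_COMPONENT_EXCLUDED_TAGS` and HTMLBlock's membership come from this PR. `STRING_REPARSE_EXCLUDED_TAGS`, `"OtherExcludedTag"`, and all helper functions are hypothetical placeholders, and the lists are plain arrays for simplicity.

```typescript
// Hypothetical tag lists; "OtherExcludedTag" is a placeholder, not a real component.
// The string-reparse transforms still skip HTMLBlock; the tokenizer no longer does.
const STRING_REPARSE_EXCLUDED_TAGS: readonly string[] = ["HTMLBlock", "OtherExcludedTag"];
const TOKENIZER_MDX_COMPONENT_EXCLUDED_TAGS: readonly string[] = ["OtherExcludedTag"];

function tokenizerClaims(tag: string): boolean {
  return TOKENIZER_MDX_COMPONENT_EXCLUDED_TAGS.indexOf(tag) === -1;
}

function stringReparseTouches(tag: string): boolean {
  return STRING_REPARSE_EXCLUDED_TAGS.indexOf(tag) === -1;
}

// Minimal tree shape; real mdast/MDX nodes carry more fields.
interface TreeNode {
  type: string;
  name?: string | null;
  children?: TreeNode[];
}

// True when the node or any descendant is a JSX <HTMLBlock> element.
function containsHTMLBlock(node: TreeNode): boolean {
  const isJsx =
    node.type === "mdxJsxFlowElement" || node.type === "mdxJsxTextElement";
  if (isJsx && node.name === "HTMLBlock") return true;
  return (node.children ?? []).some((child) => containsHTMLBlock(child));
}

// Hypothetical decision helper: keep the JSX <Table> if any cell holds an <HTMLBlock>,
// since a GFM table cell cannot represent block-level content.
function shouldKeepJsxTable(table: TreeNode): boolean {
  return containsHTMLBlock(table);
}
```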

🧪 QA tips

  • Render an <HTMLBlock> inside a <Table> cell and confirm the HTML renders without breaking the table, and sibling cells still get markdown:
    <Table>
      <tbody>
        <tr>
          <td>**bold** still works</td>
          <td><HTMLBlock>{`<div style="color: red;">Hello</div>`}</HTMLBlock></td>
        </tr>
      </tbody>
    </Table>
  • Confirm safeMode/runScripts survive, and multiple HTMLBlocks in one table all render.
  • Confirm top-level <HTMLBlock> and <HTMLBlock> in a generic <div> still render as before.
  • New coverage added in __tests__/lib/mdxish/html-blocks.test.ts.

Demo (before & after):

Screen.Recording.2026-05-25.at.7.32.39.pm.mov

@eaglethrost eaglethrost changed the title from "fix: render HTMLBlocks nested inside JSX blocks (tables)" to "fix(mdxish): <HTMLBlocks> inside <Table> not rendering" May 25, 2026
Comment thread processor/transform/mdxish/rehype-html-blocks-in-jsx.ts Fixed
@eaglethrost eaglethrost marked this pull request as ready for review May 25, 2026 09:36
@eaglethrost eaglethrost requested a review from kevinports May 26, 2026 12:21
Contributor

@kevinports kevinports left a comment

It seems like most of the complexity in this PR is dealing with the marker that protectHTMLBlockContent adds.

But we recently merged #1455 which shields multi-line template literal content in component bodies from html parsing. I wonder if it's possible to just remove protectHTMLBlockContent to significantly simplify everything here? I definitely didn't consider the HTMLBlock use case when working on #1455 and never tried removing that pre-processor.

@maximilianfalco
Contributor

i have this work in #1439 to create a tokenizer for our HTMLBlock? maybe that can help with eliminating the need to protect html block content altogether? @eaglethrost @kevinports

@eaglethrost
Contributor Author

eaglethrost commented May 27, 2026

Yeah @kevinports @maximilianfalco I've had a rethink of the approach and I find that we can reuse both Kevin's work in #1455 and falco's tokenizer work in #1439:

  • fix(mdxish): terminal custom component breaks page rendering #1455 actually lets us reuse the MDX tokenizer to tokenise <HTMLBlock>, so we don't fully need the standalone tokenizer. The main thing, as Kevin mentioned, is that we won't need to protect the html blocks, which solves the fundamental issue of the block not rendering in tables
  • This means we can also simplify the html block transformer
  • I think it's worth combining both of your works so we get the refactor and fix the original ticket
  • I also found a bug when integrating to the main app where the content inside the HTMLBlock placed inside a table gets unexpectedly indented in serialisation. This could be an issue in the editor serialization

Will move this to draft first to consolidate the combined logic & investigate the bug

@eaglethrost eaglethrost marked this pull request as ready for review May 28, 2026 08:42
@eaglethrost eaglethrost marked this pull request as draft May 28, 2026 12:26
@eaglethrost eaglethrost marked this pull request as ready for review May 28, 2026 13:06
@eaglethrost
Contributor Author

Having this fix working & now allowing html blocks inside tables uncovered a bug where the block content in the table gets indented in the editor but renders fine in view mode. Interestingly, this also happens in the old editor in MDX, so it doesn't look like an issue from this PR specifically; it would be a separate fix that I'll investigate.

Demo of this in old editor & MDX project, notice how the block content gets indented after round trips:

Screen.Recording.2026-05-28.at.11.10.10.pm.mov

It looks like it's taking the indented space literally in the deserialization, so an editor-side fix might be required.

Contributor

@kevinports kevinports left a comment

Very glad to drop all the preprocessing with this revised approach. Lgtm 👍


4 participants