fix(mdxish): <HTMLBlocks> inside <Table> not rendering #1484
eaglethrost wants to merge 15 commits
It seems like most of the complexity in this PR is dealing with the marker that protectHTMLBlockContent adds.
But we recently merged #1455 which shields multi-line template literal content in component bodies from html parsing. I wonder if it's possible to just remove protectHTMLBlockContent to significantly simplify everything here? I definitely didn't consider the HTMLBlock use case when working on #1455 and never tried removing that pre-processor.
i have this work in #1439 to create a tokenizer for our HTMLBlock? maybe that can help with eliminating the need to protect html block content altogether? @eaglethrost @kevinports
Yeah @kevinports @maximilianfalco I've had a rethink of the approach and I find that we can reuse both Kevin's work in #1455 and falco's tokenizer work in #1439.
Will move this to draft first to consolidate the combined logic & investigate the bug
Having this fix working & now allowing HTML blocks inside tables uncovered a bug where the block content in the table gets indented in the editor, but actually renders fine in view mode. Interestingly, this also happens in the old editor in MDX, so it doesn't look like an issue from this PR specifically and would be a separate fix I'll investigate. Demo of this in the old editor & MDX project, notice how the block content gets indented after round trips: Screen.Recording.2026-05-28.at.11.10.10.pm.mov
It looks like it's taking the indented space literally in the deserialization, so an editor-side fix might be required.
kevinports
left a comment
Very glad to drop all the preprocessing with this revised approach. Lgtm 👍
🎯 What does this PR do?
To fix an issue where `<HTMLBlock>` does not render inside a JSX `<Table>`, this PR makes substantial changes to how we parse HTMLBlock syntax: it moves away from the string-level content protection we've been doing and reuses the existing MDX tokenizer instead.

Root cause of rendering issue: We have a preprocessing step in the pipeline where HTMLBlock bodies are encoded into an HTML-comment marker (`<!--RDMX_HTMLBLOCK:…-->`) in `preprocessJSXExpressions`, then decoded further down the pipeline to be transformed into HTMLBlock nodes. When the `<HTMLBlock>` is inside a `<Table>`, the table transformer, which still sees the encoded HTMLBlock, fails to parse it: it uses remarkMdx, which rejects HTML comments, so the table is never parsed. The blocks were encoded because we didn't want their content to be modified by other preprocessing steps, and because their use of curly braces could cause expression parsing issues.

Approach: We can now stop protecting and decoding. The `mdxComponent` tokenizer can capture component bodies, including multiline `{…}` template literals, thanks to the brace-aware body states added in #1455, so we let the tokenizer claim `<HTMLBlock>` and read its body straight from the parsed template-literal expression. No marker round-trip, no comment for remarkMdx to choke on. (This is the same direction as @maximilianfalco's HTMLBlock-tokenizer work in #1439.)

What changed:
- `<HTMLBlock>`: Split the exclusion set so the micromark `mdxComponent` construct captures `<HTMLBlock>` (new `TOKENIZER_MDX_COMPONENT_EXCLUDED_TAGS`), while the remark string-reparse transforms still leave it alone; re-parsing it there is what would mangle bodies containing unbalanced-looking braces.
- (`mdxish-html-blocks.ts`) The transformer now deals with different input data to extract:
  - JSX element nodes (`mdxJsxFlowElement`/`mdxJsxTextElement`): block context (e.g. `<Callout>`) and table cells (after their remarkMdx re-parse);
  - `<div>` (CommonMark slurps these whole, so we split them back out);
  - `<HTMLBlock>` open/close arriving as separate siblings around the expression.
- `mdxish-tables` keeps a table as a JSX `<Table>` when a cell contains an `<HTMLBlock>` (block-level content a GFM cell can't represent).
- Removed `protectHTMLBlockContent` + the `RDMX_HTMLBLOCK` markers, the base64 encode/decode paths, and the table-specific comment-neutralization workaround. HTMLBlock handling collapses from four locations down to one.

🧪 QA tips
- Put an `<HTMLBlock>` inside a `<Table>` cell and confirm the HTML renders without breaking the table, and sibling cells still get markdown.
- Check that `safeMode`/`runScripts` survive, and multiple HTMLBlocks in one table all render.
- `<HTMLBlock>` and `<HTMLBlock>` in a generic `<div>` still render as before.
- See `__tests__/lib/mdxish/html-blocks.test.ts`.

Demo (before & after):
Screen.Recording.2026-05-25.at.7.32.39.pm.mov
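
For reviewers who want a mental model of the input shapes listed under "What changed", here is a rough sketch of reading an HTMLBlock body straight from the template-literal expression. It is not the code in `mdxish-html-blocks.ts`: the node interfaces are trimmed-down stand-ins for the real mdast/MDX types, the helper names (`templateLiteralBody`, `fromJsxElement`, `fromSiblings`) are hypothetical, and it assumes the open/close tags arrive as `html` nodes and that the body is a single template literal without `${}` interpolation.

```typescript
// Simplified, hypothetical node shapes; the real mdast/MDX nodes carry more fields.
interface JsxElement {
  type: "mdxJsxFlowElement" | "mdxJsxTextElement";
  name: string | null;
  children: MdNode[];
}
interface Expression {
  type: "mdxFlowExpression" | "mdxTextExpression";
  value: string; // raw source between the braces
}
interface Html {
  type: "html";
  value: string;
}
type MdNode = JsxElement | Expression | Html;

const isJsx = (n: MdNode): n is JsxElement =>
  n.type === "mdxJsxFlowElement" || n.type === "mdxJsxTextElement";
const isExpression = (n: MdNode): n is Expression =>
  n.type === "mdxFlowExpression" || n.type === "mdxTextExpression";

// Strip the surrounding backticks from a template-literal expression source.
// Assumption: one plain template literal; escapes inside it are left untouched.
function templateLiteralBody(expr: string): string | null {
  const trimmed = expr.trim();
  if (trimmed.length < 2 || !trimmed.startsWith("`") || !trimmed.endsWith("`")) {
    return null;
  }
  return trimmed.slice(1, -1);
}

// Shape: an already-parsed JSX element named HTMLBlock wrapping the expression.
function fromJsxElement(node: MdNode): string | null {
  if (!isJsx(node) || node.name !== "HTMLBlock") return null;
  const expr = node.children.find(isExpression);
  return expr ? templateLiteralBody(expr.value) : null;
}

// Shape: open tag, expression, close tag arriving as three separate siblings.
function fromSiblings(siblings: MdNode[], start: number): string | null {
  const [open, expr, close] = siblings.slice(start, start + 3);
  if (!open || !expr || !close) return null;
  if (open.type !== "html" || !/^<HTMLBlock\b[^>]*>$/.test(open.value.trim())) return null;
  if (close.type !== "html" || close.value.trim() !== "</HTMLBlock>") return null;
  return isExpression(expr) ? templateLiteralBody(expr.value) : null;
}

// Usage: braces inside the body need no protection because nothing re-parses it.
const example: MdNode = {
  type: "mdxJsxFlowElement",
  name: "HTMLBlock",
  children: [{ type: "mdxFlowExpression", value: "`<p>hi {x}</p>`" }],
};
console.log(fromJsxElement(example));
```

The point of the sketch is that the body is taken verbatim from the expression node, so no marker has to be encoded before parsing or decoded afterwards.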