fix(stripComments): preserve HTMLBlock template literals in mdxish mode #1410
eaglethrost wants to merge 8 commits into next
maximilianfalco left a comment
Nice, thanks for finding this out! The logic looks good to me, but this feels more like a band-aid than an actual fix. I wonder if we can do something like I did in #1371, where we make a tokenizer specifically to keep the <HTMLBlock> token from being parsed by another transformer.
The actual logic can still live in mdxish-html-blocks.ts, but we create a tokenizer essentially to guard the <HTMLBlock> from being parsed by the mdxJsxExpression tokenizer.
I'd prefer moving away from the whole protect-and-restore paradigm we have, but I understand if we want to get a band-aid fix rolled out in the meantime.
Yeah, I agree: a tokenizer for HTMLBlock is definitely the way to go in the future, and it should just follow the example in Tables. One consideration, though, is that adding the tokenizer to fix this issue would be overkill, I think, since integrating it would create quite a lot of changes with a large blast radius. We have quite a lot of code to handle HTML blocks in mdxish and quite a few things would change, so it will require more time, and I want to get this fix out as soon as possible. Also, the extractor code change in this PR should persist since it's a cleanup. I will definitely do the tokenizer in a follow-up! There we can remove the extract & replace in stripComments, but in the meantime the issue will already be fixed. Thanks for the suggestion.
Summary
An mdxish doc was producing messy markdown (from the "View as markdown" option). This usually happens when there's an error in the rendering pipeline.
The issue turned out to be with `<HTMLBlock>` elements. The `<HTMLBlock>{backtick...backtick}</HTMLBlock>` expression caused `stripComments` to error with "Unexpected end of file in expression" when `mdxish: true`, because the `mdxExpression` micromark parser can't handle JS template literals at the text level. This is fine with MDX because it uses `remarkMdx`, which takes care of this JSX syntax.
To fix this, I considered whether we need to touch HTMLBlocks in strip comments at all. I believe they should be left untouched: if we're already not touching HTML magic blocks, it makes sense to not touch MDX magic blocks either. Given that, the simplest fix is to extract & replace HTMLBlocks like we do for magic blocks, for mdxish specifically.
Let me know if we don't actually want that and should not preprocess the HTMLBlock though.
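As a rough illustration of the extract & replace idea described above, here is a minimal sketch. It is not the code in this PR: the function names, the placeholder format, and the regex-based comment stripper standing in for the real `stripComments` pipeline are all hypothetical.

```typescript
// Hypothetical sketch only: names and the placeholder format are illustrative,
// not the actual API of the shared html-blocks extractor.
const HTML_BLOCK_RE = /<HTMLBlock\b[^>]*>[\s\S]*?<\/HTMLBlock>/g;
const PLACEHOLDER_RE = /___HTML_BLOCK_(\d+)___/g;

interface ProtectedSource {
  text: string; // source with each HTMLBlock swapped for a placeholder
  blocks: string[]; // original HTMLBlock strings, indexed by placeholder number
}

// Swap every <HTMLBlock>...</HTMLBlock> for an opaque placeholder so later
// steps never see the template literal inside it.
function protectHTMLBlocks(source: string): ProtectedSource {
  const blocks: string[] = [];
  const text = source.replace(HTML_BLOCK_RE, (match) => {
    blocks.push(match);
    return `___HTML_BLOCK_${blocks.length - 1}___`;
  });
  return { text, blocks };
}

// Put the original blocks back. A replacer function is used so that any "$"
// inside a block is inserted literally rather than read as a replacement pattern.
function restoreHTMLBlocks(text: string, blocks: string[]): string {
  return text.replace(PLACEHOLDER_RE, (placeholder, index: string) => {
    return blocks[Number(index)] ?? placeholder;
  });
}

// Stand-in for the real comment-stripping step (assumption: the real one
// parses the document; this one just removes HTML comments with a regex).
function stripHtmlComments(text: string): string {
  return text.replace(/<!--[\s\S]*?-->/g, "");
}

// Protect, strip, restore: HTMLBlock contents pass through untouched.
function stripCommentsPreservingHTMLBlocks(source: string): string {
  const { text, blocks } = protectHTMLBlocks(source);
  return restoreHTMLBlocks(stripHtmlComments(text), blocks);
}
```

With this shape, a comment outside an HTMLBlock is removed while anything inside the block, including comment-like text in the template literal, is returned unchanged.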
Changes:
- Moved the HTMLBlock extraction logic from `preprocess-jsx-expressions.ts` and `mdxish-html-blocks.ts` into a shared `lib/utils/extractors/html-blocks.ts` module
- Used the shared extractor in `stripComments` as well, for mdxish
Testing