Skip to content

feat: export translateMagicBlocks helper#1473

Open
RohanADev01 wants to merge 3 commits into
nextfrom
rohan/rm-16598-export-magic-block-extractors-from-readmemarkdown-so-gitto
Open

feat: export translateMagicBlocks helper#1473
RohanADev01 wants to merge 3 commits into
nextfrom
rohan/rm-16598-export-magic-block-extractors-from-readmemarkdown-so-gitto

Conversation

@RohanADev01
Copy link
Copy Markdown

@RohanADev01 RohanADev01 commented May 15, 2026

📖 PR App 🎫 RM-16598

Summary

  • New public helper: Exports translateMagicBlocks from @readme/markdown so consumers (e.g. gitto's AI lint preprocessor) can translate legacy [block:image] syntax into <Image /> JSX without vendoring their own magic-block parser.
  • Scoped translation: Only [block:image] blocks are translated; all other magic block types and malformed blocks pass through unchanged.
  • Line-count preserving: The serialized replacement is padded so the output's total line count matches the input — keeps downstream line-number-sensitive consumers (lint UI) accurate.

Changes

New helper module

  • lib/translateMagicBlocks.ts: drives the existing extractor → tokenizer → transformer pipeline for [block:image] spans only, then serializes the transformed MDAST into <Image /> JSX. Captioned and uncaptioned arms are both handled; the uncaptioned arm preserves align / border / width that mdast-util-to-markdown's default image handler would otherwise drop.
  • JSX attribute values containing ambiguous characters (", \, <, >, {, }, control chars) are emitted in JSX-expression-attribute form (key={JSON.stringify(value)}) so the output is parseable MDX.
  • Per-span try/catch: malformed JSON or transformer exceptions leave the original raw block in place rather than throwing.
  • Hard-codes safeMode: true internally so <<variables>> and {user.*} round-trip as literal text.

Public export

  • lib/index.ts: re-exports the helper adjacent to stripComments, matching the existing barrel convention.

Tests

  • __tests__/lib/translateMagicBlocks.test.ts: covers captioned/uncaptioned image translation, attribute preservation, first-image-wins, line-count preservation, malformed-JSON fall-through, unknown-block-type pass-through, and MDX-ambiguous attribute escaping.

Customer Impact

Internal/library-only. This PR is purely additive — it ships a new named export with no changes to existing behavior. Customer-facing impact lands when downstream consumers (gitto) swap their vendored helper for this export in a follow-up PR.

QA & Testing

  • npm test -- translateMagicBlocks passes
  • Existing test suite passes unchanged (no behavior change to other exports)
  • Manual smoke: import translateMagicBlocks and confirm [block:image] content with align / border / width round-trips to <Image /> JSX with attributes preserved
  • Verify non-image magic blocks ([block:callout], [block:embed], etc.) and malformed [block:image] JSON pass through unchanged

@RohanADev01 RohanADev01 marked this pull request as ready for review May 15, 2026 16:04
Copy link
Copy Markdown
Contributor

@eaglethrost eaglethrost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start! Using the magic block tokenizer & transformer is definitely the right call. The main thing is I think the serialisation code can be made simpler & we won't need to create as this much helper functions, see my comments!

Comment thread lib/translateMagicBlocks.ts Outdated

type AttributeValue = boolean | number | string;

interface SerializableNode {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, i've switched to the existing types so that image case uses MagicBlockImage directly and the generic walker uses MdastNode

Comment thread lib/translateMagicBlocks.ts Outdated
*/
export default function translateMagicBlocks(content: string) {
MAGIC_BLOCK_REGEX.lastIndex = 0;
return content.replace(MAGIC_BLOCK_REGEX, match => translateImageBlock(match));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious on how we can make this scalable for other magic blocks. If we want to translate more types, is it as simple as to just extend this to include more translators?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep that's the shape now. there's a translators map keyed by block type, so a new one is a function plus one line to register it. each translator still owns its own parse + validate + AST construction since the block shapes are all different

Comment thread lib/translateMagicBlocks.ts Outdated
return image ? serializeImage(image, caption || undefined) : null;
}

function serializeTransformedNode(node: SerializableNode): string | null {
Copy link
Copy Markdown
Contributor

@eaglethrost eaglethrost May 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're doing too much manual serialization. I wonder if you can just directly use the readmeToMdx function, or maybe mdxishMdastToMd.

So instead of manual serializers, you can extend the processor to use more transformers & stringify-ers to convert the image MDAST back to string.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've moved off the string templating, we're building the mdxJsxFlowElement and letting mdxishMdastToMd do the actual stringify pass. didn't pull in the full readmeToMdx since this path only needs the image branch, but the serializer is the same one now

Comment thread lib/translateMagicBlocks.ts Outdated
return null;
}

function serializeTransformedBlock(tree: unknown) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this function gets retained, we should use Tree as the type

Copy link
Copy Markdown
Author

@RohanADev01 RohanADev01 May 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep we can switch to Tree from unist if we'd rather keep it generic

@RohanADev01 RohanADev01 requested a review from eaglethrost May 18, 2026 18:24
Copy link
Copy Markdown

@Theskrtnerd Theskrtnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than what Dimas said, code lgtm. But just commenting since I dont have much context about this codebase

Copy link
Copy Markdown
Contributor

@maximilianfalco maximilianfalco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did a rough skim of things but just wondering if it would be better to just generalise this to support all magic block types anyway? the magic block tokenizer and transformer already handles this so i dont think itll be big to extend this a bit more

iirc the main reason we are doing this is because image magic block syntax was not known generally and Image JSX was more well known. wont this scenario also come up eventually when we want to "read" other magic block types? wdyt? @eaglethrost @RohanADev01

Copy link
Copy Markdown
Contributor

@eaglethrost eaglethrost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating, looks better!

wondering if it would be better to just generalise this to support all magic block types anyway?

  • I agree with @maximilianfalco on this! I don't mind having this PR focus on image first, but this should be explained in the PR desc, and also the code should be extensible & designed with this in mind (already good in that regard, maybe add more comments about it)

  • @maximilianfalco Looked through the code and it's not possible to get serialised JSX syntax if the nodes are not mdxFlow / Text elements, so Rohan's transformers make sense here. We're able to do it in the editor because of the pmToMdast code, which also manually creates JSX syntax. I think you should put a comment to explain this @RohanADev01 on what the to mdx transformers are for.

  • Also have a comment about code structure & cleanliness!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this file mixes a lot of different code & use cases, so I think we should split up this file & move the utilities, transformers elsewhere. One idea is to have a lib/translate folder, under which you can have various block translators, utility functions, etc. Currently it's a bit confusing to read this

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree on this!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea or maybe you can just create a general translator class for magic blocks and do like a decorator pattern for each magic block type


type MagicBlockImageNode = MagicBlockImage & MdastNode;

const MAGIC_BLOCK_OPEN_RE = /^\[block:([^\]]{1,100})\]/;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this regex is part of the MAGIC_BLOCK_REGEX, can you add a comment mentioning that?

Copy link
Copy Markdown
Contributor

@maximilianfalco maximilianfalco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall nice! just some nits and questions. thanks heaps for iterating on this

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree on this!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea or maybe you can just create a general translator class for magic blocks and do like a decorator pattern for each magic block type

hProperties?: {
align?: string;
border?: string;
border?: boolean;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait why do we change this? this looks like it has a pretty large blast radius


function translateMagicBlock(raw: string) {
const blockType = raw.match(MAGIC_BLOCK_OPEN_RE)?.[1];
const translator = blockType ? translators[blockType] : undefined;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i love this, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants