perf(super-converter): replace xml-js with direct sax tree builder for DOCX import (SD-2291) #2514

Closed
caio-pizzol wants to merge 1 commit into main from caio/sd-2291-replace-xml-js-with-native-domparser-for-import

@caio-pizzol
Contributor

Bypass xml-js's overhead (options validation, generic dispatch, JSON stringify+parse round-trip) by using sax.js directly with a purpose-built tree builder that produces the same non-compact JSON format.
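
To make the approach concrete, here is a minimal sketch of a tree builder driven by sax-style events. It is not the code from this PR: `createTreeBuilder` and `parseWithSax` are hypothetical names, the node shape (`type`, `name`, `attributes`, `elements`) is my reading of xml-js's non-compact format, and the whitespace handling is deliberately simplified. The `sax` module is passed in as a parameter so the builder itself has no dependencies.

```javascript
// Sketch only: builds xml-js style non-compact nodes from sax events.
// createTreeBuilder and parseWithSax are hypothetical, not the PR's functions.
function createTreeBuilder() {
  const root = { elements: [] };
  const stack = [root]; // the last entry is the currently open element

  return {
    root,
    onopentag(node) {
      const element = { type: 'element', name: node.name };
      // Assumption: `attributes` is omitted when the tag has none.
      if (node.attributes && Object.keys(node.attributes).length > 0) {
        element.attributes = { ...node.attributes };
      }
      const parent = stack[stack.length - 1];
      if (!parent.elements) parent.elements = [];
      parent.elements.push(element);
      stack.push(element);
    },
    ontext(text) {
      // Simplification: ignore text outside the root element, keep the rest.
      if (stack.length === 1) return;
      const parent = stack[stack.length - 1];
      if (!parent.elements) parent.elements = [];
      parent.elements.push({ type: 'text', text });
    },
    onclosetag() {
      stack.pop();
    },
  };
}

// Wires the builder to a sax parser. `sax` is the sax.js module, injected by
// the caller (for example `parseWithSax(require('sax'), xml)`).
function parseWithSax(sax, xml) {
  const builder = createTreeBuilder();
  const parser = sax.parser(true); // strict mode preserves tag-name case
  parser.onopentag = builder.onopentag;
  parser.ontext = builder.ontext;
  parser.onclosetag = builder.onclosetag;
  parser.onerror = (err) => {
    throw err;
  };
  parser.write(xml).close();
  return builder.root;
}
```

Because the nodes are created directly as plain objects, there is no intermediate JSON string to serialize and re-parse, which is the round-trip the description refers to.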

Benchmarks on customer's 11MB document.xml:

  • Old (JSON.parse(xml2json())): ~1,980ms (browser)
  • New (direct sax builder): ~810ms (browser)
  • ~2x faster across all document sizes
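
For anyone wanting to reproduce numbers like these, a timing helper along the following lines could be used. This is a sketch under assumptions, not the harness behind the figures above: `medianTimeMs` and `speedup` are hypothetical names, and it relies on a global `performance.now()` (available in browsers and recent Node versions).

```javascript
// Hypothetical timing helper; not the benchmark code used for the PR numbers.
function medianTimeMs(fn, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  // Median is less sensitive to a slow first run than the mean.
  return samples[Math.floor(samples.length / 2)];
}

// Ratio of old to new time; values above 1 mean the new path is faster.
function speedup(oldMs, newMs) {
  return oldMs / newMs;
}
```

Applied to the two timings quoted for the 11MB document, `speedup(1980, 810)` comes out at roughly 2.4.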

The new parser produces identical output — verified by 13 unit tests including head-to-head comparison against xml-js on realistic DOCX fragments.
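
A head-to-head check of this kind could look roughly like the sketch below. It is illustrative only and is not taken from the PR's test suite: `firstDifference` and `compareParsers` are hypothetical helpers, and the two parse functions are supplied by the caller.

```javascript
// Hypothetical helper: returns the path of the first mismatch, or null if equal.
function firstDifference(a, b, path = '$') {
  if (a === b) return null;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) {
    return path;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return path;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return path;
  for (const key of keysA) {
    if (!Object.prototype.hasOwnProperty.call(b, key)) return `${path}.${key}`;
    const diff = firstDifference(a[key], b[key], `${path}.${key}`);
    if (diff) return diff;
  }
  return null;
}

// Runs both parsers over each fragment and reports only the ones that differ.
function compareParsers(parseOld, parseNew, fragments) {
  return fragments
    .map((xml) => ({ xml, diff: firstDifference(parseOld(xml), parseNew(xml)) }))
    .filter((result) => result.diff !== null);
}
```

An empty array from `compareParsers` means the two parsers agreed on every fragment; otherwise each entry names the fragment and the path where the trees first diverge.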

@caio-pizzol caio-pizzol self-assigned this Mar 22, 2026
@linear

linear Bot commented Mar 22, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9caeb608c8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread packages/super-editor/src/core/super-converter/xml-parser.js
…r DOCX import

Bypass xml-js's overhead (options validation, generic dispatch, JSON
stringify+parse round-trip) by using sax.js directly with a purpose-built
tree builder that produces the same non-compact JSON format.

Benchmarks on customer's 11MB document.xml:
- Old (JSON.parse(xml2json())): ~1,980ms (browser)
- New (direct sax builder):     ~810ms  (browser)
- ~2x faster across all document sizes

The new parser produces identical output — verified by 13 unit tests
including head-to-head comparison against xml-js on realistic DOCX
fragments.

SD-2291
@caio-pizzol caio-pizzol force-pushed the caio/sd-2291-replace-xml-js-with-native-domparser-for-import branch from 9caeb60 to b6f254a Compare March 22, 2026 12:10
@caio-pizzol caio-pizzol marked this pull request as draft March 23, 2026 14:46
@caio-pizzol caio-pizzol added the "don't merge" (Don't merge yet) label Mar 23, 2026