-
Notifications
You must be signed in to change notification settings - Fork 665
feat(xml/unstable): add XML parsing and serialization module #6942
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
could we get some benchmarks comparing to other parsers? |
Yes, that's a good idea. I'll look into it. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6942 +/- ##
==========================================
+ Coverage 94.08% 94.21% +0.12%
==========================================
Files 600 610 +10
Lines 43553 45516 +1963
Branches 6997 7466 +469
==========================================
+ Hits 40977 42881 +1904
- Misses 2521 2574 +53
- Partials 55 61 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Updated the description now. Let me know if you would like to benchmark against a specific package |
e20f201 to
9b1fe20
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I have no idea why this formatting is happening 😅
|
Ref: denoland/deno#24995 |
Thanks, I was not aware of this discussion. Perhaps we could have it as an unstable module for now? Then we can always kick it out, should the core team decide to implement |
|
Updated again to increase streaming performance and get test coverage to 100% |
|
More perf work. Will look into using callbacks instead of arrays of objects |
|
More perf gains on non-streaming parsing. Updated benchmark results |
|
@tomas-zijdemans the perf looks really good! what machine are you running this on for reference? |
|
@crowlKats: A MacBook with M1 Max |
|
could we add https://www.w3.org/XML/Test/ ? |
Hmm, yes that would be good. It says it contains over 2000 files though, not small! 😅 |
I've looked into it, the test suite has about 2250 tests (several MB), so it's not practical to include in std. We could add a script that download it and creates a conformance report. The parser is intentionally lenient, per now it handles the valid tests (for XML 1.0), but does not handle the not-well-formed ones very well (these are mostly already documented in the PR). It seems like the usual way of dealing with the more "theoretical" edge cases is to add a strict mode that enforces XML conformance . WDYT, is that worth it? Per now, this initial PR is still open - so I don't know if there's appetite to include it in std yet. |
New XML parsing and serialization module
What @std/xml has:
What @std/xml doesn't have:
Benchmark Results
Performance work never really ends, and you often find yourself comparing apples and oranges. Anyway. Here goes.
The challengers
Error position tracking is nice for debugging, but really hurts performance. So I made it an option that defaults to true for non-streaming and false for streaming (streaming is usually for trusted data sources. Multi-GB feeds or logs where throughput is critical). The results below contain both with and without error position tracking.
Test data
I used the test files located in
testdatafor non-streaming. I used one 597MB file for the streaming benchmark (google product data), but didn't check that intotestdata. Other payloads may give different results.Small Files (<10KB) — Median Results
1 Large File (301KB) — Median Results
Streaming (a 597MB file) — Median Results