Performance regression: 37-79% slower parsing between v5.3.7 and v5.5.9 #816

@scolladon

Description

After upgrading from fast-xml-parser@5.3.7 to 5.5.9, we observed a consistent performance regression across all our benchmarks: v5.3.7 achieves 37–79% more ops/sec than v5.5.9 (a 1.37x to 1.79x throughput ratio). The regression is most severe for parse-only operations (64–79%) and also affects full merge pipelines (37–51%).

Benchmark Results

| Benchmark | v5.3.7 (ops/sec) | v5.5.9 (ops/sec) | Ratio |
|---|---|---|---|
| parse-small | 698 ±0.97% | 390 ±0.58% | 1.79x slower |
| parse-medium | 80 ±1.23% | 46 ±2.00% | 1.74x slower |
| parse-large | 18 ±1.64% | 11 ±6.66% | 1.64x slower |
| merge-small-no-conflict | 317 ±3.88% | 231 ±3.06% | 1.37x slower |
| merge-small-with-conflict | 353 ±1.86% | 248 ±1.02% | 1.42x slower |
| merge-medium-no-conflict | 42 ±0.80% | 30 ±0.79% | 1.40x slower |
| merge-medium-with-conflict | 43 ±3.65% | 31 ±1.05% | 1.39x slower |
| merge-large-no-conflict | 10 ±1.78% | 7 ±0.92% | 1.43x slower |
| merge-ordered-globalvalueset | 513 ±1.80% | 347 ±1.82% | 1.48x slower |
| merge-picklist-customfield | 624 ±2.27% | 413 ±2.34% | 1.51x slower |

Benchmarks were run on the same CI runner (Ubuntu, Node 20) with the same code; only the fast-xml-parser version changed.

Parser Options Used

const parserOptions = {
  cdataPropName: '#cdata',
  commentPropName: '#comment',
  ignoreAttributes: false,
  processEntities: false,
  ignoreDeclaration: true,
  numberParseOptions: { leadingZeros: false, hex: false },
  parseAttributeValue: false,
  parseTagValue: false,
  preserveOrder: true,
  trimValues: false,
}

Root Cause Analysis

After reviewing the code diff between v5.3.7 and v5.5.9, we identified several contributing factors:

1. jPath string replaced by Matcher object (highest impact)

The simple jPath string concatenation (jPath += "." + tagName) was replaced with a Matcher class from path-expression-matcher. This introduces:

  • Matcher.push() on every opening tag: creates a new object with { tag, position, counter, namespace, values }, iterates over a Map to calculate position, updates a Map for sibling tracking. Previously: single string concatenation.
  • Matcher.pop() on every closing tag: pops an array, truncates sibling stacks, returns node object. Previously: jPath.substring(0, jPath.lastIndexOf(".")).
  • Matcher.toString() called 6+ times per tag in the hot path (in parseTextData, buildAttributesMap, addChild, replaceEntitiesValue, saveTextToParentTag). Each call does this.path.map(n => n.tag).join(sep) — allocating a new array and string every time. Previously the string was already available.
  • readonlyMatcher Proxy: every property access goes through a Proxy get trap, checking MUTATING_METHODS.has(prop), then Reflect.get(). For .path and .siblingStacks, it creates frozen copies on every access.
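To make the cost difference concrete, here is a minimal sketch contrasting the two path-tracking strategies. `StringPath` and `NodeStackPath` are hypothetical stand-ins written for this report, not the actual fast-xml-parser or path-expression-matcher code; they only mirror the behavior described in the bullets above (and leave out position, counter, and sibling tracking entirely).

```javascript
// Hypothetical stand-in for the v5.3.7 approach: the path is one string.
class StringPath {
  constructor() {
    this.jPath = '';
  }
  push(tagName) {
    // Single string concatenation per opening tag.
    this.jPath = this.jPath === '' ? tagName : this.jPath + '.' + tagName;
  }
  pop() {
    // Drop the last segment on a closing tag.
    const lastDot = this.jPath.lastIndexOf('.');
    this.jPath = lastDot === -1 ? '' : this.jPath.substring(0, lastDot);
  }
  toString() {
    return this.jPath; // already materialized, nothing to build
  }
}

// Hypothetical stand-in for a node-stack matcher: the path is an array of
// node objects and the string form is rebuilt on every toString() call.
class NodeStackPath {
  constructor() {
    this.path = [];
    this.stringBuilds = 0; // instrumentation for this sketch only
  }
  push(tagName) {
    this.path.push({ tag: tagName });
  }
  pop() {
    return this.path.pop();
  }
  toString(sep = '.') {
    this.stringBuilds++;
    return this.path.map((n) => n.tag).join(sep); // new array + new string
  }
}

const stringPath = new StringPath();
const nodePath = new NodeStackPath();
for (const tag of ['root', 'item', 'name']) {
  stringPath.push(tag);
  nodePath.push(tag);
}

// Simulate six path lookups while handling a single tag.
for (let i = 0; i < 6; i++) {
  stringPath.toString();
  nodePath.toString();
}
console.log(stringPath.toString(), nodePath.stringBuilds);
```

Both classes produce the same path string, but the node-stack variant rebuilds it on each of the six lookups, while the string variant just returns a value it already holds.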

2. Two-pass attribute parsing

buildAttributesMap() now processes attributes in two passes: first to build rawAttrsForMatcher, then again with full matcher context. Each attribute goes through resolveNameSpace(), value extraction, and replaceEntitiesValue() twice.

3. Removed indexOf('&') early-exit in replaceEntitiesValue

v5.3.7 had an early return at the top of replaceEntitiesValue:

if (val.indexOf('&') === -1) return val;

This was removed. Now every text value enters the function and evaluates the config, even when processEntities: false. The vast majority of XML text content has no &, so this early exit was highly effective.
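As an illustration of why the guard is cheap and safe, here is a minimal sketch of an entity-replacement function with the early exit in place. `replaceEntitiesSketch` and its small entity table are hypothetical and much simpler than the library's real `replaceEntitiesValue`; only the `indexOf('&')` check is taken from v5.3.7.

```javascript
// Hypothetical, simplified entity table (the real parser supports more).
const ENTITIES = {
  '&lt;': '<',
  '&gt;': '>',
  '&amp;': '&',
  '&quot;': '"',
  '&apos;': "'",
};

function replaceEntitiesSketch(val, options) {
  // Fast path from v5.3.7: text without '&' cannot contain an entity,
  // so return it untouched before looking at any configuration.
  if (val.indexOf('&') === -1) return val;

  // Entity processing disabled: leave the text as-is.
  if (!options.processEntities) return val;

  // Single-pass replacement so '&amp;lt;' becomes '&lt;' and not '<'.
  return val.replace(/&(?:lt|gt|amp|quot|apos);/g, (match) => ENTITIES[match]);
}

console.log(replaceEntitiesSketch('a &lt; b', { processEntities: true }));
```

Because a string without `&` cannot contain an entity reference, the guard returns the same result as the full path would for that input, regardless of which entity options are enabled.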

4. Per-tag security validation

Every tag now goes through sanitizeName() (checks against criticalProperties and DANGEROUS_PROPERTY_NAMES using .includes()), transformTagName(), strictReservedNames check, extractNamespace(), and maxNestedTags depth check. Individually cheap, but they accumulate across thousands of tags.

Suggested Optimizations

  1. Cache Matcher.toString() result — compute once per push()/pop(), not on every callback invocation. This would eliminate the biggest hot-path allocation.
  2. Restore indexOf('&') early-exit in replaceEntitiesValue — no reason to remove this; it's compatible with all entity configurations.
  3. Make two-pass attribute parsing conditional — only do the second pass when PEM features (path expressions with attribute matchers) are actually in use.
  4. Avoid Proxy for readonlyMatcher when no callbacks use it — or cache the frozen copies instead of recreating them on every access.
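The first suggestion could look roughly like the sketch below. `CachedPath` is a hypothetical class, not the path-expression-matcher API, and it assumes a fixed separator; if `toString()` accepts a per-call separator, the cache would need to be keyed by it. The sketch invalidates the cache lazily on `push()`/`pop()` instead of recomputing eagerly, so tags whose path is never read pay nothing.

```javascript
// Hypothetical path tracker that caches its string form and invalidates
// the cache whenever the path changes.
class CachedPath {
  constructor(sep = '.') {
    this.sep = sep;
    this.path = [];
    this.cached = null; // null means "stale, rebuild on next toString()"
    this.rebuilds = 0;  // instrumentation for this sketch only
  }
  push(tagName) {
    this.path.push({ tag: tagName });
    this.cached = null;
  }
  pop() {
    const node = this.path.pop();
    this.cached = null;
    return node;
  }
  toString() {
    if (this.cached === null) {
      // Only the first call after a push()/pop() allocates.
      this.cached = this.path.map((n) => n.tag).join(this.sep);
      this.rebuilds++;
    }
    return this.cached;
  }
}

const cachedPath = new CachedPath();
cachedPath.push('root');
cachedPath.push('item');

// Six lookups for the same tag trigger a single rebuild.
for (let i = 0; i < 6; i++) cachedPath.toString();
console.log(cachedPath.toString(), cachedPath.rebuilds);
```

With this shape, repeated `toString()` calls within one tag reuse the same string, which is close to the cost profile of the old jPath string.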

Environment

  • Node.js: v20.20.1
  • OS: Ubuntu 24.04 (GitHub Actions runner)
  • Benchmarks: Vitest bench (powered by tinybench)

Reproducing

The benchmarks are from sf-git-merge-driver CI. The regression is 100% reproducible by swapping fast-xml-parser versions.

We understand the changes were motivated by important security fixes and the #793 O(n²) bug fix. We've accepted the regression on our side for now but wanted to report it in case optimizations can be applied without reverting the security/correctness improvements.
