textmacros: TextParser instances accumulate in nested ParseOptions.parsers across html.convert() calls #1474

@mackdelany

Summary

In long-lived MathDocument instances, every html.convert(tex) that contains \text{...} leaves one or more TextParser instances stranded on the nested ParseOptions.parsers array stashed in tex.parseOptions.packageData.get('textmacros'). Each leaked TextParser retains its Stack, StackItems, and the parsed MathML subtree, so memory grows linearly with the number of conversions. No documented public API (html.clear(), html.reset({ all: true }), tex.reset()) clears this pool.

Consequence for downstream users: a browser app that calls html.convert() to render equations on demand (rather than batch-typesetting an HTML document) accumulates ~1 parser, ~1 Stack, and the equation's full MathML tree per \text{...} per render, indefinitely. We caught this via Chrome heap snapshots in a real product: MmlMo, MmlMi, Attributes, and TextNode counts growing ~5× over 60 seconds of UI interaction, with Stack count rising from 105 → 482 in the same window.

Affected versions

  • mathjax-full@3.2.1 — confirmed via the repro below
  • mathjax-full@3.2.2 — source diff against 3.2.1 shows no relevant changes
  • mathjax-full@4.0.0-beta.7 — same buggy code paths present in cjs/input/tex/TexParser.js and cjs/input/tex/textmacros/TextMacrosConfiguration.js (verified by inspection; couldn't run the repro against 4.x because AllPackages was reorganized, but the underlying mechanism is identical)

Reproduction

// node mathjax-textmacros-leak-repro.mjs
// npm install mathjax-full@3.2.1

import { TeX } from "mathjax-full/js/input/tex.js"
import { SVG } from "mathjax-full/js/output/svg.js"
import { liteAdaptor } from "mathjax-full/js/adaptors/liteAdaptor.js"
import { HTMLDocument } from "mathjax-full/js/handlers/html/HTMLDocument.js"
import { AllPackages } from "mathjax-full/js/input/tex/AllPackages.js"

const adaptor = liteAdaptor()
const doc = adaptor.createDocument()
adaptor.document = doc

const tex = new TeX({ packages: AllPackages })
const svg = new SVG({ fontCache: "none" })
const html = new HTMLDocument(doc, adaptor, { InputJax: tex, OutputJax: svg })

const parserCount = () => {
  const pkg = html.inputJax[0].parseOptions.packageData.get("textmacros")
  return pkg?.parseOptions?.parsers?.length ?? 0
}

const equations = [
  "\\text{Manual Override}",
  "\\text{kN \\cdot m} + 42",
  "\\frac{\\text{a}}{\\text{b}}",
  "P_{\\text{ult}} = 1.5 \\text{ kN}",
  "\\text{stress} = \\frac{F}{A}",
]

for (let cycle = 0; cycle < 3; cycle += 1) {
  for (const t of equations) html.convert(t, { display: false })
  console.log(`after ${(cycle + 1) * equations.length} converts: textmacros parsers = ${parserCount()}, html.math = ${Array.from(html.math).length}`)
  html.clear()
  console.log(`  html.clear():       textmacros parsers = ${parserCount()}`)
  html.reset({ all: true })
  console.log(`  html.reset({all}):  textmacros parsers = ${parserCount()}`)
}

Output:

after 5 converts:  textmacros parsers = 7,  html.math = 0
  html.clear():       textmacros parsers = 7
  html.reset({all}):  textmacros parsers = 7
after 10 converts: textmacros parsers = 14, html.math = 0
  html.clear():       textmacros parsers = 14
  html.reset({all}):  textmacros parsers = 14
after 15 converts: textmacros parsers = 21, html.math = 0
  html.clear():       textmacros parsers = 21
  html.reset({all}):  textmacros parsers = 21

Note that html.math stays at 0 — convert() correctly creates a local MathItem that isn't added to doc.math. The leak is in the nested package state, which neither clear() nor reset({ all: true }) touches.

Root cause

Two interacting issues:

1. Asymmetric push/pop in TexParser. js/input/tex/TexParser.js:74 (constructor):

this.configuration.pushParser(this)
this.stack = new Stack(...)
this.Parse()

pushParser runs unconditionally on construction. The matching popParser only fires on the success path of mml() (js/input/tex/TexParser.js:170-177):

TexParser.prototype.mml = function () {
    if (!this.stack.Top().isKind('mml')) {
        return null;          // ← early return, no popParser
    }
    var node = this.stack.Top().First;
    this.configuration.popParser();
    return node;
}

Any sub-parser whose stack doesn't terminate as 'mml' leaks. Identical code in 4.0.0-beta.7 at cjs/input/tex/TexParser.js:186-194.
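The asymmetry can be reproduced in isolation with a toy model. The sketch below is not MathJax code: ToyOptions and ToyParser are hypothetical stand-ins that copy only the push-in-constructor, pop-on-success shape described above, and topKind is an invented shortcut for "the kind of item left on top of the stack".

```javascript
// Toy stand-ins (hypothetical, not MathJax classes) that mimic only the
// push-in-constructor / pop-on-success lifetime described above.
class ToyOptions {
  constructor() {
    this.parsers = []
  }
  pushParser(parser) {
    this.parsers.push(parser)
  }
  popParser() {
    this.parsers.pop()
  }
}

class ToyParser {
  constructor(options, topKind) {
    this.options = options
    this.topKind = topKind // kind of the item left on top of the stack
    this.options.pushParser(this) // unconditional, as in the constructor
  }
  mml() {
    if (this.topKind !== "mml") {
      return null // early return: this parser is never popped
    }
    this.options.popParser()
    return "<math/>"
  }
}

const options = new ToyOptions()
new ToyParser(options, "mml").mml() // success path: pushed, then popped
new ToyParser(options, "start").mml() // failure path: pushed, never popped
new ToyParser(options, "start").mml() // failure path again

console.log(options.parsers.length) // 2
```

Only the two parsers that took the early return remain on the array, which is the same shape of growth the repro reports.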

2. textmacros stashes a private ParseOptions that no public clear() reaches. js/input/tex/textmacros/TextMacrosConfiguration.js:52-59:

var parseOptions = new ParseOptions(textConf, []);
...
parseOptions.packageData.set('textmacros', { parseOptions: parseOptions, jax: jax, texParser: null });

tex.compile() calls parseOptions.clear() on the outer ParseOptions at the start of every compile, which resets its parsers array. But the nested ParseOptions instance stored as packageData.get('textmacros').parseOptions is never cleared by anything in the public API. MathDocument.clear() → this.math.clear(); MathDocument.reset({ all: true }) → TeX.reset(tag), which only resets equation numbering. Neither cascades into packageData.

The same pattern is present in 4.0.0-beta.7 at cjs/input/tex/textmacros/TextMacrosConfiguration.js:56-63.
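The reachability problem can be shown in a few lines. This is a minimal sketch with a hypothetical ToyParseOptions class, assuming only what is stated above: clear() resets the instance's own parsers array and does not look inside packageData.

```javascript
// Hypothetical stand-in for ParseOptions, reduced to the parts that matter here.
class ToyParseOptions {
  constructor() {
    this.parsers = []
    this.packageData = new Map()
  }
  clear() {
    this.parsers = [] // resets this instance only; packageData is untouched
  }
}

const outer = new ToyParseOptions()
const nested = new ToyParseOptions()
// Same layout the repro reads: outer.packageData.get("textmacros").parseOptions
outer.packageData.set("textmacros", { parseOptions: nested })

outer.parsers.push("outer parser")
nested.parsers.push("stranded text parser")

outer.clear() // what compile() does on the outer options

console.log(outer.parsers.length) // 0
console.log(nested.parsers.length) // 1
```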

Suggested fixes (any one would close the leak)

  1. Pop on failure path in TexParser.mml() — call this.configuration.popParser() before the return null at line 172. This fixes the asymmetric-lifetime root cause for the parsers array directly.
  2. Make MathDocument.reset({ all: true }) cascade into packageData — iterate inputJax.parseOptions.packageData and call entry.parseOptions?.clear?.() on every entry that owns a nested ParseOptions. This gives downstream users a documented escape hatch.
  3. Have textmacros register a reset() hook — somewhere in the configuration lifecycle so the nested ParseOptions is cleared whenever the outer one is.

Probably worth doing both (1) and (2) — (1) closes the bug, (2) gives users a remediation path even if other packages adopt similar packageData stashes in the future.
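Fix (1) could look roughly like the following. This is a sketch, not a tested patch against MathJax: the function body mirrors the compiled mml() quoted earlier, it is written as a free function so it runs without the library, and fakeParser is a hypothetical stub that exists only to exercise both paths.

```javascript
// Sketch of fix (1): pop on both paths. In the library this would be the
// body of TexParser.prototype.mml, with `parser` replaced by `this`.
function mmlWithSymmetricPop(parser) {
  const top = parser.stack.Top()
  // Always undo the constructor's pushParser, whether or not parsing
  // ended on an 'mml' item.
  parser.configuration.popParser()
  return top.isKind("mml") ? top.First : null
}

// Hypothetical stub exposing just the members the function touches.
function fakeParser(kind, parsers) {
  const parser = {
    stack: { Top: () => ({ isKind: (k) => k === kind, First: "node" }) },
    configuration: { popParser: () => parsers.pop() },
  }
  parsers.push(parser) // stands in for pushParser in the constructor
  return parser
}

const parsers = []
console.log(mmlWithSymmetricPop(fakeParser("mml", parsers))) // "node"
console.log(mmlWithSymmetricPop(fakeParser("start", parsers))) // null
console.log(parsers.length) // 0
```

Whether any caller depends on a failed parser staying on the parsers array is something I have not checked, so treat the unconditional pop as a starting point for discussion.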

Current workaround

Users who hit this can manually clear nested ParseOptions after each convert:

function resetMathJaxParserState(html) {
  for (const jax of html.inputJax) {
    jax.parseOptions.packageData.forEach((entry) => {
      entry?.parseOptions?.clear?.()
    })
  }
}

This relies on ParseOptions.clear() being public (which it is, per js/input/tex/ParseOptions.d.ts). Reaching into packageData is more invasive but the structure is stable across 3.x and 4.0.0-beta.7.
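To make sure the reset is never skipped, the call can be folded into a wrapper around convert(). A minimal sketch: convertAndReset is a hypothetical helper name, the try/finally and the extra optional chaining on parseOptions are additions for illustration, and the loop repeats resetMathJaxParserState so the block stands alone.

```javascript
// Hypothetical helper: convert, then clear nested ParseOptions even if
// convert() throws. `html` is expected to be a MathDocument such as the
// HTMLDocument built in the repro.
function convertAndReset(html, tex, options = {}) {
  try {
    return html.convert(tex, options)
  } finally {
    for (const jax of html.inputJax) {
      // Optional chaining skips input jax that carry no parseOptions.
      jax.parseOptions?.packageData?.forEach((entry) => {
        entry?.parseOptions?.clear?.()
      })
    }
  }
}
```

With the repro's setup, a call would read convertAndReset(html, "\\text{kN}", { display: false }), and parserCount() should then report 0 after every conversion.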

Context

Discovered while diagnosing a heap leak in SpaceProof (a TypeScript/React app that renders thousands of equations per session via html.convert()). Our shipped fix is the workaround above, applied synchronously after each convert() in our <Equation> component: https://github.com/SpaceProof/SpaceProof/pull/3119. Happy to mirror anything we learned during diagnosis if it would help a fix — let me know.
