ComplexFormat.parse exhibits inconsistent behavior due to implicit comma skipping by NumberFormat #459

@yin-mao


Description

ComplexFormat.parse exhibits inconsistent and undocumented behavior when parsing inputs containing commas.

Commas are silently ignored in numeric components, but not in structural positions (such as between a number and '+' or 'i'). This results in context-dependent parsing behavior that is difficult to predict and may hide malformed input.


Reproducible Example

ComplexFormat format = new ComplexFormat();

System.out.println(format.parse(",,7+,,,2i"));   // 7 + 2i
System.out.println(format.parse(",8+,,3i"));     // 8 + 3i

System.out.println(format.parse(",7"));          // 7 + 0i
System.out.println(format.parse(";7"));          // null
System.out.println(format.parse("#7"));          // null

Observed Behavior

  • Commas are ignored when they appear within a numeric component, including before its first digit:

    • ",,7" → 7
    • ",,,2" → 2
  • As a result, the following inputs are successfully parsed into valid complex numbers:

    • ",,7+,,,2i"
    • ",8+,,3i"
  • However, commas are not accepted in structural positions:

    • e.g. between a number and '+' or 'i', parsing may fail
  • Other invalid characters (e.g. ';', '#') are not ignored and cause parsing to fail

  • Importantly, the comma does not behave like a structural separator either. The input "7,8" is parsed as the single number 78, rather than as two values (e.g. 7 + 0i and 8 + 0i).

    This further indicates that ',' is not treated as a consistent delimiter in any meaningful semantic sense.


Expected Behavior

Parsing should be consistent and predictable:

  • Either commas should be explicitly supported as valid separators and documented
  • Or invalid characters should cause parsing to fail uniformly

Silent skipping of certain characters in some contexts but not others leads to confusing and unsafe behavior.

In particular, if ',' were intended as a delimiter, inputs like "7,8" should be parsed consistently as multiple values, not collapsed into a single number.


Root Cause Analysis

The behavior originates from CompositeFormat.parseNumber:

Number number = format.parse(source, pos);

This delegates parsing to NumberFormat (typically DecimalFormat).

DecimalFormat treats ',' as a grouping separator (in locales such as en_US, where grouping is enabled by default) and skips it during parsing. For example:

",,7" → 7
"7,8" → 78

This is confirmed by observing that:

  • pos.getIndex() advances after parsing ",,7"
  • startIndex != endIndex, so parsing is considered successful

Therefore:

  • ',' is implicitly ignored inside numeric components by NumberFormat
  • but ComplexFormat does not handle ',' consistently in other parsing stages
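
The lenient behavior described above can be reproduced with the JDK alone, without ComplexFormat. The following is a minimal sketch, assuming the US locale (so that ',' is the grouping separator); the class and method names are hypothetical and not part of the library.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class GroupingLeniencyDemo {

    /** Parses a number starting at pos; returns null if no number could be read. */
    public static Number parsePrefix(String text, ParsePosition pos) {
        // Assumption: Locale.US, whose grouping separator is ','.
        NumberFormat format = NumberFormat.getInstance(Locale.US);
        return format.parse(text, pos);
    }

    public static void main(String[] args) {
        // ",,7" and "7,8" are accepted (as 7 and 78, per the report above);
        // ";7" and "#7" yield null and leave the index at 0.
        for (String input : new String[] {",,7", "7,8", ";7", "#7"}) {
            ParsePosition pos = new ParsePosition(0);
            Number value = parsePrefix(input, pos);
            System.out.println("\"" + input + "\" -> " + value
                    + " (index after parse: " + pos.getIndex() + ")");
        }
    }
}
```

Since the index advances past the commas, a caller that only compares the start and end index has no way to tell that grouping characters were skipped.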

Consequence

This leads to inconsistent parsing behavior:

  • ',' is ignored inside numeric values
  • ',' is rejected in structural positions (e.g. around '+' or 'i')
  • ',' is never treated as a delimiter between complex numbers

As a result, parsing becomes context-dependent and non-intuitive.

Additionally, malformed input may be silently accepted and interpreted as valid data, making it difficult to detect input errors.


Additional Notes

This behavior is not documented in the ComplexFormat API and may surprise users expecting strict parsing.

The issue arises from the interaction between:

  • a lenient numeric parser (NumberFormat)
  • and a stricter structural parser (ComplexFormat)

Possible Improvements

  • Disable grouping parsing in NumberFormat when used by ComplexFormat
  • Or explicitly handle separators at the ComplexFormat level
  • Or document the current behavior clearly

Providing a strict parsing mode could also help avoid ambiguity.
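
As an illustration of the first option, the sketch below disables grouping on a plain JDK NumberFormat and additionally requires the whole input to be consumed. It is not the library's implementation: the class and the parseStrict helper are hypothetical, and Locale.US is assumed. Whether ComplexFormat can be given such a pre-configured NumberFormat is also an assumption that the report does not confirm.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class StrictNumberParsing {

    /**
     * Hypothetical helper: parses the entire string as one number with
     * grouping disabled. Returns null if parsing fails or stops early.
     */
    public static Number parseStrict(String text) {
        NumberFormat format = NumberFormat.getInstance(Locale.US);
        // With grouping off, ',' is no longer skipped as a grouping separator.
        format.setGroupingUsed(false);
        ParsePosition pos = new ParsePosition(0);
        Number value = format.parse(text, pos);
        // Reject inputs where parsing failed or trailing characters remain.
        if (value == null || pos.getIndex() != text.length()) {
            return null;
        }
        return value;
    }

    public static void main(String[] args) {
        for (String input : new String[] {"7", ",,7", "7,8", ";7"}) {
            System.out.println("\"" + input + "\" -> " + parseStrict(input));
        }
    }
}
```

Inside a composite parser only the setGroupingUsed(false) call would apply, because a numeric component is normally followed by further tokens such as '+' or 'i'. The full-consumption check is there only to make the standalone example self-checking.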
