Improve use of CharacterEncoding #1735

Draft
mmatera wants to merge 3 commits into master from fix_ToStringEncoding

Contributor

@mmatera mmatera commented Mar 15, 2026

This PR covers #1678 by:

  • Making SystemCharacterEncoding take effect in text rendering
  • Making the CharacterEncoding option in ToString work as expected

}


def encode_string_value(value: str, encoding: str):
Contributor Author

This function is just a proof of concept. The final version should look into the MathicsScanner tables.
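As a minimal sketch of what such a function might do for the ASCII case: the `UNICODE_TO_ASCII` table below is a hypothetical stand-in for the MathicsScanner tables, and its two entries are illustrative only. Falling back to `\:xxxx` escapes for unmapped characters is an assumption, not something this PR specifies.

```python
# Hypothetical stand-in for the MathicsScanner tables (illustrative entries only).
UNICODE_TO_ASCII = {
    "\u2192": "->",          # rule arrow -> ASCII operator
    "\u03b1": "\\[Alpha]",   # Greek alpha -> named character
}


def encode_string_value(value: str, encoding: str) -> str:
    """Rewrite characters that the target encoding cannot represent."""
    if encoding != "ASCII":
        # Only the ASCII case is sketched here; other encodings pass through.
        return value

    pieces = []
    for ch in value:
        code = ord(ch)
        if code < 128:
            pieces.append(ch)                     # already ASCII
        elif ch in UNICODE_TO_ASCII:
            pieces.append(UNICODE_TO_ASCII[ch])   # table lookup
        elif code <= 0xFFFF:
            pieces.append(f"\\:{code:04x}")       # assumed 4-digit hex escape
        else:
            pieces.append(f"\\|{code:06x}")       # assumed 6-digit hex escape
    return "".join(pieces)
```

A table-driven lookup keeps the operator and named-character spellings in one place, which is why the final version would read them from the scanner tables instead of hard-coding them.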

value = value[1:-1]

if "encoding" in options and options["encoding"] != "Unicode":
    value = encode_string_value(value, options["encoding"])
Member

@rocky rocky Mar 16, 2026

Looking at this more closely, there may be a deeper problem here.

If the Mathics3 string was encoded with Unicode under the user's control, that should remain. If Mathics3 added the Unicode because an operator appeared, that is probably wrong, and the code that added the Unicode should be fixed.

So, what is a specific scenario or situation where line 200 is triggered?

Contributor Author

Line 200 is triggered when the required encoding is not the standard Unicode. This happens when SystemCharacterEncoding is not Unicode (for example, after setting MATHICS_CHARACTER_ENCODING="ASCII") or when it is called from ToString with a specific CharacterEncoding option.
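The two triggers can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `effective_encoding` and `needs_reencoding` are hypothetical helper names, and the rules that an explicit option overrides the environment variable and that the default is "Unicode" are both assumed.

```python
import os


def effective_encoding(options: dict) -> str:
    """Pick the encoding to render with (hypothetical helper)."""
    # Assumption: an explicit CharacterEncoding option from ToString wins.
    if "encoding" in options:
        return options["encoding"]
    # Assumption: otherwise use the environment variable, defaulting to Unicode.
    return os.environ.get("MATHICS_CHARACTER_ENCODING", "Unicode")


def needs_reencoding(options: dict) -> bool:
    """Mirror the `if` condition under review: anything but Unicode is re-encoded."""
    return effective_encoding(options) != "Unicode"
```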

Member

@rocky rocky Mar 17, 2026

> Line 200 is triggered when the required encoding is not the standard Unicode. It happens when the SystemCharacterEncoding is not Unicode (for example by setting MATHICS_CHARACTER_ENCODING="ASCII") or when it is called from ToString with a specific CharacterEncoding option.

This paraphrases the if condition. What I meant was: what is causing an operator to be converted before ToString is called? That, I think, is the real source of the problem.

Member

rocky commented Mar 16, 2026

A suggestion for a check that things are fixed would be to run pytest without setting MATHICS_CHARACTER_ENCODING, but changing pytest/helper.py so that an encoding of ASCII in the ToString calls does not cause tests to fail.
