Copilot AI commented Jan 19, 2026

What does this change

Moves global declarations before variable use in pythainlp/tokenize/budoux.py, pythainlp/tokenize/oskut.py, pythainlp/tokenize/sefr_cut.py, and pythainlp/tokenize/wtsplit.py.

What was wrong

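Python requires a global declaration to appear textually before the first use of the name in the same function scope; using the name first is a SyntaxError at compile time. This has been an error since Python 3.6, so it is not specific to Python 3.13. In the Python 3.13 CI job, the coverage parser failed to compile multiple files with errors such as: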

SyntaxError: name '_parser' is used prior to global declaration at line 54

in budoux.py, and similar errors for _DEFAULT_ENGINE in oskut.py and sefr_cut.py, and _MODEL, _MODEL_NAME in wtsplit.py.

The original code read the variables (e.g., if _parser is None:, if engine != _DEFAULT_ENGINE:) before declaring them global in the same function, which violates Python's scoping rules. These issues were introduced by the thread-safe improvement PR #1213.
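The failure can be reproduced without running the tokenizers, because the error is raised when the source is compiled, which is what the coverage parser does. The following is a minimal sketch, not the actual PyThaiNLP code: the function name get_parser and the object() stand-in for the real parser are assumptions made for illustration.

```python
# Source text that reads _parser before declaring it global (hypothetical
# stand-in for the pattern described above, not the real module).
BEFORE = '''
import threading

_parser = None
_parser_lock = threading.Lock()

def get_parser():
    with _parser_lock:
        if _parser is None:
            global _parser      # declared after the name was already read
            _parser = object()  # stand-in for the real initializer
    return _parser
'''

# Same function with the declaration moved to the top of the function body.
AFTER = BEFORE.replace(
    "    with _parser_lock:\n        if _parser is None:\n            global _parser      # declared after the name was already read\n",
    "    global _parser\n    with _parser_lock:\n        if _parser is None:\n",
)

def compile_error(source):
    """Return the SyntaxError message raised by compile(), or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None

before_error = compile_error(BEFORE)
after_error = compile_error(AFTER)
print("before:", before_error)
print("after:", after_error)
```

Because compile() alone triggers the error, a module with this bug cannot be imported at all; the function body never gets a chance to run.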

How this fixes it

Move all global declarations from inside the lock blocks to before the lock blocks. The variables must be declared global before any use in the function scope.

# Before
# Thread-safe lazy initialization
with _parser_lock:
    if _parser is None:
        global _parser  # Too late - already used above
        _parser = _init_parser()

# After
# Thread-safe lazy initialization
global _parser  # Declared before use
with _parser_lock:
    if _parser is None:
        _parser = _init_parser()

The same pattern was applied to:

  • budoux.py: global _parser
  • oskut.py: global _DEFAULT_ENGINE
  • sefr_cut.py: global _DEFAULT_ENGINE
  • wtsplit.py: global _MODEL, _MODEL_NAME

Your checklist for this pull request

  • Passed code styles and structures
  • Passed code linting checks and unit test
Original prompt

Fix this error from unittest CI (ubuntu-latest; Python 3.13):

Submitting coverage to coveralls.io...
Error running coveralls: Got coverage library error: Couldn't parse '/home/runner/work/pythainlp/pythainlp/pythainlp/tokenize/budoux.py' as Python source: "name '_parser' is used prior to global declaration" at line 54
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/parser.py", line 266, in parse_source
self._raw_parse()
~~~~~~~~~~~~~~~^^
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/parser.py", line 203, in _raw_parse
byte_parser = ByteParser(self.text, filename=self.filename)
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/parser.py", line 432, in __init__
self.code = compile(text, filename, "exec", dont_inherit=True)
~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/pythainlp/pythainlp/pythainlp/tokenize/budoux.py", line 54
global _parser
^^^^^^^^^^^^^^
SyntaxError: name '_parser' is used prior to global declaration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coveralls/reporter.py", line 41, in report
for (fr, analysis) in get_analysis_to_report(cov, None):
~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/report_core.py", line 98, in get_analysis_to_report
analysis = coverage._analyze(morf)
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/control.py", line 1013, in _analyze
return analysis_from_file_reporter(data, self.config.precision, file_reporter, filename)
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/results.py", line 30, in analysis_from_file_reporter
statements = file_reporter.lines()
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/python.py", line 197, in lines
return self.parser.statements
^^^^^^^^^^^
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/python.py", line 192, in parser
self._parser.parse_source()
~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/parser.py", line 272, in parse_source
raise NotPython(
...<2 lines>...
) from err
coverage.exceptions.NotPython: Couldn't parse '/home/runner/work/pythainlp/pythainlp/pythainlp/tokenize/budoux.py' as Python source: "name '_parser' is used prior to global declaration" at line 54

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coveralls/cli.py", line 98, in main
result = coverallz.wear()
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coveralls/api.py", line 272, in wear
json_string = self.create_report()
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coveralls/api.py", line 344, in create_report
data = self.create_data()
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coveralls/api.py", line 403, in create_data
self._data = {'source_files': self.get_coverage()}
~~~~~~~~~~~~~~~~~^^
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coveralls/api.py", line 425, in get_coverage
return CoverallReporter(work, base_dir, src_dir).coverage
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coveralls/reporter.py", line 29, in __init__
self.report(cov)
~~~~~~~~~~~^^^^^
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coveralls/reporter.py", line 48, in report
raise CoverallsException(f'Got coverage library error: {e}') from e
coveralls.exception.CoverallsException: Got coverage library error: Couldn't parse '/home/runner/work/pythainlp/pythainlp/pythainlp/tokenize/budoux.py' as Python source: "name '_parser' is used prior to global declaration" at line 54


@bact bact added the bug bugs in the library label Jan 19, 2026
Co-authored-by: bact <128572+bact@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix coverage library error in unittest CI Fix Python 3.13 SyntaxError: global declaration before variable use in budoux.py Jan 19, 2026
Copilot AI requested a review from bact January 19, 2026 19:04
@bact bact marked this pull request as ready for review January 19, 2026 19:04

bact commented Jan 19, 2026

@copilot Got similar issues in other files as well. It looks like the thread-safe improvement PR #1213 introduced a few of these "used prior to global declaration" bugs. Please fix them all.

Error running coveralls: Got coverage library error: Couldn't parse '/home/runner/work/pythainlp/pythainlp/pythainlp/tokenize/oskut.py' as Python source: "name '_DEFAULT_ENGINE' is used prior to global declaration" at line 44
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/parser.py", line 266, in parse_source
self._raw_parse()
~~~~~~~~~~~~~~~^^
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/parser.py", line 203, in _raw_parse
byte_parser = ByteParser(self.text, filename=self.filename)
File "/opt/hostedtoolcache/Python/3.13.11/x64/lib/python3.13/site-packages/coverage/parser.py", line 432, in __init__
self.code = compile(text, filename, "exec", dont_inherit=True)
~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/pythainlp/pythainlp/pythainlp/tokenize/oskut.py", line 44
global _DEFAULT_ENGINE
^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: name '_DEFAULT_ENGINE' is used prior to global declaration

Co-authored-by: bact <128572+bact@users.noreply.github.com>

Copilot AI commented Jan 19, 2026

@copilot Got similar issues in other files as well. Looks like the thread-safe improvement PR at #1213 introduces few of this "used prior to global declarati...

Fixed all similar issues in oskut.py, sefr_cut.py, and wtsplit.py. The global declarations have been moved before any variable use in all affected files. (commit 3ed7a6c)

Copilot AI changed the title Fix Python 3.13 SyntaxError: global declaration before variable use in budoux.py Fix Python 3.13 SyntaxError: global declaration before variable use in tokenizer modules Jan 19, 2026
Copilot AI requested a review from bact January 19, 2026 19:23
@bact bact merged commit 65e6c54 into dev Jan 19, 2026
44 checks passed
@bact bact deleted the copilot/fix-coverage-parser-error branch January 19, 2026 20:15