feat: Add Python 3.10-3.14 syntax support #53
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -64,6 +64,7 @@ compound_stmt | |||||||||
| | TRY COLON suite (except_clause+ else_clause? finally_clause? | finally_clause) #try_stmt | ||||||||||
| | ASYNC? WITH with_item (COMMA with_item)* COLON suite #with_stmt | ||||||||||
| | decorator* (classdef | funcdef) #class_or_func_def_stmt | ||||||||||
| | match_stmt #match_stmt_wrapper | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| suite | ||||||||||
|
|
@@ -87,23 +88,199 @@ finally_clause | |||||||||
| : FINALLY COLON suite | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| // Python 3.10+ match statement (PEP 634) | ||||||||||
| match_stmt | ||||||||||
| : MATCH subject_expr COLON LINE_BREAK INDENT case_block+ DEDENT | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| subject_expr | ||||||||||
| : star_named_expression COMMA star_named_expressions? | ||||||||||
| | named_expression | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| star_named_expressions | ||||||||||
| : (COMMA star_named_expression)+ COMMA? | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| star_named_expression | ||||||||||
| : STAR expr | ||||||||||
| | named_expression | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| named_expression | ||||||||||
| : name ASSIGN test | ||||||||||
| | test | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| case_block | ||||||||||
| : CASE pattern guard? COLON suite | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| guard | ||||||||||
| : IF test | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| // Pattern matching patterns | ||||||||||
| pattern | ||||||||||
| : or_pattern | ||||||||||
| | as_pattern | ||||||||||
| ; | ||||||||||
|
Comment on lines +124 to +127
|
||||||||||
|
|
||||||||||
| as_pattern | ||||||||||
| : or_pattern AS pattern_capture_target | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| or_pattern | ||||||||||
| : closed_pattern (OR_OP closed_pattern)* | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| closed_pattern | ||||||||||
| : literal_pattern | ||||||||||
| | capture_pattern | ||||||||||
| | wildcard_pattern | ||||||||||
| | class_pattern | ||||||||||
| | sequence_pattern | ||||||||||
| | mapping_pattern | ||||||||||
| | group_pattern | ||||||||||
| ; | ||||||||||
|
Comment on lines +137 to +145
|
||||||||||
|
|
||||||||||
| literal_pattern | ||||||||||
| : MINUS? number | ||||||||||
| | STRING+ | ||||||||||
| | NONE | ||||||||||
| | TRUE | ||||||||||
| | FALSE | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| capture_pattern | ||||||||||
| : pattern_capture_target | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| pattern_capture_target | ||||||||||
| : name | ||||||||||
| ; | ||||||||||
|
|
||||||||||
| wildcard_pattern | ||||||||||
| : NAME // Matches '_' specifically, handled semantically | ||||||||||
|
|
||||||||||
Suggested change
- : NAME // Matches '_' specifically, handled semantically
+ : NAME // Matches a NAME; treating '_' as the wildcard pattern is done in later semantic analysis (e.g. in listeners/visitors processing match patterns)
Copilot
AI
Jan 23, 2026
The wildcard_pattern rule is defined as just NAME, with a comment stating it "Matches '_' specifically, handled semantically". This implementation is incorrect because it will match ANY name token, not just the underscore wildcard.
This means patterns like case foo: where foo is intended to be a capture pattern would incorrectly be parsed as a wildcard pattern. The wildcard pattern should either:
- Be checked semantically to ensure the NAME is specifically '_', or
- Use a more specific lexer token or parser predicate to match only '_'
The current implementation could lead to incorrect AST generation where any capture pattern is treated as a wildcard.
Suggested change
- : NAME // Matches '_' specifically, handled semantically
+ : {getCurrentToken().getText().equals("_")}? NAME // Matches '_' specifically
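The "checked semantically" option from the comment above could look roughly like the following. This is a minimal sketch rather than chapi code: `PatternNames`, `Kind`, and `classify` are hypothetical names, and the sketch assumes a listener or visitor passes in the text of the NAME token it found in a closed pattern.

```java
// Hypothetical helper, not part of chapi: classifies the NAME text of a
// closed pattern after parsing, as the "handled semantically" note suggests.
final class PatternNames {
    enum Kind { WILDCARD, CAPTURE }

    private PatternNames() {}

    // PEP 634 treats only the single underscore as the wildcard; any other
    // identifier in this position binds a capture variable.
    static Kind classify(String nameText) {
        return "_".equals(nameText) ? Kind.WILDCARD : Kind.CAPTURE;
    }
}
```

With a check like this the grammar can keep a single NAME-based alternative and leave the wildcard/capture distinction to whichever listener builds the AST. Names such as `__` still count as captures.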
🧩 Analysis chain
🏁 Script executed:
cat -n chapi-ast-python/src/main/antlr/PythonParser.g4 | sed -n '137,165p'
Repository: phodal/chapi
Length of output: 701
🏁 Script executed:
# Check for wildcard and capture pattern usage in visitors and analyzers
rg -n "wildcard_pattern|capture_pattern" -A 3 -B 1 --type java --type python
Repository: phodal/chapi
Length of output: 142
🏁 Script executed:
# Check if there's semantic handling for _ in the codebase
rg -n "equals.*_|_.*equals" --type java -A 2 -B 1 | head -50
Repository: phodal/chapi
Length of output: 1795
🏁 Script executed:
rg -n "^\s*name\s*:" chapi-ast-python/src/main/antlr/PythonParser.g4 -A 5
Repository: phodal/chapi
Length of output: 38
🏁 Script executed:
fd -t f "PythonParserBase\|PythonVisitor" chapi-ast-python/
Repository: phodal/chapi
Length of output: 38
🏁 Script executed:
fd -t f "\.g4" chapi-ast-python/src/main/antlr/
Repository: phodal/chapi
Length of output: 150
🏁 Script executed:
# Search for any semantic predicate or special handling of underscore in Python parser
rg -n "TEXT\|semantic" chapi-ast-python/src/main/antlr/PythonParser.g4 -i | head -20
Repository: phodal/chapi
Length of output: 38
🏁 Script executed:
# Search in entire PythonParser.g4 for name rule definition
rg -n "^name\s*:" chapi-ast-python/src/main/antlr/PythonParser.g4 -A 5
Repository: phodal/chapi
Length of output: 38
🏁 Script executed:
# Also search with whitespace variations
grep -n "^\s*name\s*:" chapi-ast-python/src/main/antlr/PythonParser.g4 -A 5
Repository: phodal/chapi
Length of output: 38
🏁 Script executed:
# Get entire PythonParser.g4 to understand structure
wc -l chapi-ast-python/src/main/antlr/PythonParser.g4
Repository: phodal/chapi
Length of output: 107
🏁 Script executed:
# Search for pattern_capture_target definition and what it actually calls
grep -n "pattern_capture_target\|NAME\|IDENTIFIER" chapi-ast-python/src/main/antlr/PythonParser.g4 | head -40
Repository: phodal/chapi
Length of output: 341
🏁 Script executed:
# Get lines around pattern definitions
cat -n chapi-ast-python/src/main/antlr/PythonParser.g4 | sed -n '155,170p'
Repository: phodal/chapi
Length of output: 427
🏁 Script executed:
# Get the name rule definition
cat -n chapi-ast-python/src/main/antlr/PythonParser.g4 | sed -n '488,495p'
Repository: phodal/chapi
Length of output: 232
🏁 Script executed:
# Check if there are any visitor classes generated and how they handle wildcard_pattern
find chapi-ast-python -name "*Visitor*.java" -o -name "*BaseVisitor*.java" | head -5
Repository: phodal/chapi
Length of output: 38
🏁 Script executed:
# Check for any Python files in the codebase that might handle AST traversal
find chapi-ast-python -type f -name "*.py" | head -10
Repository: phodal/chapi
Length of output: 646
🏁 Script executed:
# Search for wildcard_pattern references in Java source code
find chapi-ast-python/src -name "*.java" -type f -exec grep -l "wildcard_pattern" {} \;
Repository: phodal/chapi
Length of output: 38
🏁 Script executed:
# Search more broadly for any references to wildcard in the codebase
rg "wildcard" chapi-ast-python/src --type java -i
Repository: phodal/chapi
Length of output: 38
🏁 Script executed:
# Check if there's a base listener/visitor pattern in chapi for Python
find chapi-ast-python/src/main -type f -name "*.java" | head -20
Repository: phodal/chapi
Length of output: 258
🏁 Script executed:
cat chapi-ast-python/src/main/java/chapi/ast/antlr/PythonParserBase.java
Repository: phodal/chapi
Length of output: 741
🏁 Script executed:
# Search for any pattern handling in the entire module
rg "pattern" chapi-ast-python/src/main/java -i -l
Repository: phodal/chapi
Length of output: 121
Wildcard pattern is unreachable due to capture_pattern precedence.
Both capture_pattern and wildcard_pattern ultimately match the NAME token (capture_pattern via pattern_capture_target → name → NAME), and since capture_pattern appears first in the ordered choice on line 139, it will always match before wildcard_pattern gets evaluated. This prevents wildcard patterns (specifically _) from ever being parsed as wildcard_pattern nodes. Python 3.10+ match statements require proper wildcard pattern support, so downstream code expecting wildcard_pattern AST nodes will never receive them.
Reorder wildcard_pattern before capture_pattern in the closed_pattern rule and add a semantic predicate to wildcard_pattern to restrict it to the _ identifier:
Proposed fix
-closed_pattern
- : literal_pattern
- | capture_pattern
- | wildcard_pattern
+closed_pattern
+ : literal_pattern
+ | wildcard_pattern
+ | capture_pattern
| class_pattern
| sequence_pattern
| mapping_pattern
| group_pattern
;
-wildcard_pattern
- : NAME // Matches '_' specifically, handled semantically
+wildcard_pattern
+ : name=NAME { $name.text.equals("_") }?
;
🤖 Prompt for AI Agents
In `@chapi-ast-python/src/main/antlr/PythonParser.g4` around lines 137 - 165, The
closed_pattern rule currently places capture_pattern before wildcard_pattern
causing NAME-based captures to shadow the '_' wildcard; fix this by moving
wildcard_pattern above capture_pattern in closed_pattern and restrict
wildcard_pattern to only match the underscore identifier by adding a semantic
predicate or explicit text check (e.g., in wildcard_pattern ensure the token
matched by NAME equals "_" instead of any NAME). Update the rules referenced:
closed_pattern, wildcard_pattern, capture_pattern (and its components
pattern_capture_target → name → NAME) so wildcard_pattern is evaluated first and
only produces a wildcard_pattern node when the NAME token is "_".
In Autodetect mode, CheckVersion(2) succeeds, so except ValueError, TypeError: is likely to match the Python 2 COMMA name branch (and call SetVersion(2)) instead of the new except_types alternative. That would misclassify Python 3.14 bracketless-except as Python 2 and change its meaning.
Copilot
AI
Jan 23, 2026
The except_clause grammar has a potential ambiguity issue. Both alternatives start with EXCEPT test which could cause parsing conflicts:
- First alternative: EXCEPT (test ...)? COLON suite (allows zero or one test)
- Second alternative: EXCEPT except_types COLON suite (requires test (COMMA test)+)
When the parser sees EXCEPT test COMMA test COLON, it could match either:
- The first alternative with the Python 2 path: EXCEPT test COMMA name (if test is followed by COMMA name)
- The second alternative: EXCEPT except_types (if test is followed by COMMA test)
This ambiguity could lead to incorrect parsing. Consider reordering the alternatives or making the second alternative more specific to avoid conflicts. The more specific alternative (except_types requiring at least two exceptions) should typically come before the more general one.
Suggested change
- : EXCEPT (test ({CheckVersion(2)}? COMMA name {SetVersion(2);} | {CheckVersion(3)}? AS name {SetVersion(3);})?)? COLON suite
- | EXCEPT except_types COLON suite // Python 3.14+ without parentheses (no AS clause allowed)
+ : EXCEPT except_types COLON suite // Python 3.14+ without parentheses (no AS clause allowed)
+ | EXCEPT (test ({CheckVersion(2)}? COMMA name {SetVersion(2);} | {CheckVersion(3)}? AS name {SetVersion(3);})?)? COLON suite
🧩 Analysis chain
🏁 Script executed:
# Search for version gating patterns in the grammar file
rg -n "CheckVersion|SetVersion|supportsPython314|PythonVersion" chapi-ast-python/src/main/antlr/PythonParser.g4 | head -50
Repository: phodal/chapi
Length of output: 704
🏁 Script executed:
# Get context around the except_clause and except_types rules
sed -n '230,260p' chapi-ast-python/src/main/antlr/PythonParser.g4
Repository: phodal/chapi
Length of output: 963
🏁 Script executed:
# Check if there are any lexer/semantic action files
find chapi-ast-python -name "*.java" -o -name "*.py" -o -name "*.g4" | grep -i "listener\|visitor\|action\|semantic" | head -20
Repository: phodal/chapi
Length of output: 38
🏁 Script executed:
# Search for CheckVersion implementation and how it's used
find chapi-ast-python -name "*.java" | head -20
Repository: phodal/chapi
Length of output: 258
🏁 Script executed:
# Look for PythonParser listener or base classes
find chapi-ast-python -name "*Listener*" -o -name "*Visitor*" -o -name "Python*" | grep -E "\.(java|py)$" | head -20
Repository: phodal/chapi
Length of output: 258
🏁 Script executed:
# Search for other Python 3.14+ specific rules to see if they have version guards
rg -n "3\.14|PEP 758|type_params" chapi-ast-python/src/main/antlr/PythonParser.g4 | head -30
Repository: phodal/chapi
Length of output: 574
🏁 Script executed:
# Check if there are any other rules with semantic actions but no CheckVersion
rg -n "EXCEPT|except_types" chapi-ast-python/src/main/antlr/PythonParser.g4 -A 3 -B 3
Repository: phodal/chapi
Length of output: 676
🏁 Script executed:
# Examine CheckVersion implementation in PythonParserBase
cat chapi-ast-python/src/main/java/chapi/ast/antlr/PythonParserBase.java
Repository: phodal/chapi
Length of output: 741
🏁 Script executed:
# Check if type_params rule has any version guards
rg -n "type_params" chapi-ast-python/src/main/antlr/PythonParser.g4 -B 2 -A 5 | head -40
Repository: phodal/chapi
Length of output: 730
🏁 Script executed:
# Check if there are other unguarded Python 3.12+ or 3.14+ features
rg -n "Python 3\.(12|13|14)\+|PEP (695|758)" chapi-ast-python/src/main/antlr/PythonParser.g4 | head -20
Repository: phodal/chapi
Length of output: 591
🏁 Script executed:
# Look for how type_params, type_alias_stmt are used and if they're version-guarded at call sites
rg -n "type_params|type_alias_stmt" chapi-ast-python/src/main/antlr/PythonParser.g4 -B 1 | head -40
Repository: phodal/chapi
Length of output: 698
🏁 Script executed:
# Check if there's any validation or version checking outside the grammar
rg -n "supportsPython|checkVersion|validateVersion" chapi-ast-python/src/main/java -i
Repository: phodal/chapi
Length of output: 295
🏁 Script executed:
# Check PythonVersion.java to see the version enum values
cat chapi-ast-python/src/main/java/chapi/ast/antlr/PythonVersion.java
Repository: phodal/chapi
Length of output: 1374
🏁 Script executed:
# Look at how the grammar file is structured and check for other rules without CheckVersion that should be version-specific
rg -n "EXCEPT except_types|except_types" chapi-ast-python/src/main/antlr/PythonParser.g4 -B 5 -A 5
Repository: phodal/chapi
Length of output: 729
Add version guard to except_types alternative.
The except_types rule (lines 250–252) lacks a CheckVersion predicate, allowing Python 3.14+ bracketless exception syntax to parse in Python 2 and 3 modes. Add {CheckVersion(314)}? before the second except_clause alternative to prevent unintended matches in earlier versions, consistent with the version-gating pattern used in the first alternative (line 245).
🤖 Prompt for AI Agents
In `@chapi-ast-python/src/main/antlr/PythonParser.g4` around lines 241 - 252, The
grammar allows the Python 3.14+ bracketless exception form to match in older
modes because the second alternative of except_clause lacks a version predicate;
add the version guard by prepending the predicate {CheckVersion(314)}? to the
alternative that references except_types so that except_types is only used when
CheckVersion(314) is true, keeping the existing CheckVersion/SetVersion pattern
used in the first except_clause alternative (refer to except_clause,
except_types, CheckVersion and SetVersion).
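How a {CheckVersion(314)}? guard would behave depends on what CheckVersion actually does, and PythonParserBase.java is not reproduced on this page. The sketch below is purely illustrative: `VersionGate`, `checkVersion`, and `setVersion` are hypothetical stand-ins, and the "Autodetect accepts everything, otherwise exact match" rule is an assumption, not the confirmed chapi behavior.

```java
// Sketch only: PythonParserBase is not shown on this page, so the
// equality-based check below is an assumption about its behavior.
final class VersionGate {
    static final int AUTODETECT = 0;

    private int version;

    VersionGate(int version) {
        this.version = version;
    }

    // Assumed semantics: Autodetect passes every guard; a pinned version
    // passes only a guard that names exactly that version.
    boolean checkVersion(int required) {
        return version == AUTODETECT || version == required;
    }

    // Assumed semantics: pins the parser to the version implied by the
    // construct that was just recognized.
    void setVersion(int required) {
        version = required;
    }
}
```

Under these assumed semantics, an autodetecting parser passes both the Python 2 guard and a 314 guard, so a predicate alone would not settle which alternative wins for `except ValueError, TypeError:`. A parser pinned to the generic value 3 would also fail a 314 guard, which is worth checking against the real implementation before adopting the suggestion.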
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,9 +1,20 @@ | ||||||||||||||||||||||||||||||||||
| package chapi.ast.antlr; | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| /** | ||||||||||||||||||||||||||||||||||
| * Python version enumeration for syntax compatibility. | ||||||||||||||||||||||||||||||||||
| * | ||||||||||||||||||||||||||||||||||
| * Version history: | ||||||||||||||||||||||||||||||||||
| * - Python 3.10: Pattern matching (match/case) | ||||||||||||||||||||||||||||||||||
| * - Python 3.12: Type parameter syntax (PEP 695), type statement | ||||||||||||||||||||||||||||||||||
| * - Python 3.14: Template strings (t-strings), bracketless except (PEP 758) | ||||||||||||||||||||||||||||||||||
| */ | ||||||||||||||||||||||||||||||||||
| public enum PythonVersion { | ||||||||||||||||||||||||||||||||||
| Autodetect(0), | ||||||||||||||||||||||||||||||||||
| Python2(2), | ||||||||||||||||||||||||||||||||||
| Python3(3); | ||||||||||||||||||||||||||||||||||
| Python3(3), | ||||||||||||||||||||||||||||||||||
| Python310(310), // Pattern matching | ||||||||||||||||||||||||||||||||||
|
This adds
||||||||||||||||||||||||||||||||||
| Python312(312), // Type parameter syntax | ||||||||||||||||||||||||||||||||||
| Python314(314); // Template strings, bracketless except | ||||||||||||||||||||||||||||||||||
|
Comment on lines 3 to +17
|
||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| private final int value; | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
|
|
@@ -14,4 +25,25 @@ public enum PythonVersion { | |||||||||||||||||||||||||||||||||
| public int getValue() { | ||||||||||||||||||||||||||||||||||
| return value; | ||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| /** | ||||||||||||||||||||||||||||||||||
| * Check if this version supports Python 3.10+ features (pattern matching) | ||||||||||||||||||||||||||||||||||
| */ | ||||||||||||||||||||||||||||||||||
| public boolean supportsPatternMatching() { | ||||||||||||||||||||||||||||||||||
| return this == Autodetect || value >= 310 || this == Python3; | ||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||
Suggested change
- return this == Autodetect || value >= 310 || this == Python3;
+ return this == Autodetect || value >= 310;
Copilot
AI
Jan 23, 2026
The supportsTypeParameters() method doesn't include || this == Python3 unlike supportsPatternMatching(). This means:
- Python3 version will return false for type parameters support (since value=3 < 312)
- Python3 version will return true for pattern matching support (due to special case on line 33)
This inconsistency is confusing. If Python3 is meant to represent "generic Python 3.x" that should support all features (as suggested by line 33), then this method should also include || this == Python3. If Python3 is meant to represent specifically Python 3.0-3.9, then line 33's special case for pattern matching is incorrect (as noted in a separate comment).
The behavior should be consistent across all feature support methods.
Suggested change
- return this == Autodetect || value >= 312;
- }
- /**
- * Check if this version supports Python 3.14+ features (t-strings, bracketless except)
- */
- public boolean supportsPython314Features() {
- return this == Autodetect || value >= 314;
+ return this == Autodetect || value >= 312 || this == Python3;
+ }
+ /**
+ * Check if this version supports Python 3.14+ features (t-strings, bracketless except)
+ */
+ public boolean supportsPython314Features() {
+ return this == Autodetect || value >= 314 || this == Python3;
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for PythonVersion.Python3 usage and feature gate method calls
rg -n "PythonVersion\.Python3\b|supportsPatternMatching|supportsTypeParameters|supportsPython314Features" -A 2 -B 2
Repository: phodal/chapi
Length of output: 2213
🏁 Script executed:
#!/bin/bash
# Find and examine the PythonVersion enum definition
fd -e java | xargs rg -l "enum PythonVersion|class PythonVersion" | head -5
Repository: phodal/chapi
Length of output: 121
🏁 Script executed:
#!/bin/bash
# Read the PythonVersion.java file
cat -n chapi-ast-python/src/main/java/chapi/ast/antlr/PythonVersion.java
Repository: phodal/chapi
Length of output: 1717
Align Python3 semantics across feature gates to avoid inconsistent feature availability.
supportsPatternMatching() treats Python3 (value=3) as supporting 3.10+ features with an explicit check, but supportsTypeParameters() and supportsPython314Features() exclude it entirely. Since Python3 represents a generic/unknown Python 3.x version (as seen in PythonParserBase.java line 22), this creates an inconsistency: either all three methods should include Python3, or none should. Decide on the intended semantics—does generic Python 3 assume modern feature support, or should version checks be strict?—and align all three methods accordingly.
🤖 Prompt for AI Agents
In `@chapi-ast-python/src/main/java/chapi/ast/antlr/PythonVersion.java` around
lines 29 - 48, The three feature-gate methods (supportsPatternMatching,
supportsTypeParameters, supportsPython314Features) are inconsistent about
treating the enum constant Python3 as “generic Python 3.x”; align their
semantics by either always treating Python3 as supporting modern 3.x features or
never doing so—pick one. Concretely, update the implementations in
PythonVersion: if you choose to treat Python3 as generic modern 3.x, add the
same check (this == Python3) to supportsTypeParameters() and
supportsPython314Features(); if you choose strict numeric checks, remove the
this == Python3 clause from supportsPatternMatching(). Keep the Autodetect
handling (this == Autodetect) unchanged and use the existing value field for
numeric comparisons.
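One way to keep the three gates from drifting apart is to route them through a single helper, so the Python3 question is answered in exactly one place. The sketch below is not the PR's code: `PyVersion`, `supports`, and `GENERIC_PYTHON3_IS_MODERN` are hypothetical names, and choosing the strict reading (generic Python3 does not imply 3.10+ features) is an arbitrary assumption made only for illustration.

```java
// Hypothetical stand-in for PythonVersion; names and the strict policy
// chosen below are assumptions for illustration.
enum PyVersion {
    AUTODETECT(0), PYTHON2(2), PYTHON3(3),
    PYTHON310(310), PYTHON312(312), PYTHON314(314);

    // Single switch for the open design question: does the generic
    // Python3 constant count as supporting every modern feature?
    private static final boolean GENERIC_PYTHON3_IS_MODERN = false;

    private final int value;

    PyVersion(int value) {
        this.value = value;
    }

    // Every feature gate goes through this method, so Autodetect and
    // Python3 are treated the same way for all features.
    private boolean supports(int minimum) {
        if (this == AUTODETECT) return true;
        if (this == PYTHON3) return GENERIC_PYTHON3_IS_MODERN;
        return value >= minimum;
    }

    boolean supportsPatternMatching() { return supports(310); }

    boolean supportsTypeParameters() { return supports(312); }

    boolean supportsPython314Features() { return supports(314); }
}
```

Flipping the one flag switches all three gates to the lenient reading at once, which is the consistency both review comments ask for regardless of which semantics the maintainers pick.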
named_expression is defined as name ASSIGN test, but ASSIGN is = in the lexer, so valid walrus subjects (:=) in match statements won’t parse and invalid name = expr could be accepted in expression position.