PHQL parser fixes#5
Merged
niden merged 4 commits intophalcon:1.0.xfrom Apr 9, 2026
Merged
Conversation
Jeckerson
approved these changes
Apr 9, 2026
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 1.0.x #5 +/- ##
========================================
Coverage ? 32.90%
Complexity ? 5600
========================================
Files ? 7
Lines ? 8683
Branches ? 0
========================================
Hits ? 2857
Misses ? 5826
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Pure-PHP PHQL scanner and parser port (
phalcon-phql) had several bugs preventing correct tokenization and AST generation. This PR fixes all of them; all 136@group phqltests now pass (1 skipped) and the full MySQL database suite (775 tests) is unaffected.Fixes
1. Scanner — PHP 8 loose comparison: digit
'0'–'8'matched control-character casesFile:
src/Scanner/Scanner.phpIn PHP 8, comparing a numeric string to an integer uses numeric equality:
"0" == 0x00→true. The re2c-generated state machine used integer hex literals (case 0x00:throughcase 0x08:) in switch statements keyed on a single-character string. This caused digit characters'0'–'8'to match control-character cases instead of thedefault:fall-through, routing them to error or EOF states.Fix: Changed all
case 0x00:–case 0x08:tocase "\x00":–case "\x08":(string literals, compared lexicographically). Changedif ($yych <= 0x00)toif ($yych === "\x00").2. Scanner —
break 2in states 8 and 12 exitedwhile(true)prematurelyFile:
src/Scanner/Scanner.phpThe inner scanning loop is structured as:
States 8 and 12 (the entry points for "…" and '…' string literals) had break 2 directly inside
switch ($yystate)with no inner switch. This exited both the outerswitchandwhile(true), sending control back to the outer while(RETCODE_IMPOSSIBLE == $status)which reset$yystate = 0. The scanner restarted at state 0 with the cursor already past the opening quote, treating the string contents as a bare identifier.Fix: Changed break 2 → break in cases 2, 8, and 12 so the inner while(true) continues to the next state.
File: src/Scanner/Scanner.php
State 70 emits the PHQL_T_STRING token. It extracted the value as:
$qis the cursor position at the start ofscanForToken()— i.e. the position of the opening " or '. In the C re2c original,YYMARKER (q)is advanced past the opening quote inside pp11/pp16(ppch = *(PPMARKER = ++PPCURSOR)), so the value is extracted without the leading delimiter.The PHP equivalent sets
$yymarker = $yycursorin states 8/12 after state 0 has already advanced the cursor past the opening quote. Using$yymarkerinstead of$qgives the correct content.Fix:
SPLACEHOLDER/BPLACEHOLDERvalues included leading :File: src/Scanner/Scanner.php
Cases 142 (
PHQL_T_SPLACEHOLDER) and 194 (PHQL_T_BPLACEHOLDER) extracted the placeholder value starting from $q (the position of the leading :). Query.php prepends ":" when building the placeholder, producing::nameinstead of:name.Fix: Changed extraction offset from
$qto$q + 1and length fromPHQL_T_EQUALSandPHQL_T_LESSwere string opcodesFile: src/Scanner/Opcode.php
Query.php uses integer constants (
case self::PHQL_T_EQUALS:) in strict PHP 8 switch comparisons. String opcodes never matched, causing = and < expressions to throw "Unknown expression type".distinct_allanddistinct_or_nullempty productions returned wrong valuesFile: resources/files/parser.php
Case 9 (
distinct_all ::= — no DISTINCT/ALL in SELECT): returned[], causing isset($expr["distinct"]) to be true and prependingALLto everySELECT. Fixed to returnnull.Case 135 (
distinct_or_null ::= — no DISTINCT in function call): returned[], causing COUNT(*) to render as COUNT(DISTINCT *). Fixed to returnnull.Case 134 (
distinct_or_null ::= DISTINCT): returned0(falsy), causingCOUNT(DISTINCT col)to dropDISTINCT. Fixed to returntrue.Property
$outputtype widened from array to mixed to accommodatenullandboolreduction values.[]instead ofnullFile: resources/files/parser.php
Cases 69, 71, 78, 83, 85, 89, 91, and 137 correspond to empty optional productions:
where_clause::=,limit_clause::=,order_clause::=,group_clause::=,having_clause::=, etc. These returned[].The
phql_ret_delete_statement,phql_ret_update_statement, andphql_ret_select_statementhelpers guard with if($x !== null)before adding keys to the AST — matching the C original'sif (Z_TYPE_P(W) != IS_UNDEF). An empty array bypasses that guard, inserting"where" => []into the AST.Query.phpthen callsgetExpression([])which throws "Unknown expression".Fix: All eight cases now return null, matching cphalcon's
ZVAL_UNDEF.Test results