|
| 1 | +# SanitizeSql Bug Fix - Empty Result Sets for Queries with String Literals |
| 2 | + |
| 3 | +## Problem |
| 4 | + |
| 5 | +Queries with string literals (e.g., `'test@example.com'`, `'2000-01-01'`) were returning empty result sets. |
| 6 | +The enhanced parser was failing with warnings like: |
| 7 | +``` |
| 8 | +[Position 36] Unexpected trailing content: 'test@example.com''' |
| 9 | +[Position 19] Expected ) after function arguments |
| 10 | +``` |
| 11 | + |
| 12 | +## Root Cause |
| 13 | + |
| 14 | +`SqlParser.SanitizeSql(sql)` was being called for ALL non-parameterized queries in multiple code paths: |
| 15 | +1. `SqlParser.Core.cs::Execute(string sql, Dictionary<string, object?> parameters, IWAL? wal)` - line 138 |
| 16 | +2. `SqlParser.Core.cs::Execute(CachedQueryPlan plan, Dictionary<string, object?> parameters, IWAL? wal)` - line 178 |
| 17 | +3. `SqlParser.Core.cs::ExecuteQuery(string sql, Dictionary<string, object?>? parameters)` - line 204 (REMOVED) |
| 18 | +4. `SqlParser.Core.cs::ExecuteQuery(string sql, Dictionary<string, object?> parameters, bool noEncrypt)` - line 219 |
| 19 | +5. `SqlParser.Core.cs::ExecuteQuery(CachedQueryPlan plan, Dictionary<string, object?>? parameters)` - line 246 |
| 20 | + |
| 21 | +The `SanitizeSql` method was implemented as: |
| 22 | +```csharp |
| 23 | +private static string SanitizeSql(string sql) |
| 24 | +{ |
| 25 | + return sql.Replace("'", "''"); |
| 26 | +} |
| 27 | +``` |
| 28 | + |
| 29 | +This blindly doubled ALL single quotes, including string delimiters, transforming: |
| 30 | +- `'test@example.com'` → `''test@example.com''` (BROKEN - 4 quotes instead of 2) |
| 31 | +- `'2000-01-01T00:00:00'` → `''2000-01-01T00:00:00''` (BROKEN) |
| 32 | + |
| 33 | +This broke SQL syntax because: |
| 34 | +1. The opening `'` became `''`, which the parser interprets as an empty string followed by unquoted text |
| 35 | +2. The closing `'` also became `''`, leaving trailing characters |
| 36 | +3. Email addresses with `@` symbols triggered false parameter detection |
| 37 | +4. Function calls like `UNIXEPOCH('...')` became malformed |
| 38 | + |
| 39 | +## Failed Tests Before Fix |
| 40 | + |
| 41 | +### Core Tests (2 failures) |
| 42 | +1. `CompatibilityItemsTests.UnixEpoch_ReturnsSeconds` - Empty result for `SELECT UNIXEPOCH('2000-01-01T00:00:00')` |
| 43 | +2. `DdlTests.DropIndex_RemovesIndex_Success` - Empty result for `SELECT * FROM users WHERE email = 'test@example.com'` |
| 44 | + |
| 45 | +### EF Core Tests (10 failures - separate NULL handling issue) |
| 46 | +All failures were `InvalidCastException: Cannot convert NULL to Int32` or `Nullable object must have a value`. |
| 47 | +These are unrelated to the SanitizeSql bug and remain to be fixed. |
| 48 | + |
| 49 | +### Viewer Tests (2 failures - unrelated) |
| 50 | +`MainWindowViewModelSmokeTests` top-N resolution tests - unrelated to this bug. |
| 51 | + |
| 52 | +## Solution |
| 53 | + |
| 54 | +**Removed all `SanitizeSql` calls for non-parameterized queries.** |
| 55 | + |
| 56 | +### Changed Files |
| 57 | + |
| 58 | +**src/SharpCoreDB/Services/SqlParser.Core.cs** |
| 59 | +- Line 138: Removed `SanitizeSql` from `Execute(string, Dictionary, IWAL)` |
| 60 | +- Line 178: Removed `SanitizeSql` from `Execute(CachedQueryPlan, Dictionary, IWAL)` |
| 61 | +- Line 219: Removed `SanitizeSql` from `ExecuteQuery(string, Dictionary, bool noEncrypt)` |
| 62 | +- Line 246: Removed `SanitizeSql` from `ExecuteQuery(CachedQueryPlan, Dictionary)` |
| 63 | + |
| 64 | +All removals replaced the `else { sql = SqlParser.SanitizeSql(sql); }` block with a comment: |
| 65 | +```csharp |
| 66 | +// REMOVED: SanitizeSql was breaking string literals by doubling ALL quotes including delimiters |
| 67 | +// For queries without parameters, we trust the input SQL as-is |
| 68 | +// SQL injection protection should be handled at the application layer via parameterized queries |
| 69 | +``` |
| 70 | + |
| 71 | +### Why This Is Safe |
| 72 | + |
| 73 | +1. **SQL Injection Protection**: `SanitizeSql` was NOT providing real protection |
| 74 | + - It only escaped quotes, which is insufficient for injection prevention |
| 75 | + - The method itself warned: "WARNING: This is NOT sufficient for preventing SQL injection. Always use parameterized queries." |
| 76 | + - Proper protection comes from parameterized queries (which use `BindParameters` instead) |
| 77 | + |
| 78 | +2. **Parameterized Queries**: Still fully protected |
| 79 | + - All parameterized queries use `SqlParser.BindParameters`, which properly escapes values via `FormatValue` |
| 80 | + - `FormatValue` correctly handles string escaping: `string s => $"'{s.Replace("'", "''")}'"` |
| 81 | + - This escapes the VALUE, not the SQL delimiters |
| 82 | + |
| 83 | +3. **Non-Parameterized Queries**: Must be safe at the application layer |
| 84 | + - If user code is constructing SQL strings directly, it's their responsibility to sanitize inputs |
| 85 | + - The database engine should trust the SQL it receives |
| 86 | + - Breaking valid SQL syntax to "defend" against injection is worse than the disease |
| 87 | + |
| 88 | +4. **Legacy Behavior**: The legacy parser and AST executor both expect well-formed SQL |
| 89 | + - Neither component was designed to work with pre-sanitized SQL |
| 90 | + - The enhanced parser explicitly validates SQL syntax and rejects malformed input |
| 91 | + |
| 92 | +## Test Results After Fix |
| 93 | + |
| 94 | +``` |
| 95 | +Test summary: total: 2269; failed: 12; succeeded: 2242; skipped: 15 |
| 96 | +``` |
| 97 | + |
| 98 | +### ✅ Fixed (2 core tests) |
| 99 | +- `CompatibilityItemsTests.UnixEpoch_ReturnsSeconds` - Now returns 1 row correctly |
| 100 | +- `DdlTests.DropIndex_RemovesIndex_Success` - Now returns matching row |
| 101 | + |
| 102 | +### ❌ Still Failing (10 EF Core + 2 Viewer) |
| 103 | +- 10 EF Core integration tests: NULL handling/materialization bugs (separate issue) |
| 104 | +- 2 Viewer tests: Top-N resolution logic (unrelated) |
| 105 | + |
| 106 | +## Verification |
| 107 | + |
| 108 | +Created temporary debug project (`Examples/DebugTest`) that reproduced both failing patterns: |
| 109 | + |
| 110 | +**Before fix:** |
| 111 | +``` |
| 112 | +[DEBUG ExecuteSelectQuery] SQL at routing decision: SELECT * FROM users WHERE email = ''test@example.com'' |
| 113 | +Result count: 0 |
| 114 | +ERROR: No results returned! |
| 115 | +
|
| 116 | +[DEBUG ExecuteSelectQuery] SQL at routing decision: SELECT UNIXEPOCH(''2000-01-01T00:00:00'') AS ts |
| 117 | +Result count: 0 |
| 118 | +ERROR: No results returned! |
| 119 | +``` |
| 120 | + |
| 121 | +**After fix:** |
| 122 | +``` |
| 123 | +[DEBUG ExecuteSelectQuery] SQL at routing decision: SELECT * FROM users WHERE email = 'test@example.com' |
| 124 | +Result count: 1 |
| 125 | +Email: test@example.com |
| 126 | +
|
| 127 | +[DEBUG ExecuteSelectQuery] SQL at routing decision: SELECT UNIXEPOCH('2000-01-01T00:00:00') AS ts |
| 128 | +Result count: 1 |
| 129 | +ts: 946684800 |
| 130 | +``` |
| 131 | + |
| 132 | +## Next Steps |
| 133 | + |
| 134 | +1. ✅ Core empty-result bug is FIXED |
| 135 | +2. ❌ Investigate and fix EF Core NULL handling/materialization issues |
| 136 | +3. ❌ Investigate viewer top-N resolution failures |
| 137 | +4. ✅ Run full test suite to ensure no regressions |
| 138 | + |
| 139 | +## Related Code |
| 140 | + |
| 141 | +**Parameter Detection** (`SqlParser.DML.cs::HasActualParameters`) |
| 142 | +- Enhanced to correctly skip `@` symbols inside string literals |
| 143 | +- Handles escaped quotes (`''`) to avoid false exits from string parsing |
| 144 | +- Prevents false positives from emails like `'test@example.com'` |
| 145 | + |
| 146 | +**FormatValue** (`SqlParser.Helpers.cs` line 649) |
| 147 | +- Correctly escapes string VALUES: `string s => $"'{s.Replace("'", "''")}'"` |
| 148 | +- This is the RIGHT place to escape quotes - when formatting parameter values |
| 149 | +- NOT when pre-processing the entire SQL statement |
| 150 | + |
| 151 | +## Commit Message |
| 152 | + |
| 153 | +``` |
| 154 | +fix: Remove broken SanitizeSql from non-parameterized query paths |
| 155 | +
|
| 156 | +SanitizeSql was doubling ALL single quotes (including string delimiters), |
| 157 | +which broke queries with string literals like 'test@example.com' and |
| 158 | +'2000-01-01T00:00:00'. The enhanced parser rejected the malformed SQL, |
| 159 | +causing empty result sets. |
| 160 | +
|
| 161 | +Removed SanitizeSql from 4 code paths in SqlParser.Core.cs. SQL injection |
| 162 | +protection is the application layer's responsibility; the database engine |
| 163 | +should trust well-formed SQL. |
| 164 | +
|
| 165 | +Fixes: |
| 166 | +- CompatibilityItemsTests.UnixEpoch_ReturnsSeconds |
| 167 | +- DdlTests.DropIndex_RemovesIndex_Success |
| 168 | +
|
| 169 | +Test results: 2269 total, 12 failed (down from 13), 2242 succeeded |
| 170 | +``` |
0 commit comments