Add inline SQL support with configurable sanitization by MaksimShevtsov · Pull Request #6 · MaksimShevtsov/RowQuery

MaksimShevtsov · 2026-02-17T21:34:23Z

Summary

Inline SQL strings: All fetch/execute methods now accept raw SQL directly (e.g. engine.fetch_one("SELECT * FROM users WHERE id = ?", 1)) in addition to registry keys. A whitespace check distinguishes the two — registry keys like users.get_by_id never contain spaces.
Flexible params: The params argument now accepts a dict (named binding), tuple/list (positional binding), or a scalar that is automatically wrapped in a single-element tuple.
Configurable SQLSanitizer: A new opt-in sanitizer applied only to inline SQL (registry queries remain trusted). Three independent flags:
- strip_comments — removes -- and /* */ comments while preserving string literals
- block_multiple_statements — rejects SQL with a ; followed by more content
- allowed_verbs — restricts the leading SQL keyword to an explicit allow-list
SQLSanitizationError: New exception in the hierarchy (ExecutionError subclass).
65 new unit tests covering all sanitizer code paths, including edge cases around string literals, comment ordering, and param coercion.

Test plan

uv run pytest tests/unit/test_sanitizer.py — all 65 sanitizer tests pass
uv run pytest tests/ — full suite passes (190 tests, 1 pre-existing skip for missing aiosqlite)
Verify inline SQL with positional scalar: engine.fetch_one("SELECT * FROM t WHERE id = ?", 1)
Verify sanitizer blocks SELECT 1; DROP TABLE t when block_multiple_statements=True
Verify registry queries bypass sanitizer entirely

🤖 Generated with Claude Code

- Engine/AsyncEngine/TransactionManagers now accept raw SQL strings alongside registry keys; whitespace presence distinguishes the two - Positional params supported: scalar, list, or tuple are coerced to tuple before binding (dict remains named-param style) - New SQLSanitizer dataclass (strip_comments, block_multiple_statements, allowed_verbs) applied only to inline SQL; registry queries are trusted - SQLSanitizationError added to exception hierarchy - SQLite adapter: params or {} → params if params is not None else {} so positional tuples are not coerced to empty dict - 65 new unit tests covering all sanitizer code paths Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR adds inline SQL support to the row_query library, allowing users to pass raw SQL strings directly to engine/transaction methods alongside the existing registry-based approach. The implementation includes a configurable SQL sanitizer that provides defense-in-depth measures for inline SQL through comment stripping, statement blocking, and verb allow-listing. The sanitizer is only applied to inline SQL strings (detected by presence of whitespace), while registry queries remain trusted and unsanitized.

Changes:

Added SQLSanitizer class with configurable sanitization options (comment stripping, multi-statement blocking, verb restrictions)
Extended all engine and transaction methods to accept inline SQL strings in addition to registry keys
Added is_raw_sql() function to distinguish between inline SQL and registry keys based on whitespace presence
Added coerce_params() function to flexibly handle dict, tuple/list, and scalar parameter types
Added 65 comprehensive unit tests covering sanitizer functionality and edge cases

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
`tests/unit/test_sanitizer.py`	Comprehensive test suite (65 tests) covering SQL sanitizer, parameter coercion, and raw SQL detection
`row_query/core/sanitizer.py`	New SQL sanitizer with tokenization, comment stripping, statement validation, and verb checking
`row_query/core/params.py`	Added `is_raw_sql()` and `coerce_params()` helpers for inline SQL support
`row_query/core/exceptions.py`	Added `SQLSanitizationError` exception as subclass of `ExecutionError`
`row_query/core/engine.py`	Updated all query methods to support inline SQL via `_resolve_sql()` helper and flexible params
`row_query/core/transaction.py`	Updated all transaction methods to support inline SQL (mirrors engine changes)
`row_query/adapters/sqlite.py`	Changed `params or {}` to `params if params is not None else {}` for correctness with falsy dicts
`row_query/__init__.py`	Exported `SQLSanitizer` and `SQLSanitizationError` in public API

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-17T21:45:44Z

    def execute(
        self,
-        query_name: str,
-        params: dict[str, Any] | None = None,
+        query: str,
+        params: Any = None,
    ) -> int:
-        """Execute a write query within this transaction."""
+        """Execute a write query within this transaction.
+
+        *query* may be a registry key or an inline SQL string.
+        *params* may be a dict, tuple/list, or scalar.
+        """
        self._check_active()
-        sql = self._registry.get(query_name)
+        sql, _label = _resolve_sql(query, self._registry, self._sanitizer)
        sql = normalize_params(sql, self._paramstyle)
-        cursor = self._adapter.execute(self._connection, sql, params)
+        cursor = self._adapter.execute(self._connection, sql, coerce_params(params))
        return int(cursor.rowcount)

    def fetch_one(
        self,
-        query_name: str,
-        params: dict[str, Any] | None = None,
+        query: str,
+        params: Any = None,
    ) -> dict[str, Any] | None:
-        """Fetch a single row within transaction context."""
+        """Fetch a single row within transaction context.
+
+        *query* may be a registry key or an inline SQL string.
+        *params* may be a dict, tuple/list, or scalar.
+        """
        self._check_active()
-        sql = self._registry.get(query_name)
+        sql, _label = _resolve_sql(query, self._registry, self._sanitizer)
        sql = normalize_params(sql, self._paramstyle)
-        cursor = self._adapter.execute(self._connection, sql, params)
+        cursor = self._adapter.execute(self._connection, sql, coerce_params(params))
        rows = _rows_to_dicts(cursor)
        if not rows:
            return None
        return rows[0]

    def fetch_all(
        self,
-        query_name: str,
-        params: dict[str, Any] | None = None,
+        query: str,
+        params: Any = None,
    ) -> list[dict[str, Any]]:
-        """Fetch all rows within transaction context."""
+        """Fetch all rows within transaction context.
+
+        *query* may be a registry key or an inline SQL string.
+        *params* may be a dict, tuple/list, or scalar.
+        """
        self._check_active()
-        sql = self._registry.get(query_name)
+        sql, _label = _resolve_sql(query, self._registry, self._sanitizer)
        sql = normalize_params(sql, self._paramstyle)
-        cursor = self._adapter.execute(self._connection, sql, params)
+        cursor = self._adapter.execute(self._connection, sql, coerce_params(params))
        return _rows_to_dicts(cursor)


The transaction methods (execute, fetch_one, fetch_all) don't wrap adapter.execute calls in try-except blocks to convert driver exceptions to ParameterBindingError, unlike the corresponding engine methods. This creates an inconsistency where driver exceptions can leak through transaction methods but are properly wrapped in engine methods. Additionally, the label returned from _resolve_sql is unused in transaction methods (prefixed with underscore), while it's used in engine methods for error messages. Consider adding the same error handling pattern used in the engine methods to maintain consistency and provide better error messages for inline SQL.

Copilot · 2026-02-17T21:45:45Z

+
+
+def _resolve_sql(
+    query: str,
+    registry: SQLRegistry,
+    sanitizer: SQLSanitizer | None = None,
+) -> tuple[str, str]:
+    """Return ``(sql_text, label)`` for *query* (raw SQL or registry key)."""
+    if is_raw_sql(query):
+        sql = sanitizer.sanitize(query) if sanitizer is not None else query
+        return sql, "<inline>"
+    return registry.get(query), query




The _resolve_sql function is duplicated in both row_query/core/engine.py and row_query/core/transaction.py. This violates the DRY (Don't Repeat Yourself) principle and makes maintenance harder. Consider moving this function to a shared location such as row_query/core/params.py or creating a new utility module, since both files need to import and use the same logic.

Suggested change

def _resolve_sql(

query: str,

registry: SQLRegistry,

sanitizer: SQLSanitizer | None = None,

) -> tuple[str, str]:

"""Return ``(sql_text, label)`` for *query* (raw SQL or registry key)."""

if is_raw_sql(query):

sql = sanitizer.sanitize(query) if sanitizer is not None else query

return sql, "<inline>"

return registry.get(query), query

from row_query.core.engine import _resolve_sql

Copilot · 2026-02-17T21:45:45Z

+def _tokenize(sql: str) -> list[tuple[str, str]]:
+    """Split *sql* into ``('string', …)`` and ``('code', …)`` tokens.
+
+    String literals (single-quoted, with ``''`` escapes) are preserved as-is.
+    Everything else is a ``'code'`` token.
+    """
+    tokens: list[tuple[str, str]] = []
+    i = 0
+    n = len(sql)
+    last = 0
+
+    while i < n:
+        if sql[i] == "'":
+            if i > last:
+                tokens.append(("code", sql[last:i]))
+            j = i + 1
+            while j < n:
+                if sql[j] == "'":
+                    j += 1
+                    if j >= n or sql[j] != "'":
+                        break  # end of literal
+                    j += 1  # '' escape — continue
+                else:
+                    j += 1
+            tokens.append(("string", sql[i:j]))
+            last = j
+            i = j
+        else:
+            i += 1
+
+    if last < n:
+        tokens.append(("code", sql[last:]))
+
+    return tokens


The _tokenize function only handles single-quoted string literals but does not handle double-quoted identifiers. In PostgreSQL, MySQL (with ANSI_QUOTES mode), and other databases, double quotes are used for identifiers (table/column names) and can contain comment-like syntax. For example, a column named my--column or my/*comment*/column would be written as "my--column" or "my/*comment*/column" and the current tokenizer would incorrectly strip these as comments. Consider extending the tokenizer to also preserve double-quoted identifiers to avoid incorrect comment stripping.

Copilot · 2026-02-17T21:45:45Z

+def _tokenize(sql: str) -> list[tuple[str, str]]:
+    """Split *sql* into ``('string', …)`` and ``('code', …)`` tokens.
+
+    String literals (single-quoted, with ``''`` escapes) are preserved as-is.
+    Everything else is a ``'code'`` token.
+    """
+    tokens: list[tuple[str, str]] = []
+    i = 0
+    n = len(sql)
+    last = 0
+
+    while i < n:
+        if sql[i] == "'":
+            if i > last:
+                tokens.append(("code", sql[last:i]))
+            j = i + 1
+            while j < n:
+                if sql[j] == "'":
+                    j += 1
+                    if j >= n or sql[j] != "'":
+                        break  # end of literal
+                    j += 1  # '' escape — continue
+                else:
+                    j += 1
+            tokens.append(("string", sql[i:j]))
+            last = j
+            i = j
+        else:
+            i += 1
+
+    if last < n:
+        tokens.append(("code", sql[last:]))
+
+    return tokens


MySQL uses backticks () for quoting identifiers, which can also contain comment-like syntax. For example, ``SELECT my--columnFROM table`` would have the identifier incorrectly processed if it contains--or/* */`. The tokenizer should also preserve backtick-quoted identifiers to ensure correct handling across different SQL dialects.

Copilot · 2026-02-17T21:45:45Z

+def is_raw_sql(query: str) -> bool:
+    """Return True if query is an inline SQL string rather than a registry key.
+
+    Registry keys use dot-notation (e.g. ``users.get_by_id``) and never
+    contain whitespace.  Any SQL statement will contain at least one space.
+    """
+    return any(c.isspace() for c in query)


The is_raw_sql function determines whether a string is inline SQL or a registry key by checking for whitespace characters. However, this approach could be fragile if a registry key accidentally contains whitespace (e.g., due to user error or copy-paste). This would cause the registry key to be treated as inline SQL and potentially sanitized, leading to unexpected behavior. Consider adding validation when registry keys are registered to ensure they don't contain whitespace, or use a more explicit mechanism to distinguish between inline SQL and registry keys (e.g., requiring a prefix or using a separate parameter).

Copilot · 2026-02-17T21:45:46Z

+def coerce_params(
+    params: dict[str, Any] | tuple[Any, ...] | list[Any] | Any,
+) -> dict[str, Any] | tuple[Any, ...] | None:
+    """Normalize *params* to a dict, tuple, or None.
+
+    * ``None`` / ``dict`` → returned as-is (named parameter binding).
+    * ``tuple`` / ``list`` → converted to ``tuple`` (positional binding).
+    * Any other scalar → wrapped in a single-element tuple.
+    """
+    if params is None or isinstance(params, dict):
+        return params
+    if isinstance(params, (tuple, list)):
+        return tuple(params)
+    return (params,)


The new inline SQL feature allows users to write SQL with ? placeholders and pass positional parameters (tuples), but the registry queries use :name placeholders. This mixing of parameter styles could be confusing for users. While technically it works (SQLite and other databases support both), it creates an inconsistency in the API where registry queries use named parameters and inline SQL can use positional parameters. Consider documenting this clearly or normalizing inline SQL to also use the same parameter style as registry queries for consistency.

Copilot · 2026-02-17T21:45:46Z

+class SQLSanitizer:
+    """Configurable sanitizer for inline SQL strings.
+
+    Applied only to raw SQL passed directly to engine/transaction methods.
+    Registry-loaded queries are always trusted and never sanitized.
+
+    Attributes:
+        strip_comments: Strip ``--`` and ``/* */`` comments before execution.
+        block_multiple_statements: Reject SQL that contains a statement-
+            terminating ``;`` followed by additional content (prevents query
+            stacking such as ``SELECT 1; DROP TABLE users``).
+        allowed_verbs: If not ``None``, only SQL statements whose first keyword
+            appears in this set are permitted.  ``None`` means no restriction.
+            Example: ``frozenset({"SELECT", "INSERT", "UPDATE", "DELETE"})``.
+    """


The SQLSanitizer class documentation should include a prominent warning that it does not protect against SQL injection if user-provided data is concatenated directly into the SQL string. Users must still use parameterized queries (with ? or :name placeholders) to prevent SQL injection. The sanitizer only provides defense-in-depth measures (comment stripping, statement blocking, verb restrictions) but is not a substitute for proper parameterization. Consider adding this warning to the docstring to prevent misuse.

Copilot · 2026-02-17T21:45:46Z

+def _tokenize(sql: str) -> list[tuple[str, str]]:
+    """Split *sql* into ``('string', …)`` and ``('code', …)`` tokens.
+
+    String literals (single-quoted, with ``''`` escapes) are preserved as-is.
+    Everything else is a ``'code'`` token.
+    """
+    tokens: list[tuple[str, str]] = []
+    i = 0
+    n = len(sql)
+    last = 0
+
+    while i < n:
+        if sql[i] == "'":
+            if i > last:
+                tokens.append(("code", sql[last:i]))
+            j = i + 1
+            while j < n:
+                if sql[j] == "'":
+                    j += 1
+                    if j >= n or sql[j] != "'":
+                        break  # end of literal
+                    j += 1  # '' escape — continue
+                else:
+                    j += 1
+            tokens.append(("string", sql[i:j]))
+            last = j
+            i = j
+        else:
+            i += 1
+
+    if last < n:
+        tokens.append(("code", sql[last:]))
+
+    return tokens


The _tokenize function does not detect or handle unterminated string literals. If a user provides SQL with an unclosed string literal (e.g., SELECT 'unclosed text), the tokenizer will scan to the end of the input looking for the closing quote, potentially treating legitimate SQL code as part of the string literal. This could bypass sanitization checks (e.g., comment stripping, statement checking) for the portion incorrectly identified as a string. Consider adding validation to detect unterminated string literals and raise an error, or at minimum, add test coverage for this edge case to document the current behavior.

codecov-commenter · 2026-02-17T21:46:20Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 61.29032% with 84 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
row_query/core/sanitizer.py	71.42%	34 Missing ⚠️
row_query/core/engine.py	40.00%	24 Missing ⚠️
row_query/core/transaction.py	37.14%	22 Missing ⚠️
row_query/core/params.py	85.71%	2 Missing ⚠️
row_query/core/registry.py	33.33%	2 Missing ⚠️

📢 Thoughts on this report? Let us know!

MaksimShevtsov · 2026-02-17T21:59:18Z

@copilot open a new pull request to apply changes based on the comments in this thread

Copilot · 2026-02-17T21:59:25Z

@MaksimShevtsov I've opened a new pull request, #7, to work on those changes. Once the pull request is ready, I'll request review from you.

…ction methods Co-authored-by: MaksimShevtsov <20194438+MaksimShevtsov@users.noreply.github.com>

Address PR review feedback: DRY violations, error handling, and sanitizer edge cases

…ting SQL execution calls

MaksimShevtsov and others added 3 commits February 17, 2026 22:30

Fix ruff formatting in sanitizer.py

a3ffdbf

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Fix ruff import ordering in test_sanitizer.py

cadd136

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

MaksimShevtsov requested a review from Copilot February 17, 2026 21:38

Copilot started reviewing on behalf of MaksimShevtsov February 17, 2026 21:38 View session

Copilot AI reviewed Feb 17, 2026

View reviewed changes

Initial plan

94a7b23

Copilot AI mentioned this pull request Feb 17, 2026

Address PR review feedback: DRY violations, error handling, and sanitizer edge cases #7

Merged

Copilot AI and others added 4 commits February 17, 2026 22:04

Move _resolve_sql to shared location and add error handling to transa…

8502291

…ction methods Co-authored-by: MaksimShevtsov <20194438+MaksimShevtsov@users.noreply.github.com>

Merge pull request #7 from MaksimShevtsov/copilot/sub-pr-6

c69f5e3

Address PR review feedback: DRY violations, error handling, and sanitizer edge cases

Refactor async transaction methods for improved readability by format…

9b3b86e

…ting SQL execution calls

Improve error handling in SQL sanitization and registry key validation

55c8387

MaksimShevtsov merged commit 5e44da2 into main Feb 17, 2026
14 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add inline SQL support with configurable sanitization#6

Add inline SQL support with configurable sanitization#6
MaksimShevtsov merged 8 commits into
mainfrom
001-sql-projection-engine

MaksimShevtsov commented Feb 17, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 17, 2026

Uh oh!

Copilot AI Feb 17, 2026

Uh oh!

Copilot AI Feb 17, 2026

Uh oh!

Copilot AI Feb 17, 2026

Uh oh!

Copilot AI Feb 17, 2026

Uh oh!

Copilot AI Feb 17, 2026

Uh oh!

Copilot AI Feb 17, 2026

Uh oh!

Copilot AI Feb 17, 2026

Uh oh!

codecov-commenter commented Feb 17, 2026 •

edited

Loading

Uh oh!

MaksimShevtsov commented Feb 17, 2026

Uh oh!

Copilot AI commented Feb 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

MaksimShevtsov commented Feb 17, 2026

Summary

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Feb 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

MaksimShevtsov commented Feb 17, 2026

Uh oh!

Copilot AI commented Feb 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov-commenter commented Feb 17, 2026 •

edited

Loading