Skip to content

Bump black from 22.12.0 to 24.3.0 in the pip group across 1 directory#1

Open
dependabot[bot] wants to merge 1 commit into
mainfrom
dependabot/pip/pip-81350e123e
Open

Bump black from 22.12.0 to 24.3.0 in the pip group across 1 directory#1
dependabot[bot] wants to merge 1 commit into
mainfrom
dependabot/pip/pip-81350e123e

Conversation

@dependabot
Copy link
Copy Markdown

@dependabot dependabot Bot commented on behalf of github Feb 28, 2025

Bumps the pip group with 1 update in the / directory: black.

Updates black from 22.12.0 to 24.3.0

Release notes

Sourced from black's releases.

24.3.0

Highlights

This release is a milestone: it fixes Black's first CVE security vulnerability. If you run Black on untrusted input, or if you habitually put thousands of leading tab characters in your docstrings, you are strongly encouraged to upgrade immediately to fix CVE-2024-21503.

This release also fixes a bug in Black's AST safety check that allowed Black to make incorrect changes to certain f-strings that are valid in Python 3.12 and higher.

Stable style

  • Don't move comments along with delimiters, which could cause crashes (#4248)
  • Strengthen AST safety check to catch more unsafe changes to strings. Previous versions of Black would incorrectly format the contents of certain unusual f-strings containing nested strings with the same quote type. Now, Black will crash on such strings until support for the new f-string syntax is implemented. (#4270)
  • Fix a bug where line-ranges exceeding the last code line would not work as expected (#4273)

Performance

  • Fix catastrophic performance on docstrings that contain large numbers of leading tab characters. This fixes CVE-2024-21503. (#4278)

Documentation

  • Note what happens when --check is used with --quiet (#4236)

24.2.0

Stable style

  • Fixed a bug where comments where mistakenly removed along with redundant parentheses (#4218)

Preview style

  • Move the hug_parens_with_braces_and_square_brackets feature to the unstable style due to an outstanding crash and proposed formatting tweaks (#4198)
  • Fixed a bug where base expressions caused inconsistent formatting of ** in tenary expression (#4154)
  • Checking for newline before adding one on docstring that is almost at the line limit (#4185)
  • Remove redundant parentheses in case statement if guards (#4214).

Configuration

... (truncated)

Changelog

Sourced from black's changelog.

24.3.0

Highlights

This release is a milestone: it fixes Black's first CVE security vulnerability. If you run Black on untrusted input, or if you habitually put thousands of leading tab characters in your docstrings, you are strongly encouraged to upgrade immediately to fix CVE-2024-21503.

This release also fixes a bug in Black's AST safety check that allowed Black to make incorrect changes to certain f-strings that are valid in Python 3.12 and higher.

Stable style

  • Don't move comments along with delimiters, which could cause crashes (#4248)
  • Strengthen AST safety check to catch more unsafe changes to strings. Previous versions of Black would incorrectly format the contents of certain unusual f-strings containing nested strings with the same quote type. Now, Black will crash on such strings until support for the new f-string syntax is implemented. (#4270)
  • Fix a bug where line-ranges exceeding the last code line would not work as expected (#4273)

Performance

  • Fix catastrophic performance on docstrings that contain large numbers of leading tab characters. This fixes CVE-2024-21503. (#4278)

Documentation

  • Note what happens when --check is used with --quiet (#4236)

24.2.0

Stable style

  • Fixed a bug where comments where mistakenly removed along with redundant parentheses (#4218)

Preview style

  • Move the hug_parens_with_braces_and_square_brackets feature to the unstable style due to an outstanding crash and proposed formatting tweaks (#4198)
  • Fixed a bug where base expressions caused inconsistent formatting of ** in tenary expression (#4154)
  • Checking for newline before adding one on docstring that is almost at the line limit (#4185)
  • Remove redundant parentheses in case statement if guards (#4214).

... (truncated)

Commits

Dependabot compatibility score

You can trigger a rebase of this PR by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore <dependency name> major version will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
  • @dependabot ignore <dependency name> minor version will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
  • @dependabot ignore <dependency name> will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
  • @dependabot unignore <dependency name> will remove all of the ignore conditions of the specified dependency
  • @dependabot unignore <dependency name> <ignore condition> will remove the ignore condition of the specified dependency and ignore conditions
    You can disable automated security fix PRs for this repo from the Security Alerts page.

Note
Automatic rebases have been disabled on this pull request as it has been open for over 30 days.

@dependabot dependabot Bot added the dependencies Pull requests that update a dependency file label Feb 28, 2025
Bumps the pip group with 1 update in the / directory: [black](https://github.com/psf/black).


Updates `black` from 22.12.0 to 24.3.0
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](psf/black@22.12.0...24.3.0)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:development
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot force-pushed the dependabot/pip/pip-81350e123e branch from 7784ed8 to e2533d9 Compare March 19, 2025 00:35
Ali-Razmjoo pushed a commit that referenced this pull request May 29, 2026
…st kernel via PyO3 (databricks#787)

* feat(backend/kernel): route use_sea=True through the Rust kernel

Phase 2 of the PySQL × kernel integration plan
(databricks-sql-kernel/docs/designs/pysql-kernel-integration.md).
Wires `use_sea=True` to a new `backend/kernel/` module that
delegates to the Rust kernel via the `databricks_sql_kernel` PyO3
extension (kernel PR databricks#13).

New module: `src/databricks/sql/backend/kernel/`

- `client.py` — `KernelDatabricksClient(DatabricksClient)`. Lazy-
  imports `databricks_sql_kernel` so a connector install without
  the kernel wheel doesn't `ImportError` at startup; only
  `use_sea=True` surfaces the missing-extra message. Implements
  open/close_session, sync + async execute_command (async_op=True
  goes through `Statement.submit()` and stashes the handle in a
  dict keyed on `CommandId`), cancel/close_command,
  get_query_state, get_execution_result, and the metadata calls
  (catalogs / schemas / tables / columns) via
  `Session.metadata().list_*`. Real server-issued session and
  statement IDs flow through (no synthetic UUIDs).
- `auth_bridge.py` — translate the connector's `AuthProvider`
  into kernel `Session` kwargs. PAT (including federation-wrapped
  PAT — `get_python_sql_connector_auth_provider` always wraps the
  base in `TokenFederationProvider`, so a naive isinstance check
  never matches) routes through `auth_type="pat"`. Everything
  else routes through `auth_type="external"` with a callback that
  delegates to `auth_provider.add_headers({})`. (External today
  is rejected by the kernel at `build_auth_provider`; the
  separate kernel-side enablement PR will flip it on.)
- `result_set.py` — `KernelResultSet(ResultSet)`. Duck-typed over
  `databricks_sql_kernel.ExecutedStatement` (sync execute) and
  `ResultStream` (metadata + async await_result) since both
  expose `arrow_schema()` / `fetch_next_batch()` /
  `fetch_all_arrow()` / `close()`. Same FIFO batch buffer the
  prior ADBC POC used, so `fetchmany(n)` for n smaller than the
  kernel's natural batch size doesn't re-fetch.
- `type_mapping.py` — Arrow → PEP 249 description-string mapper.
  Lifted from the prior ADBC POC; centralised here so future
  kernel-result wrappers reuse the same mapping.

Kernel errors → PEP 249 exceptions: `KernelError.code` is mapped
in a single table to `ProgrammingError` / `OperationalError` /
`DatabaseError`. The structured fields (`sql_state`,
`error_code`, `query_id`, …) are copied onto the re-raised
exception so callers can branch on them without reaching through
`__cause__`.

Routing: `Session._create_backend` flips the `use_sea=True` branch
to instantiate `KernelDatabricksClient` instead of the native
`SeaDatabricksClient`. The native `backend/sea/` module is left
in place (no users on `use_sea=True` after this PR; its long-
term fate is out of scope here).

Packaging: `[tool.poetry.extras] kernel = ["databricks-sql-kernel"]`.
`pip install 'databricks-sql-connector[kernel]'` pulls in the
kernel wheel; `use_sea=True` without the extra raises a pointed
ImportError telling the user how to install it.

Known gaps (acknowledged, will be follow-ups):
- Parameter binding (`execute_command(parameters=[...])`) raises
  NotSupportedError — PyO3 `Statement.bind_param` lands in a
  follow-up.
- Statement-level `query_tags` raises NotSupportedError.
- `get_tables(table_types=[...])` returns unfiltered rows (the
  native SEA backend's filter is keyed on `SeaResultSet`; needs
  a small port to operate on `KernelResultSet`).
- External-auth end-to-end blocked on the kernel-side
  `AuthConfig::External` enablement PR.
- Volume PUT/GET (staging operations): kernel has no Volume API.

Test plan:
- Unit: 37 new tests across
  `tests/unit/test_kernel_auth_bridge.py` (auth provider →
  kwargs mapping, including federation-wrapped PAT and the
  External trampoline call-counter check),
  `tests/unit/test_kernel_type_mapping.py` (Arrow type mapping
  + description shape), and
  `tests/unit/test_kernel_result_set.py` (buffer semantics,
  fetchmany across batch boundaries, idempotent close, close()
  swallowing handle-close failures). All pass.
- Full unit suite: 600 pre-existing tests still pass; one
  pre-existing failure (`test_useragent_header` — agent
  detection adds `agent/claude-code` in this env) was already
  failing on main, unrelated to this change.
- Live e2e against dogfood with `use_sea=True`: SELECT 1,
  `range(10000)`, `fetchmany` pacing, `fetchall_arrow`, all four
  metadata calls (returned 75 catalogs / 144 schemas in main /
  47 tables in `system.information_schema` / 15 columns),
  `session_configuration={'ANSI_MODE': 'false'}` round-trips,
  bad SQL surfaces as DatabaseError with `code='SqlError'`
  and `sql_state='42P01'` on the exception. All checks pass.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* refactor(backend/kernel): PAT-only auth, drop External trampoline

The earlier auth_bridge routed OAuth/MSAL/federation through the
kernel's External token-provider trampoline (a Python callable
the kernel invoked per HTTP request). Removing that for now.

Why: routing OAuth into the kernel inherently requires per-request
token resolution to keep refresh working during a long-running
session. Two viable mechanisms (kernel-native OAuth, or the
External callback); both have costs (duplicate OAuth flows vs
GIL-per-request). Punting the decision until there's actual
demand on use_sea=True.

Today: the bridge accepts PAT (including TokenFederationProvider-
wrapped PAT, which is how `get_python_sql_connector_auth_provider`
always shapes it). Any non-PAT auth_provider raises a clear
NotSupportedError pointing the user at use_sea=False (Thrift).

This shrinks the auth_bridge to ~50 lines and means the kernel-
side External enablement PR is no longer on the connector's
critical path — there's no kernel-side prerequisite for shipping
use_sea=True for PAT users.

Unit tests updated:
- TokenFederationProvider-wrapped PAT still routes to PAT (kept).
- Generic OAuth provider raises NotSupportedError (new).
- ExternalAuthProvider raises NotSupportedError (new).
- Silent non-PAT provider raises NotSupportedError (new) —
  reject the type itself rather than trying to extract a token
  we already know we can't use.

Live e2e against dogfood with use_sea=True (PAT): all checks
still pass (SELECT 1, range(10000), fetchmany pacing, four
metadata calls, session_configuration round-trip, structured
DatabaseError on bad SQL).

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* test(e2e): live kernel-backend (use_sea=True) suite

Moves the previously-ad-hoc /tmp/connector_smoke.py into the repo
as a real pytest module under tests/e2e/ — same convention as the
rest of the e2e suite. Uses the existing session-scoped
`connection_details` fixture from the top-level conftest so it
shares the credential surface with every other live test.

11 tests cover:
- connect() with use_sea=True opens a session.
- SELECT 1: rows + description shape (column name + dbapi type slug).
- SELECT * FROM range(10000): multi-batch drain.
- fetchmany() pacing across the buffer boundary.
- fetchall_arrow() returns a pyarrow Table.
- All four metadata methods (catalogs / schemas / tables / columns).
- session_configuration={'ANSI_MODE': 'false'} round-trips.
- Bad SQL surfaces as DatabaseError with `code='SqlError'` and
  `sql_state='42P01'` attached as exception attributes.

Module-level skips:
- `databricks_sql_kernel` not importable → whole module skipped via
  pytest.importorskip (the wheel hasn't been installed).
- Live creds missing → fixture-level skip with a pointed message.

Run: `pytest tests/e2e/test_kernel_backend.py -v`. All 11 pass
against dogfood in ~20s.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* fix(backend/kernel): defer databricks-sql-kernel poetry dep declaration

CI is failing across all jobs at \`poetry lock\` time:

    Because databricks-sql-connector depends on databricks-sql-kernel
    (^0.1.0) which doesn't match any versions, version solving failed.

The kernel wheel isn't yet published to PyPI — we verified the name
is available via the Databricks proxy, but the package itself hasn't
been built and uploaded yet. Declaring it as a poetry dep (even an
optional one inside an extra) requires the version to be resolvable,
and \`poetry lock\` runs as the setup step for every CI job: unit
tests, linting, type checks, all of them.

Fix: drop the \`databricks-sql-kernel\` dep declaration and the
\`[kernel]\` extra from pyproject.toml until the wheel is on PyPI.
The lazy import in \`backend/kernel/client.py\` still raises a
clear ImportError pointing at \`pip install databricks-sql-kernel\`
(or local maturin) when use_sea=True is invoked without the kernel
present.

When the kernel is published, a small follow-up will add back:

    databricks-sql-kernel = {version = "^0.1.0", optional = true}
    [tool.poetry.extras]
    kernel = ["databricks-sql-kernel"]

A pointed comment in pyproject.toml documents the deferred change.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* fix(backend/kernel): unit tests skip without pyarrow, mypy + black

Three CI failures after the poetry-lock fix uncovered three real
issues:

1. pyarrow is optional in the connector. The default-deps CI test
   job installs without it; the +PyArrow job installs with. The
   kernel backend's result_set.py + type_mapping.py import pyarrow
   eagerly (the kernel always returns pyarrow), and the unit tests
   import the backend at collection time — which crashes the
   default-deps job at ModuleNotFoundError.

   Fix: gate the three kernel unit tests on `pytest.importorskip(
   "pyarrow")` so they skip on default-deps and run on +PyArrow.
   Verified locally: 39 pass with pyarrow, 3 skipped without.
   No change to the backend module itself — nothing imports it
   until use_sea=True is invoked, and pyarrow is on the kernel
   wheel's runtime dep list so use_sea=True can't hit this either.

2. mypy: KernelDatabricksClient.open_session returns
   self._session_id, which mypy types as Optional[SessionId]
   because the field starts as None. Fix: bind the new id to a
   local non-Optional variable, assign to the field, return the
   local. CI's check-types runs cleanly on backend/kernel/ now;
   pre-existing mypy noise elsewhere isn't mine.

3. black --check: black 22.12.0 (the version CI pins) wants
   reformatting on result_set.py / type_mapping.py / client.py.
   Applied. Verified locally with the same black version.

All 39 kernel unit tests + 619 pre-existing unit tests pass.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* fix(backend/kernel): make package importable without the kernel wheel

The +PyArrow CI matrix installs pyarrow but not the
databricks-sql-kernel wheel (the wheel isn't on PyPI yet, and the
[kernel] extra is deferred — see commit 31ca581). The previous
fix gated unit tests on `pytest.importorskip("pyarrow")` but
test_kernel_auth_bridge.py was still pulled into a kernel-wheel
ImportError because:

  src/databricks/sql/backend/kernel/__init__.py
    -> from databricks.sql.backend.kernel.client import KernelDatabricksClient
        -> import databricks_sql_kernel  # ImportError on +PyArrow CI

The eager re-export from `__init__.py` was a convenience that
broke every consumer that only needed a submodule (type_mapping,
result_set, auth_bridge) — they all triggered the kernel wheel
import for no reason.

Fix:
- Drop the eager re-export from `kernel/__init__.py`. Comment
  documents why and points callers (= session.py::_create_backend,
  already this shape) at the direct `from .client import ...`.
- Drop the no-longer-needed `pytest.importorskip("pyarrow")` /
  `importorskip("databricks_sql_kernel")` from
  test_kernel_auth_bridge.py — auth_bridge.py itself has neither
  dep, so the test now runs on every CI matrix variant.
- test_kernel_result_set.py and test_kernel_type_mapping.py keep
  the pyarrow importorskip because they themselves use pyarrow.

Verified locally across the three matrix shapes:
- both pyarrow + kernel installed: 39 pass.
- pyarrow only (no kernel wheel — the +PyArrow CI shape): 39 pass.
- neither: 9 pass (auth_bridge only), 2 modules skip (the others
  use pyarrow).

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* test(e2e): skip use_sea=True parametrized cases when kernel wheel missing

The connector's coverage CI job runs the full e2e suite, several of
whose test classes parametrize ``extra_params`` over ``{}`` and
``{"use_sea": True}``. With ``use_sea=True`` now routing through
the Rust kernel via PyO3, those cases die at ``connect()`` with our
pointed ImportError because the ``databricks-sql-kernel`` wheel
isn't yet on PyPI — and that CI job (sensibly) doesn't try to
build it from a sibling repo.

Fix: ``pytest_collection_modifyitems`` hook in the top-level
``conftest.py`` that adds a ``skip`` marker to any parametrize case
with ``extra_params={"use_sea": True, ...}`` when
``importlib.util.find_spec("databricks_sql_kernel")`` returns
``None``. Behavior change is CI-only — local dev with the kernel
wheel installed (via ``maturin develop`` from the kernel repo)
runs those cases as before.

Once the kernel wheel is published, the [kernel] extra in
pyproject.toml gets enabled (see comment block there) and the
default-deps CI matrix will install it; the skip then becomes a
no-op.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* refactor(backend/kernel): address review feedback — mechanical fixes

Cleanup pass on the kernel-backend PR addressing reviewer feedback
that doesn't change observable behaviour:

- result_set.py: replace O(M²) `_buffered_rows` with running counter
  `_buffered_count` maintained by pull/take/drain (perf F6).
- result_set.py: docstring corrections — drop nonexistent
  `fetch_all_arrow` from kernel-handle contract (F20); document
  `buffer_size_bytes` as no-op on the kernel backend (F21).
- client.py: tighten `_reraise_kernel_error` signature to
  `_kernel.KernelError` only; drop dead passthrough branch and the
  defensive setattr try/except (F17).
- client.py: drop unused `_use_arrow_native_complex_types` kwarg (F18).
- client.py: collapse three `KernelResultSet(...)` construction sites
  through `_make_result_set` (renamed from `_metadata_result`) (F19).
- client.py: drop `metadata-` prefix from synthetic CommandId; use a
  plain `uuid.uuid4().hex` so anything reading `cursor.query_id`
  downstream sees a UUID-shaped string (F14).
- client.py: clear the raw access token from `_auth_kwargs` after the
  kernel session is constructed — kernel owns the credential from
  then on, no need to retain a cleartext copy on the connector
  instance (F24).
- auth_bridge.py: reject bearer tokens containing ASCII control
  characters at extraction time (defense-in-depth against header
  injection if a misbehaving HTTP stack ever places the token back
  into a header without scrubbing) (F25).
- tests/unit/test_kernel_auth_bridge.py: construct a real
  `TokenFederationProvider(http_client=Mock())` instead of bypassing
  `__init__` with `__new__` + monkey-patching `add_headers`. Exercises
  the real federation passthrough path the bridge sees in production
  (F12). Drop unused `MagicMock` import (F27).
- tests/e2e/test_kernel_backend.py: drop misleading CloudFetch claim
  on `test_drain_large_range_to_arrow` — 10000 BIGINT rows is ~80 KB,
  single inline chunk on a typical warehouse (F26).

All 39 existing kernel unit tests pass.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* feat(backend/kernel): introduce dedicated use_kernel flag + substantive review fixes

Major change: route the kernel backend through a new ``use_kernel=True``
connection kwarg instead of repurposing ``use_sea=True``. ``use_sea=True``
once again routes to the native pure-Python SEA backend (no behaviour
change); ``use_kernel=True`` routes to the Rust kernel via PyO3. The
two flags are mutually exclusive.

This addresses the largest reviewer concern from the multi-agent
review: silently hijacking a documented public flag broke OAuth /
federation / parameter-binding callers on ``use_sea=True`` who had no
opt-out. With the new flag, the kernel backend is fully opt-in and
existing ``use_sea=True`` users continue to get the native SEA backend
they signed up for.

Other substantive fixes:

- session.py: restore ``SeaDatabricksClient`` import + routing. Reject
  ``use_kernel=True`` + ``use_sea=True`` together with a clear
  ``ValueError``.
- client.py (kernel ``Cursor.columns``): update docstring to flag the
  ``catalog_name=None`` divergence — kernel requires a catalog,
  Thrift / native SEA do not (F13).
- conftest.py: drop the collection-time ``pytest_collection_modifyitems``
  hook that was skipping ``extra_params={"use_sea": True}`` cases. With
  ``use_sea=True`` back on the native SEA backend, those cases run as
  they did before this PR (F8).
- kernel/client.py: ``get_tables`` now applies the ``table_types``
  filter client-side using ``ResultSetFilter._filter_arrow_table``
  (the same helper the native SEA backend uses), wrapped in a tiny
  ``_StaticArrowHandle`` that flows the filtered table back through
  the normal ``KernelResultSet`` path. Replaces the previous
  "log a warning and return unfiltered" behaviour (F4).
- kernel/client.py: guard ``_async_handles`` with ``threading.RLock``
  so concurrent cursors on the same connection don't race on
  submit / close / close-session (F15).
- kernel/result_set.py: ``KernelResultSet.close()`` now drops the
  entry from ``backend._async_handles`` so async-submitted statements
  don't leave stale references behind (F5).
- kernel/{__init__,client,auth_bridge}.py, tests/e2e/test_kernel_backend.py:
  update docstrings, error messages, and the e2e fixture to refer to
  ``use_kernel=True`` instead of ``use_sea=True``.
- client.py (``Connection`` docstring): document the new
  ``use_kernel`` kwarg + its Phase-1 limitations.

New tests:

- tests/unit/test_kernel_client.py (38 cases): cover the 14-entry
  ``_CODE_TO_EXCEPTION`` table, ``_reraise_kernel_error`` attribute
  forwarding, the 6-entry ``_STATE_TO_COMMAND_STATE`` table, the
  no-open-session guards on every method, ``open_session`` double-open,
  ``parameters`` / ``query_tags`` rejection, ``get_columns``'
  catalog-required check, ``cancel_command`` / ``close_command``
  no-handle tolerance, ``get_query_state`` sync-path SUCCEEDED, the
  Failed-state re-raise, the synthetic-command-id UUID shape, and
  ``close_session`` cleanup even when per-handle close errors fire.
  Uses a fake ``databricks_sql_kernel`` module installed into
  ``sys.modules`` so the test runs with no Rust extension dependency
  (F9).

77/77 kernel unit tests pass.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* fix(backend/kernel): CI-greening — mypy + e2e module skip

- src/databricks/sql/backend/kernel/result_set.py: fix the 3 mypy
  errors at L237/239/241 by casting ``self.backend`` to
  ``KernelDatabricksClient`` (the base ``DatabricksClient`` doesn't
  declare ``_async_handles`` / ``_async_handles_lock``). Folds in
  gopalldb's nit (3249904284) — replace the explicit
  ``acquire()/try/finally/release()`` with a ``with`` block to match
  the rest of the file.
- tests/e2e/test_kernel_backend.py: harden the module-level skip so
  the suite doesn't run when the kernel wheel is absent in CI. The
  unit suite installs a fake ``databricks_sql_kernel`` ``ModuleType``
  into ``sys.modules`` so the connector's import-time ``import
  databricks_sql_kernel`` succeeds without the Rust extension; that
  fake leaks across into the same pytest session and
  ``pytest.importorskip`` happily returns it. A real wheel exposes
  ``__file__`` (compiled extension on disk); the fake does not.
  Skip the module when ``__file__`` is missing.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* fix(backend/kernel): address gopalldb minor review comments (m1, m4)

m1 — install hint (comment 3249904266):
  The ``databricks-sql-kernel`` wheel is not yet published on PyPI;
  ``pip install databricks-sql-kernel`` either finds nothing or
  pulls a squatted package. Drop the misleading hint from
  ``ImportError`` and from the ``use_kernel`` docstring on
  ``databricks.sql.connect``; point users at the
  ``maturin develop --release`` dev path until the wheel ships.

m4a — auth_bridge ValueError → ProgrammingError (comment 3249904276):
  Two sites in ``_extract_bearer_token`` / ``kernel_auth_kwargs``
  were raising bare ``ValueError`` for caller-misuse cases (control
  chars in the token, PAT provider that produced no Authorization
  header). The rest of the kernel-backend error surface uses PEP
  249 exception types — code paths that catch ``DatabaseError`` /
  ``ProgrammingError`` would miss these. Convert to
  ``ProgrammingError`` and update the unit test.

m4b — description null_ok (comment 3249904282):
  ``description_from_arrow_schema`` was hardcoding the 7th tuple
  element to ``None`` even though ``pyarrow.Field.nullable`` is
  available. PEP 249 §Cursor.description defines ``null_ok`` as
  "True if NULL values are allowed"; callers branching on it
  would have lost useful information the kernel already provides.
  Now emits ``field.nullable``; added a unit test covering both
  nullable and non-nullable fields; updated the two existing tests
  that asserted the old all-``None`` shape.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* fix(backend/kernel): substantive review fixes — M1, M2, M3, m2, m3

Addresses gopalldb's four major / two minor remaining review
comments. The shared error-mapping primitives move to a new
``_errors.py`` module so both ``client.py`` and ``result_set.py``
can use them without ``result_set.py`` importing from ``client.py``.

M1 — async handle leak in get_execution_result (3249904251):
  ``ResultStream`` from ``await_result()`` is wrapped in
  ``KernelResultSet``; the underlying ``ExecutedAsyncStatement``
  has no further role once the stream is in hand. Close it
  immediately, drop the entry from ``_async_handles``, and add the
  guid to ``_closed_commands``. A failed ``async_exec.close()`` is
  logged but doesn't break the result-set return — the kernel's
  Drop impl reaps server-side state.

M2 — PyO3 native exceptions wrapped as OperationalError (3249904255):
  Added ``kernel_call(what)`` context manager (in the new
  ``_errors.py``). ``KernelError`` flows through
  ``reraise_kernel_error`` as before; anything else
  (``TypeError`` / ``OverflowError`` / ``ValueError`` from PyO3
  argument conversion, extension-internal errors) is wrapped in
  ``OperationalError`` so DB-API callers only ever see PEP 249
  exception types. Applied at every PyO3 call site:
  ``open_session``, ``execute_command``, ``cancel_command``,
  ``close_command``, ``get_query_state``, ``get_execution_result``,
  ``get_catalogs`` / ``get_schemas`` / ``get_tables`` /
  ``get_columns``, plus ``fetch_next_batch`` /
  ``arrow_schema`` in ``KernelResultSet``.

M3 — KernelError during result-set construction (3249904259):
  ``KernelResultSet.__init__`` calls ``kernel_handle.arrow_schema()``
  which can itself raise. Every call to ``_make_result_set`` is now
  inside a ``kernel_call`` scope so the schema-fetch error becomes a
  mapped PEP 249 exception instead of leaking raw ``KernelError``.

m3 — get_query_state of a closed async command (3249904273):
  Added ``_closed_commands: Set[str]`` (guarded by the existing
  ``_async_handles_lock``). ``close_command`` records the guid;
  ``close_session`` records every swept guid;
  ``get_execution_result`` records its own command after closing
  the async_exec. ``get_query_state`` now returns
  ``CommandState.CLOSED`` instead of falling through to
  ``SUCCEEDED`` for these.

m2 — unit test for get_tables table_types client-side filter
(3249904269):
  Added ``test_get_tables_with_table_types_filters_rows`` and
  ``test_get_tables_without_table_types_returns_full_stream`` in
  ``tests/unit/test_kernel_client.py``. The first feeds a fake
  stream with mixed ``TABLE`` / ``VIEW`` rows and asserts only
  ``TABLE`` survives; the second confirms the no-filter path
  bypasses the drain-and-rewrap and returns all rows unchanged.

Plus new tests for every change above:
  - test_pyo3_native_exception_wrapped_as_operational_error (M2)
  - test_pyo3_native_exception_wrapped_for_metadata_calls (M2)
  - test_kernel_error_during_result_set_construction_is_mapped (M3)
  - test_get_execution_result_closes_async_exec_and_drops_tracking (M1)
  - test_get_execution_result_does_not_raise_on_async_exec_close_failure (M1)
  - test_get_query_state_returns_closed_after_close_command (m3)
  - test_close_session_marks_swept_handles_as_closed (m3)

87/87 kernel unit tests pass (added 9).

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* refactor(backend/kernel): replace kernel_call context manager with explicit try/except

Using a ``with`` block for error translation is bad form: ``with``
conventionally signals resource lifecycle (locks, files), so
``with kernel_call("X"):`` hides the fact that the block raises a
mapped exception. Replace with explicit ``try/except Exception as
exc: raise _wrap_kernel_exception("X", exc) from exc`` at every
PyO3 call site.

What changed:

- ``_errors.py``: drop the ``kernel_call`` context manager; export
  ``wrap_kernel_exception(what, exc)`` — a pure function that maps
  a raw exception to a PEP 249 one (KernelError → mapped class via
  ``reraise_kernel_error``; existing Error → passthrough; anything
  else → OperationalError).
- ``client.py``: replace 12 ``with _kernel_call(...):`` blocks with
  inline try/except calling the helper.
- ``result_set.py``: same for the 3 sites (arrow_schema on
  construct, fetch_next_batch in _pull_one_batch, fetch_next_batch
  in _drain).

Behaviour is unchanged — same KernelError → PEP 249 mapping, same
non-KernelError → OperationalError wrapping. Just spelled in a way
that makes control flow visible at the call site and keeps
tracebacks one frame shorter (no ``__exit__`` frame).

87/87 kernel unit tests still pass.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* style(backend/kernel): black format client.py

Fixes ``check-linting`` CI: one long line in ``execute_command``'s
async-submit branch needed to wrap (the ``CommandId.from_sea_statement_id``
call). Pure formatting; no behaviour change.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* fix(backend/kernel): address gopalldb's P1 review comments

Local kernel-package fixes from the follow-up review pass on PR
databricks#787 (databricks#787 (comment)).
The two cross-cutting / pre-existing issues (P0 #1 async leak in
shared ``Cursor.close()``, P1 databricks#8/databricks#9 fetch-after-close not raising
``InterfaceError``) are tracked separately as databricks#791 and databricks#792 — they
affect Thrift and SEA equally and are out of scope for this PR.

P1 databricks#2 — call ``backend.close_command`` from ``KernelResultSet.close()``:
  The override previously bypassed the base ``ResultSet.close()``
  entirely. Honour the contract by invoking
  ``backend.close_command(self.command_id)`` after the per-handle
  close + ``_async_handles`` pop. Our own ``close_command`` is
  tolerant of already-popped guids (no-op), so this is safe even
  though the per-handle close above already released server-side
  state. Doesn't go through ``super().close()`` directly because
  the base path warns when ``self.results`` is ``None`` (which it
  is for kernel result sets) — replicate the meaningful part of
  the base contract without the noisy warning.

P1 databricks#3 — case-insensitive ``Bearer`` prefix in auth_bridge:
  RFC 6750 §2.1 says the Authorization scheme is case-insensitive.
  Match leniently in case a federation proxy or future provider
  normalises the casing differently — failing closed would surface
  as a confusing ``ProgrammingError`` from the bridge.

P1 databricks#4 — drop redundant ``__cause__`` set in ``reraise_kernel_error``:
  ``raise wrap_kernel_exception(...) from exc`` already sets
  ``__cause__`` at the call site; the manual assignment in
  ``reraise_kernel_error`` was redundant. Updated the test that
  asserted on it; added ``test_kernel_error_chains_through_wrap``
  to cover the end-to-end chain.

P1 databricks#5 — ``get_tables`` filter looks up TABLE_TYPE by name:
  Replaced ``schema.field(5).name`` (positional) with the literal
  ``"TABLE_TYPE"`` plus a missing-column guard. A future kernel
  reshape of ``SHOW TABLES`` now surfaces an explicit
  ``OperationalError`` instead of silently filtering the wrong
  column. The case-sensitive contract is now documented in the
  surrounding comment (matches SEA + warehouse).

P1 databricks#6 — ``KernelResultSet.close()`` guards on ``connection.open``:
  ``__del__``-driven close arriving after the parent connection is
  already closed previously issued a kernel call into a disposed
  session. Skip the kernel call entirely in that case; still mark
  the result set ``CLOSED`` locally so ``__del__`` is idempotent.

P1 databricks#7 — defer ``kernel_auth_kwargs`` to ``open_session``:
  ``KernelDatabricksClient.__init__`` previously called
  ``kernel_auth_kwargs(auth_provider)`` and stored the bearer
  token on ``self._auth_kwargs`` indefinitely. If ``open_session``
  never ran (test paths, error paths, lazy retries) the token
  stayed resident on the connector object. Build the kwargs
  locally inside ``open_session`` now — local variable, GC-eligible
  the moment ``open_session`` returns.

Also tightened the install-hint comment in ``pyproject.toml`` to
match the rest of the codebase (the wheel isn't on PyPI; only the
``maturin develop`` path is supported today).

88/88 kernel unit tests pass (added 1).

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* fix(backend/kernel): address gopalldb's follow-up P1/P2 review comments

Review comment: databricks#787 (comment)

P1.1 — get_query_state handles non-BaseException failure:
  Previously ``raise _reraise_kernel_error(failure) from failure``
  would explode with ``TypeError: exception causes must derive
  from BaseException`` if the kernel's ``status()`` ever returned a
  ``failure`` that wasn't a real ``KernelError`` (struct, dict —
  kernel API drift). Now route through ``_wrap_kernel_exception``,
  which isinstance-checks and falls through to ``OperationalError``
  for non-PEP-249-shaped values. New unit test
  ``test_get_query_state_handles_non_baseexception_failure``.

P1.2 — regression tests for prior fixes:
  - ``test_bearer_prefix_is_case_insensitive`` (P1 databricks#3 from the
    earlier review): parametrised over "Bearer "/"bearer "/
    "BEARER "/"BeArEr ". RFC 6750 §2.1 compliance was added but
    not covered by a test.
  - ``test_close_skips_kernel_call_when_connection_already_closed``
    (P1 databricks#6 from the earlier review): exercises the
    ``connection.open is False`` branch in
    ``KernelResultSet.close()`` — asserts neither the kernel
    handle's close nor backend.close_command fire, but the result
    set still ends in ``CLOSED`` so ``__del__`` is idempotent.
  - ``test_token_with_control_chars_or_whitespace_rejected``
    (pre-existing security guard): parametrised over NUL / CR / LF
    / DEL / space / tab — the regex previously missed space (0x20).
    Covered + extended.

P2.1 — wrap session_id extraction in open_session:
  ``SessionId.from_sea_session_id(self._kernel_session.session_id)``
  was outside the ``try/except _wrap_kernel_exception`` scope. A
  raw PyO3 attribute-conversion error on the
  ``self._kernel_session.session_id`` access could escape unwrapped.
  Now wrapped.

P2.2 — drop redundant _async_handles.pop in result-set close:
  After the M1 fix (``get_execution_result`` pops the guid before
  constructing the result set), the pop in ``KernelResultSet.close``
  is dead code — every call misses. Sync-execute and metadata
  paths never registered in ``_async_handles`` to begin with.
  Drop the per-close pop; rewrote the surrounding comment so
  ``backend.close_command`` is now the single bookkeeping seam.

P2.3 — control-char regex includes whitespace:
  Extended ``[\x00-\x1f\x7f]`` → ``[\x00-\x20\x7f]`` and renamed
  ``_CONTROL_CHAR_RE`` → ``_TOKEN_REJECT_RE``. RFC 6750 forbids
  whitespace within the credential token itself; a token like
  ``"Bearer  doubled-space-token"`` previously slipped past the
  injection guard. Test parametrised above.

type_mapping reuse — use SqlType constants:
  Replaced literal type strings ("bigint", "string", …) in
  ``_arrow_type_to_dbapi_string`` with the ``SqlType`` constants
  from ``databricks.sql.backend.sea.utils.conversion`` — same
  single source of truth the SEA backend already uses, so the
  kernel and SEA backends emit byte-identical type-code strings.
  The Arrow → SqlType lookup itself stays local to the kernel
  (SEA receives type-text from the server and normalises it; the
  kernel receives Arrow schemas directly), but the names are now
  shared.

100/100 kernel unit tests pass (added 12).

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

---------

Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dependencies Pull requests that update a dependency file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants