[SEA-NodeJS] (8/8) M1 Phase 1 — OAuth M2M + OAuth U2M (5 review rounds, ZERO HIGH at close)#383
[SEA-NodeJS] (8/8) M1 Phase 1 — OAuth M2M + OAuth U2M (5 review rounds, ZERO HIGH at close)#383msrathore-db wants to merge 10 commits into
Conversation
|
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
e9131ae to
e7c979d
Compare
|
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
9b057f0 to
1a9ddd9
Compare
3917148 to
0eb08b8
Compare
|
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Adds OAuth M2M and U2M onto the SEA auth path. The auth-u2m worktree
landed both at once (rather than rebasing on top of the M2M branch)
because the JS adapter flow-selector (`oauthClientSecret defined ?
M2M : U2M`, mirroring thrift `DBSQLClient.ts:143`) is cleanest when
both arms exist together — the no-secret branch was rejection-only
in the M2M-alone state.
The napi binding's `AuthMode` enum gains a third variant (`OAuthU2m`),
crossing the FFI as the PascalCase string `'OAuthU2m'`. The JS adapter
hardcodes `oauthRedirectPort: 8030` on the U2M payload to override
the kernel default of 8020 — preserves parity with thrift, which
defaults to 8030 (`OAuthManager.ts:245`). All other U2M knobs
(`client_id`, `scopes`, `callback_timeout`, `token_url_override`)
stay at kernel defaults; thrift hides them from its public surface
too, so SEA follows the same pattern.
`OAuthPersistence` is rejected on U2M with an explicit M1-Phase-2
deferral message: thrift exposes the hook, the kernel doesn't yet —
parity gap to close once `AuthConfig::External` lands. The kernel
disk cache at `~/.config/databricks-sql-kernel/oauth/{sha256}.json`
covers the standard flow today. Azure-direct knobs
(`azureTenantId` / `useDatabricksOAuthInAzure`) rejected on both
M2M and U2M with the same "Phase 2" message — kernel uses workspace
OIDC which works for Azure-databricks workspaces regardless.
Task: M1 OAuth M2M + U2M (sea-auth feature, U2M worktree).
Files:
- native/sea/src/database.rs — AuthMode { Pat, OAuthM2m, OAuthU2m } +
ConnectionOptions union + open_session dispatch with U2M arm
forwarding `oauth_redirect_port` from JS and leaving every other
U2M kernel knob at None
- native/sea/index.{d.ts,js} — regenerated napi artifact
- lib/sea/SeaAuth.ts — buildSeaConnectionOptions grows M2M + U2M
branches; flow selector mirrors thrift; persistence rejection
message reads as a parity gap, not a feature add
- lib/sea/SeaNativeLoader.ts — SeaNativeBinding.openSession type
accepts the three-arm discriminated payload
- tests/unit/sea/auth-pat.test.ts — assertions updated for new
`authMode: 'Pat'` round-trip; no-secret OAuth now asserts U2M
happy-path dispatch
- tests/unit/sea/auth-m2m.test.ts — new (8 cases — same as the
M2M-worktree commit, minus the now-obsolete no-secret rejection)
- tests/unit/sea/auth-u2m.test.ts — new (7 cases — happy path,
port 8030 hardcode, clientId not propagated, path slash prepend,
Azure rejected, persistence rejected, SeaBackend round-trip)
- tests/integration/sea/auth-m2m-e2e.test.ts — env-gated live e2e
(mirrors the M2M-worktree e2e)
- tests/integration/sea/auth-u2m-e2e.test.ts — new (it.skip pending
TBD-oauth_u2m_test_harness; comment points at testing-agent's
Playwright/Puppeteer harness work)
Tests:
- Unit: 55/55 pass (`npm run test -- 'tests/unit/sea/**/*.test.ts'`):
13 PAT (assertions updated for authMode + no-secret now U2M),
8 M2M, 7 U2M, 25 SeaErrorMapping regression, 2 ConnectionOptions
base.
- U2M e2e: 1 pending (intentional `it.skip` — gated on browser
harness).
- M2M e2e: same as the M2M-worktree commit — kernel-side OAuth
plumbing reaches the workspace; pecotesting SP credentials produce
the workspace's `invalid_client` (verified reproducible via
direct curl), an environmental issue not a code defect.
Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…helper, doc + loader cleanup Ports the M2M-side round-1 review fixes (commit 88d7d21 on sea-auth-m2m) onto the U2M worktree so the two branches stay aligned in review quality. The U2M-specific work in 5eba37f is unchanged; this commit is pure cleanup applied across all three SEA-auth test files (PAT / M2M / U2M). - Extracted `makeFakeBinding()` to `tests/unit/sea/_helpers/fakeBinding.ts` and refactored all three auth-*.test.ts files to import it. The U2M-worktree commit had THREE copies of the closure (the third was the cause for the bloat reviewer's "rule of three" call-out that the M2M-worktree fixup was meant to forestall). - Dropped the unused `SeaAuthMode` type alias from `SeaAuth.ts` — zero callers; inlined literals already power the discriminated union. - Tightened `SeaNativeBinding.openSession` parameter type to consume the discriminated `SeaNativeConnectionOptions` union from `SeaAuth.ts`, restoring compile-time per-mode field enforcement at the FFI seam. - Augmented the Rust `AuthMode` doc-comment with the napi-emission explanation (PascalCase verbatim, not kebab-case) plus the cross-reference reminder to extend `open_session()`'s match on every new variant. - Added the const-enum hazard note to `SeaNativeConnectionOptions`' doc-comment, locking in the duplicated-literal pattern as deliberate (importing the napi `const enum AuthMode` breaks `isolatedModules`). - Cleaned up the conditional-type-cast lobotomy in `auth-pat.test.ts` on the token-provider fixture; plain `as any` + eslint-disable. Skipped findings (same justification as M2M-worktree commit): F-3 borderline error-class taxonomy, F-4 cosmetic arg-order, F-5 redundant comment-anchor (compiler already enforces), F-8 null vs undefined paranoia, F-9 mocha named-function style. Tests: - Unit: 55/55 pass (same count as 5eba37f — pure restructure). - Native build: clean (1m04s release profile). - Type-check: clean (tsc --noEmit). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…ediums) Ports the M2M-side devils-advocate round-1 fixes (commit eef9d30 on sea-auth-m2m) onto the U2M worktree. Same shape, with the U2M-specific adjustments noted below. Added `decodeNapiKernelError(err: unknown): Error` to `SeaErrorMapping.ts` and wrapped `SeaBackend.openSession`'s napi call in `try`/`catch` + the decoder. The wiring step was missing on both branches; now M2M and U2M users see typed errors (`AuthenticationError` on Unauthenticated, `HiveDriverError` on NetworkError, etc.) instead of raw `Error` with sentinel-prefixed message bodies. `buildSeaConnectionOptions` rejects: - PAT path + stray OAuth fields → HiveDriverError "cannot supply both `token` and `oauthClientId`/`...Secret`". - OAuth path (M2M or U2M) + stray `token` → HiveDriverError "cannot supply `token` alongside `authType: 'databricks-oauth'`". The OAuth-side check fires BEFORE the M2M/U2M split, so the U2M arm gets the protection too. Rewrote three M2M-arm error messages plus the U2M persistence message to be time-bound free: - "U2M lands in sea-auth-u2m" → "OAuth M2M requires `oauthClientSecret`. For interactive OAuth (U2M), see the driver OAuth U2M docs." - "Azure-direct OAuth ... is a later M1 task" → "Azure-direct OAuth ... is not supported. The workspace-OIDC discovery path handles Azure workspaces today without these options." - "M1+ follow-ups" → "Supported modes on the SEA backend today: ..." - U2M persistence: dropped "M1 Phase 2" — kept the technical explanation citing kernel-side `AuthConfig::External` plumbing (durable; describes the kernel gap, not the feature roadmap). Zero hits for `sea-auth-u2m|sea-auth-m2m|later M1|M1\+ follow|M1 Phase` in `lib/sea/`. Updated regex assertions in lockstep. `isBlankOrReserved(s)` helper trims + rejects empty-after-trim and literal `'undefined'` / `'null'` strings. Applied to `token`, `oauthClientId`, `oauthClientSecret`. E2e env-gate hardened the same way. Added `tests/unit/sea/auth-edge-cases.test.ts` with 18 cases: - Whitespace + reserved-literal PAT (3) - Same for `oauthClientId` / `oauthClientSecret` on M2M (4) - Ambiguous-creds: PAT+id, PAT+secret, M2M+token, U2M+token (4) - Explicit-undefined Azure-direct discriminants on M2M + U2M (3) - `decodeNapiKernelError` for Unauthenticated, NetworkError, SQLSTATE preservation, plain napi pass-through, corrupted envelope fallback (5) Added a bad-secret `it(...)` block to `auth-m2m-e2e.test.ts` that asserts `AuthenticationError` + `invalid_client`. Closes the loop on DA-F1 by proving the kernel-side error path surfaces correctly. The U2M e2e remains `it.skip` pending the Playwright/Puppeteer harness; once it lands, the same negative-path pattern can be added there. L-F3, L-F4, L-F5 — deferred per the previous fixup's reasoning. Tests: - Unit: 74/74 pass (was 55 before this commit: +18 from edge-cases + 1 from new pending placeholder count). - TypeScript build: clean. - Native build: unchanged (no Rust changes this commit). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…raise on U2M+id, case-insensitive validation, preserve all error envelope fields Addresses devils-advocate-auth-u2m-1 round-1 findings on commit 8e99b40. NF-1 is a HIGH continuation of DA-F1 — the previous fixup wired `decodeNapiKernelError` at `SeaBackend.openSession` but missed the second extant napi call site, `SeaSessionBackend.close()`. NF-2 through NF-4 are mediums. `SeaSessionBackend.close()` calls `await this.connection.close()` on the napi `Connection`. Kernel errors from there (e.g., a delete- session RPC failure that the kernel chose to surface despite the fire-and-forget pattern) were reaching JS callers as raw `Error` with `__databricks_error__:` envelope. Wrapped in try/catch + `throw decodeNapiKernelError(err)` — same 3-line shape as openSession. Per `grep -rn "await this\.native\.\|await this\.connection\." lib/sea/`, these are the only two napi call sites on `sea-auth-u2m`. Future napi call sites on sea-execution / sea-results / sea-operation branches need the same wrap (Phase 2; tracked elsewhere). Previously, `oauthClientId` set without `oauthClientSecret` was silently dropped (kernel uses built-in `databricks-cli`). A user setting the field clearly expects it honored; silent-drop hid intent. Flipped to throw HiveDriverError with explicit guidance ("kernel uses 'databricks-cli'; omit for U2M or supply oauthClientSecret for M2M"). The matching unit test in `tests/unit/sea/auth-u2m.test.ts` flipped from "drops the supplied oauthClientId" to "rejects oauthClientId on the U2M path with a clear 'not supported' error". `isBlankOrReserved` previously compared `trimmed === 'undefined'` and `=== 'null'`, so `'UNDEFINED'`, `'Null'`, `'NULL'`, `'nUlL'`, etc. slipped through. Changed to `trimmed.toLowerCase()` before the comparison. New unit case in `auth-edge-cases.test.ts` iterates five case variants and asserts each rejects. `decodeNapiKernelError` previously routed only `{code, message, sqlState}` to the JS error class. The kernel envelope at `native/sea/src/error.rs:50-89` actually carries 7 fields total (`code`, `message`, `sqlState`, `errorCode`, `vendorCode`, `httpStatus`, `retryable`, `queryId`). The remaining 5 were silently dropped. Thrift parity demand: thrift errors carry these fields. Added `attachMetadata(error, meta)` helper that `Object.defineProperty`s each non-undefined field as a non-enumerable own-property — matches the way `attachSqlState` works and the way Node attaches `.code` to system errors. Two new unit tests verify (a) all 5 fields round-trip through a synthetic envelope, (b) they remain non-enumerable (absent from `Object.keys(err)`) but accessible via direct property read. - NF-5 (envelope versioning): `__databricks_error__:` payload has no `version` field. A kernel refactor that renames a field would silently break the JS decoder. Phase 2: add `version: 1` to the kernel-side serialization, check + fallback on JS side. Not in this commit because it requires coordinated kernel-side change. - NF-6 (U2M e2e harness): `auth-u2m-e2e.test.ts` is fully `it.skip` pending the Playwright/Puppeteer harness. Devils- advocate noted that port-collision + headless-negative-path subsets don't strictly need a browser. Phase 2: enable those subsets when the harness lands. Not in this commit because the work is mostly harness-wiring rather than test code. Tests: - Unit: 79/79 pass (was 74 before this commit: +5 — 1 case-insensitive reserved literal sweep, 1 M2M oauthClientSecret reserved-literal reject, 2 envelope-metadata preservation, 1 close() decode + 1 flipped from drop-to-reject which kept the count net same — but the OAuthClientId test rewrite is on a different file). - TypeScript build: clean. - Native build: unchanged (no Rust changes this commit). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…cate, type-guard envelope, treat blank secret as U2M Addresses devils-advocate-auth-u2m-2 round-1 findings on commit 98d5ecf. NF-N1 is a real bug (collision between the kernel envelope's textual `errorCode` field and the pre-existing enum-typed `errorCode` on `OperationStateError` / `RetryError`). NF-N2..NF-N4 are mediums. Includes B-4 collapse (one defineProperty helper for both top-level sqlState and the kernel metadata namespace). ## NF-N1 (HIGH) — namespace kernel metadata + B-4 collapse Before this commit, `decodeNapiKernelError` `defineProperty`d each of the 5 kernel envelope fields (`errorCode`, `vendorCode`, `httpStatus`, `retryable`, `queryId`) directly on the JS error. But `OperationStateError.ts:21` and `RetryError.ts:12` already declare a top-level `errorCode: enum` field, and `DBSQLOperation.ts:209` switches on `err.errorCode === OperationStateErrorCode.Canceled`. A Cancelled kernel envelope with `errorCode: "USER_REQUESTED_CANCEL"` would clobber the enum string `'CANCELED'`, silently breaking cancel detection. Going with option (a) from team-lead's three remediation paths: nest the 5 kernel envelope fields under a single `error.kernelMetadata.*` namespace. Clean separation, no surprise, matches the way `attachSqlState`'s pattern keeps `sqlState` at the top level (which is collision-free). Folded B-4 simultaneously: replaced the two helpers (`attachSqlState`, `attachMetadata`) with one `defineErrorMetadata(error, key, value)` that owns the `defineProperty` flags. Both `sqlState` (top-level) and `kernelMetadata` (namespaced) go through the same helper now. ## NF-N2 (medium) — dedupe e2e `looksReal` against production `auth-m2m-e2e.test.ts:58-62` had a `looksReal` predicate that was still case-sensitive even though round-2's `isBlankOrReserved` is case-insensitive. Exported `isBlankOrReserved` from `SeaAuth.ts` and imported it in the e2e test. Eliminates the predicate-drift risk (also resolves the bloat-watchdog's B-3). ## NF-N3 (medium) — blank/reserved oauthClientSecret routes to U2M A user passing `oauthClientSecret: process.env.MY_SECRET || ''` previously hit the M2M arm's "secret must be non-empty" rejection, which never mentions U2M. Now blank/whitespace/reserved-literal secrets route to the U2M arm — where if `oauthClientId` is also set, the dedicated "not supported on U2M" rejection fires (round-2 NF-2 work). The error message correctly points at the right flow. Updated 5 existing test cases that had assumed the old M2M-rejects behavior; they now assert the U2M-via-id-rejection path. Added 3 new cases (empty string, whitespace-only, literal `'undefined'` routing to U2M happy path when no clientId is set). ## NF-N4 (medium) — per-field envelope type-guards `decodeNapiKernelError` previously cast `parsed` to a typed shape without runtime-validating the 5 optional fields. A kernel bug that emits `retryable: "true"` (string) instead of `true` (boolean) would propagate the wrong-typed property to JS callers. Added a `buildKernelMetadata(parsed: Record<string, unknown>)` helper that checks `typeof` per-field and discards mis-typed values. New unit test verifies all 5 wrong-type variants are dropped while a single correctly-typed field survives. Also: when the parsed envelope has no validated metadata fields, the decoder now omits the `kernelMetadata` namespace entirely (rather than attaching `{}`-shaped noise). Pinned by a new unit test. ## DEFERRED to Phase 2 - NF-N5 (low — SeaNativeLoader top-level require): per team-lead's guidance, defer to Phase 2 (deploy-time visibility issue). - Language-expert-auth-u2m-2's 1 medium + 6 low. ## Kernel fix consumption note team-lead's message indicated kernel-author landed the Error::io() → Error::unauthenticated() fix on `krn-napi-binding` at commit `a64479a`. My napi binding's path-dep (`native/sea/Cargo.toml`) points to `../../../../databricks-sql-kernel-sea-WT/async-public-api`, not `krn-napi-binding`. As of the round-3 build, `async-public-api` still has the OLD `Error::io()` at `m2m.rs:270`. So my rebuild this round picks up the new TS code only — NOT the kernel error- class fix. Consequence for the bad-secret e2e: it would STILL fail with HiveDriverError (not AuthenticationError) on a live run today, because the kernel envelope on the worktree my path-dep reaches still carries `code: "Internal"`. The kernel author's fix needs to land on `async-public-api` (the branch my path-dep tracks), or my path-dep needs to point at `krn-napi-binding`. Flagging to team-lead in the reply. Tests: - Unit: 85/85 pass (was 79 before this commit: +6 net — added 4 new cases for NF-N3 routing + NF-N1 collision + NF-N4 type-guard + NF-N4 metadata-omitted; flipped 3 existing M2M-rejection cases to U2M-rejection-via-id; updated 2 NF-4 metadata tests to read through the new namespace). - TypeScript build: clean. - Native build: cached (no Rust changes from this commit). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
The pecotesting workspace SP we were targeting (DATABRICKS_PECO_CLIENT_*) is NOT registered on the warehouse — yields `invalid_client` on token exchange. The Azure AD SP (DATABRICKS_PECOTESTING_AAD_CLIENT_*) IS registered on HTTP_PATH2 (warehouse 00adc7b6c00429b8), so flip the e2e to those creds. Both happy-path (openSession 730ms) and bad-secret (AuthenticationError 217ms) now pass against the napi-binding kernel worktree (carries DA-F1 fix a64479a). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…p envelope sentinel, trim RetryError doc - NF3-2 (HIGH): when oauthClientId is set and oauthClientSecret is blank/reserved, raise AuthenticationError (M2M intent) instead of routing to U2M which then raises HiveDriverError. The round-3 NF-N3 fix over-applied — U2M routing only kicks in when BOTH id and secret are blank/absent. - NF3-3 (MEDIUM): on corrupted-envelope JSON.parse failure, strip the internal __databricks_error__: sentinel from the message before returning to the caller. - NF3-6 (LOW): trim RetryError mention from KernelMetadata.errorCode doc-comments; no kernel ErrorCode currently maps to RetryError. Deferred per team-lead disposition: NF3-1 (kernel RequestTokenError sub-classification — Phase 2 kernel work), NF3-4 (e2e kernel-error-code assertion — blocked on NF3-1), NF3-5 (path-dep checksum — resolves when kernel publishes), NF3-7 (looksReal double-neg — cosmetic), LE3-1..7 (Phase 2 decoder polish). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…th test, message mutation safety, class-pin simplification - DA4-1 (HIGH): rewrite buildSeaConnectionOptions function-level JSDoc to describe the id-keyed flow selector (round-4 NF3-2 fix); the block-comment was updated but the public-API JSDoc was missed. - DA4-2 (MEDIUM): add test for the SeaAuth.ts:201-210 defense-in-depth U2M+id rejection branch (zero coverage after round-4 flipped the three tests that previously exercised it). - DA4-3a (MEDIUM): wrap err.message mutation on corrupted-envelope path in try/catch; fall back to a fresh HiveDriverError if the message descriptor is non-writable (defensive for future napi-rs versions; mutation preserves napi-side stack on the common path). - DA4-3b (MEDIUM): delete redundant constructor.name check from class-pin test; instanceof AuthenticationError is sufficient because instanceof is a one-way subclass check. Fix the comment that incorrectly claimed instanceof couldn't distinguish. - LE4-1 (MEDIUM): add this.name = 'AuthenticationError' constructor to the AuthenticationError class so err.name / err.toString() identify the subclass (3 lines; doesn't extend to sibling error classes in this PR). - DA4-4 (LOW): drop "reserved for future RetryError mappings" from three SeaErrorMapping.ts doc-comments — no kernel ErrorCode maps to RetryError and there's no design doc proposing one. - LE4-2 (LOW): unify the class-pin test to chai's idiomatic .to.throw(Class, /regex/) form, matching the rest of the suite. - LE4-4 (LOW): one-line comment justifying mutate-vs-clone choice on the corrupted-envelope path. Skipped per disposition: LE4-3 (idIsBlank/secretIsBlank symmetry — LE-4 own recommended leave-as-is). Deferred (carries over from round-3): NF3-1 kernel sub-classification (Phase 2 kernel work), NF3-4 e2e kernel-error-code assertion (blocked on NF3-1), NF3-5 path-dep SHA pin (resolves on kernel publish), LE3-1..3 SeaErrorMapping decoder polish (Phase 2 bundle). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…integration shape
Reconciles the sea-auth-u2m commits onto sea-integration (the linear-
stack target) by combining the post-integration SeaBackend / SeaSession-
Backend / SeaNativeLoader / test shapes with the OAuth M2M+U2M behaviors
introduced across the 5 review rounds on sea-auth-u2m.
Behaviors preserved from sea-auth-u2m:
- decodeNapiKernelError on openSession / executeStatement / close
(kernelMetadata namespace, sentinel-stripping, AuthenticationError
raising for kernel-side `Unauthenticated`)
- buildSeaConnectionOptions: id-keyed M2M-vs-U2M selector, blank-or-
reserved-literal credential rejection, defense-in-depth U2M-with-id
- AuthenticationError this.name constructor (LE4-1)
- discriminated SeaNativeConnectionOptions union (Pat | OAuthM2m |
OAuthU2m) at the napi-binding seam
Shape kept from sea-integration:
- SeaBackend constructor signature `{ context, nativeBinding? }`
(DBSQLClient.ts:241 call-site stays compiling)
- SeaSessionBackend as a separate module (was inline in sea-auth-u2m)
- SeaSessionBackendOptions: { connection, context, defaults?, id? }
- SeaSessionBackend session ids via uuidv4() (auth-only counter scheme
superseded; OAuth tests updated)
- post-integration SeaNativeLoader exports (SeaExecuteOptions,
SeaArrow{Batch,Schema}, SeaNative{Statement,Connection}) carry through
Test reconciliations:
- new SeaBackend(binding) → new SeaBackend({ nativeBinding: binding })
across 14 OAuth-test call-sites
- SeaBackendOptions.context made optional (constructor already
downcasts undefined; runtime callers always supply via DBSQLClient)
- session-id regex from /^sea-session-\d+$/ to UUIDv4
- _helpers/fakeBinding.ts openSession return cast to SeaNativeConnection
- execution.test.ts: the "rejects databricks-oauth (M0 PAT-only)" test
flips to "rejects unsupported auth modes (non-PAT, non-OAuth)" —
databricks-oauth is now the U2M happy path
- execution.test.ts: openSession round-trip asserts new authMode:'Pat'
field on the discriminated union
Skipped commit:
- 37156db (Cargo.toml path-dep flip) became empty after sea-integration's
napi-source relocation — the native crate is no longer at
native/sea/Cargo.toml, it's in the kernel workspace.
Verification (in /tmp/dry-run-nodejs):
- tsc --project tsconfig.build.json: clean
- SEA unit subset: 144/144 passing (87 sea-auth-u2m + 57 sea-integration)
- M2M e2e: 2/2 passing (happy-path 652ms + bad-secret AuthenticationError
233ms)
This is a dry-run-only commit. Do not push or force-push the real
sea-auth-u2m branch from this work; the real branch stays at e9131ae
until owner approves the rebase. Branch:
`dryrun/sea-auth-u2m-on-integration-fresh` lives in /tmp/dry-run-nodejs.
Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
…rwarding to openSession The napi binding moved initialCatalog/initialSchema/sessionConfig from ExecuteOptions onto ConnectionOptions (matching pyo3) because the kernel does not read StatementSpec::statement_conf — they were silently no-op'd in the per-statement path. Adapter follows. - SeaAuth.ts: extend SeaNativeConnectionOptions with optional catalog / schema / sessionConf (intersection with each auth-mode variant). New SeaSessionDefaults interface for the shared shape. - SeaBackend.ts::openSession: fold OpenSessionRequest.initialCatalog / initialSchema / configuration into the napi options before the openSession call. Drop the SeaSessionBackend.defaults forwarding. - SeaNativeLoader.ts: drop SeaExecuteOptions; executeStatement now takes only the SQL. - SeaSessionBackend.ts: drop SeaSessionDefaults and defaults field; drop per-statement overlay logic. useCloudFetch becomes a no-op on the SEA path (kernel hardcodes disposition to INLINE_OR_EXTERNAL_LINKS; ResultConfig exposure is M1 work). - tests: replace per-statement-forwarding assertions with openSession-arg assertions. 23/23 SEA execution tests pass (143/143 across the SEA suite). Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
1a9ddd9 to
1f914fd
Compare
0eb08b8 to
bafd8b9
Compare
… IOperationBackend (#378) * sea-abstraction: introduce IBackend / ISessionBackend / IOperationBackend Refactors DBSQLClient/Session/Operation to dispatch through three backend interfaces. ThriftBackend (lib/thrift-backend/) contains the relocated existing thrift logic. SeaBackend (lib/sea/) is a stub for M0; the sea-napi-binding feature wires the real impl. Public surface (lib/index.ts) unchanged. No new dependencies. All existing tests pass. Files: - lib/contracts/IBackend.ts (new) - lib/contracts/ISessionBackend.ts (new) - lib/contracts/IOperationBackend.ts (new) - lib/contracts/IDBSQLClient.ts (adds useSEA?: boolean to ConnectionOptions) - lib/thrift-backend/ThriftBackend.ts (new) - lib/thrift-backend/ThriftSessionBackend.ts (new) - lib/thrift-backend/ThriftOperationBackend.ts (new) - lib/sea/SeaBackend.ts (new, M0 stub) - lib/DBSQLClient.ts (dispatch through IBackend; useSEA picks SeaBackend) - lib/DBSQLSession.ts (facade over ISessionBackend; staging stays here) - lib/DBSQLOperation.ts (facade over IOperationBackend; iterators/fetchAll stay here) - tests/unit/DBSQLClient.test.ts (retarget internal state lookup through backend; pre-seed client.backend in tests that bypass connect()) - tests/unit/DBSQLOperation.test.ts (retarget internal state lookup through backend) * sea-abstraction: cleanup — restore JSDoc, dedupe test pre-seed, fix inline type Addresses code-bloat-watchdog findings from commit 0085928: - Restores public-API JSDoc on DBSQLSession + DBSQLOperation methods (was deleted as scope creep; contracts unchanged so docs still apply) - Adds makeStubbedClient() helper to tests/unit/DBSQLClient.test.ts; replaces 14× duplicated ThriftBackend pre-seed - Imports WaitUntilReadyOptions instead of inline option types in IOperationBackend + DBSQLOperation.waitUntilReady * sea-abstraction: address full-review findings (F1-F17 except F5) Round-N fixes from the 9-reviewer pre-review. Public IOperation/DBSQLOperation surface preserved byte-identical; backend interfaces (IBackend / ISessionBackend / IOperationBackend) made fully neutral so both Thrift and SEA can implement the same contract. F1 — neutral DTOs at IOperationBackend with Thrift-shape preservation on the public facade (adapter pattern): - lib/contracts/OperationStatus.ts (new) — neutral OperationStatus + OperationState enum mirroring databricks-sql-python's CommandState and kernel pyo3's StatementStatus taxonomy. - lib/contracts/ResultMetadata.ts (new) — neutral ResultMetadata + ResultFormat enum mirroring the three TSparkRowSetType cases. - IOperationBackend.status()/getResultMetadata() return the neutral DTOs. - ThriftOperationBackend.status() adapts at the boundary via adaptOperationStatus / adaptResultMetadata; module-level helpers thriftStateToOperationState and thriftRowSetTypeToResultFormat do the enum maps. - ThriftOperationBackend exposes thriftStatusResponse() and thriftResultMetadataResponse() as public Thrift-only accessors used by the facade's zero-loss fast path (kept for internal state-machine + result-handler dispatch as well). - lib/utils/thriftWireSynthesis.ts (new) — synthesizeThriftStatus and synthesizeThriftResultSetMetadata: convert neutral DTOs back to Thrift wire shape for the non-Thrift backend path. Lossy on Thrift-only fields (taskStatus, numModifiedRows, cacheLookupResult, etc.). - DBSQLOperation.status() and getMetadata() preserved Thrift return shape: Thrift backend path returns the real wire response (zero loss); non-Thrift backend path synthesizes via the new helpers. - DBSQLOperation.getResultMetadata() — new additive neutral accessor on IOperation; DBSQLSession.handleStagingOperation uses it instead of the deprecated Thrift-shaped getMetadata(). F2 — IBackend.connect() is now zero-arg. Backend reads everything it needs from IClientContext / constructor; matches Python connector's pattern of passing session_configuration via constructor not method-arg. F3 — Restore the 'Server protocol version' debug log dropped by the original PR-378 refactor. Re-added to ThriftSessionBackend.constructor with the LogLevel.debug + IClientContext.getLogger() pattern; matches the pre-refactor log site at main:lib/DBSQLSession.ts:175. F4 + F11 + F14 — SeaBackend stub safety: - close() is a no-op so DBSQLClient.close()'s state-clearing block can finish even after a useSEA: true connect() failure. - connect() and openSession() throw HiveDriverError instead of generic Error, matching the rest of the codebase. - connect(options: ConnectionOptions) and openSession(request: OpenSessionRequest) declare their parameters (with @typescript-eslint/no-unused-vars disable) so IDE autocomplete prompts the M1 SEA implementer. F6 + F7 + F9 + F10 — JSDoc on backend interfaces: - IBackend: connect/openSession/close docstrings; close() doc explicitly states transport-layer cleanup is owned by DBSQLClient. - ISessionBackend: copy IDBSQLSession's per-method one-liner JSDoc. - IOperationBackend: doc hasResultSet (readonly external; mutates internally), waitUntilReady (MUST throw OperationStateError on terminal non-success). F8 — tests/unit/sea/SeaBackend.test.ts (new) locks in the stub contract: connect() rejects HiveDriverError, openSession() rejects HiveDriverError, close() resolves no-op. ~30 LOC. F12 — Drop legacy { handle, ... } ctor branch from DBSQLOperation and DBSQLSession: - Facades accept only { backend, context }. - DBSQLSession no longer imports ThriftSessionBackend at all. - DBSQLOperation imports ThriftOperationBackend solely for the F1 typed downcast (zero-loss Thrift fast path); this is a deliberate, scoped coupling tied to the back-compat decision. - tests/unit/.stubs/createSessionForTest.ts and createOperationForTest.ts (new) wrap the legacy shape; all 48 + 54 test sites mechanically migrated. F15 — ThriftOperationBackend.waitUntilReady uses imported WaitUntilReadyOptions type instead of an anonymous inline shape. F16 — useSEA flag moved out of public ConnectionOptions: - Removed useSEA?: boolean from the exported lib/contracts/IDBSQLClient.ts ConnectionOptions; no longer ships in the public .d.ts. - lib/contracts/InternalConnectionOptions.ts (new) declares the flag as a non-exported internal extension; DBSQLClient.connect() reads via a typed cast. Mirrors Python's kwargs.get('use_sea', False) pattern at databricks-sql-python/src/databricks/sql/session.py:111. F17 — Missing return; after case 'timeout' in forwardConnectionEvent so a future fifth case doesn't silently fall through. The trailing return; in the last case triggers no-useless-return — quieted with a localized eslint-disable-next-line + intent comment. F5 — deferred per owner instruction (test-only as any cast tightening). Verification: - yarn lint clean (3 pre-existing warnings in tests/e2e/protocol_versions.test.ts). - yarn build clean. - tsc --noEmit -p tsconfig.json clean (apart from pre-existing examples/tokenFederation/* import errors that exist on main). - Runtime smoke test of SeaBackend stub + Thrift-wire synthesis round-trip passes 5/5 assertions. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com> * sea-abstraction: address PR #378 review-comment fixes (H1 / M1-M4 / L1-L10) Addresses 15 review findings from the code-review-squad pass on PR #378. L11 (backend kind field on the three interfaces) is deliberately deferred to avoid a cross-stack cascade ripple while the downstream PRs are still in flight. H1 — fetchChunk lost mid-flight failIfClosed regression. Add optional `isClosed?: () => boolean` to IOperationBackend.fetchChunk's options bag. ThriftOperationBackend.fetchChunk probes it after the setTimeout(0) macrotask yield and returns [] when set; the facade's post-fetch failIfClosed then raises the user-visible OperationStateError. Restores the guard that the refactor split across the facade/backend boundary so a cancel/close arriving during the yield window no longer runs the data RPC to completion needlessly. M1 — neutralize WaitUntilReadyOptions callback shape. Introduce IOperationBackendWaitOptions { callback?: (status: OperationStatus) => unknown } on the backend interface. Facade keeps the public Thrift-typed OperationStatusCallback and adapts at the boundary by wrapping the user's callback with synthesizeThriftStatus. ThriftOperationBackend.waitUntilReady consumes the neutral options and passes adaptOperationStatus(response) to the callback. M2 — synthesizeOkStatus maps OperationState to TStatusCode. Add synthesizeStatusFromOperation that returns ERROR_STATUS for Failed/Cancelled/Closed (carrying errorMessage + sqlState) and SUCCESS_STATUS otherwise. Wire it into synthesizeThriftStatus so legacy Status.assert(resp.status) sees the right code on non-Thrift backends. M3 — TelemetryEvent + DriverConfiguration carry a backend tag. Add optional backend?: 'thrift' | 'sea' | 'kernel' on both interfaces so dashboards can slice latency/error rate by backend without a metrics-schema migration once non-Thrift emission goes live. M4 — test coverage for the synthesize helpers + useSEA failure path. New tests/unit/thrift-backend/wireSynthesis.test.ts covering all OperationState/ResultFormat mappings, ERROR_STATUS carries errorMessage/sqlState, hasResultSet round-trip, schema/arrowSchema/ lz4Compressed/isStagingOperation preservation, and the L3 throw on unknown ResultFormat. New test in DBSQLClient.test.ts asserts that a useSEA:true connect failure leaves this.backend === undefined and the next openSession() surfaces "not connected" rather than the SeaBackend's "not implemented" error. L1 — forwardConnectionEvent normalizes payload to Error. Replace `payload as Error` with `payload instanceof Error ? payload : new Error(String(payload))` so a backend that emits a non-Error through the cross-backend onConnectionEvent doesn't crash the logger.log call. L2 — DBSQLClient.connect publishes this.backend only on success. Construct the backend locally, await connect() in a try/catch, run a best-effort backend.close() (per IBackend.close()'s safe-on-partial-init contract) and rethrow on failure. Only assign this.backend after a clean connect so a failed connect surfaces "DBSQLClient: not connected" on the next openSession. L3 — resultFormatToThrift throws on unknown ResultFormat. Replace the silent default fallback to COLUMN_BASED_SET with a HiveDriverError. Prevents a future ResultFormat enum extension from silently routing results through JsonResultHandler and surfacing garbled rows. L4 — DBSQLOperation.getMetadata carries @deprecated. Adds the canonical TypeScript JSDoc tag so IDEs (strikethrough), tsc, ESLint plugins, and agentic codegen pick up the soft deprecation in favour of getResultMetadata. L5 — numberToInt64 re-export carries @deprecated. Re-export through a named const with a JSDoc block (rather than a bare `export { ... } from`) so the @deprecated tag attaches to the symbol consumers see in their IDE / .d.ts. L6 — DBSQLSession.runBackend helper. Collapse 11 duplicated `failIfClosed → backend.X → failIfClosed` brackets into a single private runBackend<T>(fn) so the open-flag-before-and-after contract has a name and can't be forgotten in a new delegation method. L7 — restore three why-comments deleted from DBSQLSession. Staging-detection invariant in executeStatement, AWS-vs-Azure 404 difference on staging-remove, and the Content-Length-required note on staging-upload. Verbatim from main; these document non-obvious intentional behaviour the refactor inadvertently dropped. L8 — hasResultSet becomes a method on IOperationBackend. The value is state-dependent (the Thrift impl mutates the underlying operation handle inside processOperationStatusResponse), so the property+readonly+disclaimer-JSDoc pattern was misleading. Method form makes the live-read semantics obvious to a fresh implementer. 3 facade call sites updated. L9 — wireSynthesis moves under thrift-backend. The file imports Thrift IDL types and produces Thrift-typed values; it belongs next to ThriftOperationBackend, not in the neutral lib/utils/ tree where it would creep into the dependency cone of future backend-neutral helpers. Same reasoning that placed numberToInt64 and getDirectResultsOptions under thrift-backend/. L10 — interface-level downcast policy. Add a JSDoc paragraph on IOperationBackend grandfathering the two existing `instanceof ThriftOperationBackend` downcasts in DBSQLOperation.status/getMetadata and prohibiting new ones. Future zero-loss back-compat needs should extend the interface (or add an optional method) rather than spawn a per-backend branch matrix. Gates: yarn build (exit 0), yarn lint (0 errors, 3 pre-existing warnings in tests/e2e/protocol_versions.test.ts), yarn test on touched files (163 passing, +12 net new tests from M4 work; 2 failures pre- existing on PR head unchanged: getSchema-directResults and the LZ4-cloud-fetch flag — both flagged in the team-lead playbook as known prior regressions). Cascade implications for downstream PRs (#380 #377 #379 #382 #381 #384 #383): L8 converts hasResultSet from a property to a method, M1 swaps WaitUntilReadyOptions for IOperationBackendWaitOptions on the backend interface. Both are mechanical renames at downstream backend impls when they rebase. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com> * Fix SEA abstraction merge fallout Restore Thrift compatibility paths needed by existing schema and result-handler tests after merging main telemetry changes. Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com> * Restore Thrift result-handler compatibility hooks Keep existing e2e-only inspection hooks available through the facade while the new backend abstraction owns result handling. Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com> --------- Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
Stack
Linear stack of 8 PRs landing the M0 + M1 Phase 1 SEA NodeJS work. Merge in order from base ↑ to tip. The tip branch (
msrathore/sea-auth-u2m, PR #383) is the single snapshot containing everything in flight — point your test or benchmark harness at it for an end-to-end check.sea-abstractionsea-napi-binding.nodeartifactsea-errors-loggingErrorCode→ JS error-class mapping (M0 minimum)sea-authuseSEA: truesea-executionexecuteStatement+openSession(sessionConfig, initialCatalog/Schema)sea-resultssea-operationsea-auth-u2m← TIPCompanion kernel stack (
databricks/databricks-sql-kernel): 8 PRs — root #26 (async-public-api) → #27 → #25 → #29 → #28 → #30 → #24 → #23 (tip).Policy: new PRs always stack on the current tip. No sibling/parallel topology. No force-pushes on existing PRs unless absolutely necessary; if a PR's content is wrong, add a fix-up commit on top of the stack tip rather than rewriting history.
This PR is position 8/8 — TIP OF STACK. Contains everything in the M0 + M1 Phase 1 OAuth work. Single snapshot branch for end-to-end testing + benchmarking.
Summary
M1 Phase 1 OAuth M2M + OAuth U2M. Built on the cumulative M0 linear stack (#378–#384).
Size note (1902 LOC, over the 800 cap) — review commit-by-commit
This PR is large because it carries 5 rounds of adversarial + language review as accumulated commits. The commits ARE the audit trail; squashing or splitting would lose the round-by-round review record. Reviewers should walk it commit-by-commit:
f1ddad1— Round 1 base: OAuth M2M + U2M wiring throughSeaBackend→ napi → kernele244dec— Round 1 M2M review fixup: sharedfakeBindinghelper, doc + loader cleanup3809749— Round 1 review: HIGH error-mapping wiring + 7 mediums3aa6e04— Round 2 fixup: wrap close() indecodeNapiKernelError, raise on U2M+id, case-insensitive validation, preserve all envelope fieldsa317c10— Round 3 fixup: NF-N1 namespace kernel metadata, dedupe predicate, type-guard envelope, treat blank secret as U2M2e57346— Rewire M2M e2e to AAD SP on pecotesting HTTP_PATH2d344901— Round 4 fixup: restore M2M-with-bad-secret class, strip envelope sentinel, trim RetryError docd311718— Round 5 fixup: JSDoc selector contract, defense-in-depth test, message mutation safety, class-pin simplificatione7c979d— Post-integration rebase reconciliation: API-shear fix bridging OAuth code to post-M0SeaBackendOptions/IClientContextRound 5 closed ZERO HIGH findings from two independent fresh reviewers (devils-advocate + language-expert). Full review trail in
decisions.md:D-014..D-019.Verification at TIP
AuthenticationError)Companion kernel PR
#28 kernel-oauth-error-class reclassifies OAuth token-exchange failures as
Error::unauthenticated()— required for the bad-secret e2e to raiseAuthenticationError.Draft until you give the go for review.
Tracking
Co-authored-by: Isaac