Skip to content

Commit fab2288

Browse files
authored
[XTA-15079] Add SPOG ?o= routing support and fix final-flush auth on close (#391)
* Add SPOG support: x-databricks-org-id header for telemetry + feature-flag SPOG (Single Panel of Glass) replaces workspace-specific hostnames with account-level vanity URLs. When httpPath carries `?o=<workspaceId>`, endpoints that don't include the workspace in their URL path (telemetry, feature flags) need the workspace conveyed via the `x-databricks-org-id` header instead. Changes: - Parse `?o=<digits>` out of httpPath in DBSQLClient.connect() and stash the org-id as `x-databricks-org-id` on a new `ClientConfig.customHeaders` field. A user-supplied `customHeaders` entry (case-insensitive) takes precedence. - DatabricksTelemetryExporter spreads `config.customHeaders` into the outgoing POST headers. Auth headers still win on collision. - FeatureFlagCache does the same for the feature-flag GET. Not applicable to this driver (vs JDBC port in databricks/databricks-jdbc#1316): - httpPath property parser fix — Node.js passes `options.path` through unmodified. - Warehouse ID regex fix for SEA — driver uses Thrift only. - DBFS Volume header injection — driver exposes no Volume API. OAuth/OIDC token requests deliberately do NOT receive customHeaders. Co-authored-by: Isaac * Fix close-ordering: keep context registered through final telemetry flush `TelemetryClientProvider.releaseClient` was calling `unregisterContext` before `client.close()`. On the last refcount, that emptied the FIFO of `(context, authProvider)` pairs before the final flush ran, so the exporter's `getAuthProvider()` walk returned undefined and the batch was dropped with "Telemetry: Authorization header missing — metrics will be dropped". Defer `unregisterContext` until after `close()` on the last refcount; multi-refcount path is unchanged (immediate unregister so surviving consumers don't reach into a closing context). Verified end-to-end against a real SPOG host: the final flush now exports its 3 metrics (connection.open, statement.start, statement.complete) instead of warning and dropping them. Co-authored-by: Isaac * Simplify close-ordering fix Tighten the previous commit to the minimal diff: guard the existing unregisterContext call instead of restructuring the function. Restores the unchanged comment on TelemetryClient.unregisterContext. Co-authored-by: Isaac * Address review feedback: pure header builder, SPOG logs, all-purpose path form - buildCustomHeaders(httpPath, userHeaders) and extractWorkspaceId(httpPath): SPOG-routing inputs now in the signature instead of read off this.httpPath. extractWorkspaceId is static. Drops the (client as any).httpPath test workaround. Fixes the hidden ordering dependency flagged in PR review [M1]. - Log SPOG header injection (debug), caller-supplied override (debug), and non-numeric workspace ID (warn). Closes JDBC parity gap so oncall has a grep handle when SPOG routing misbehaves [H2]. - extractWorkspaceId now also matches the all-purpose cluster path form /o/<wsId>/<cluster-id> in addition to the warehouse query form ?o=<wsId>. Verified end-to-end against a real all-purpose cluster on a SPOG host. Co-authored-by: Isaac Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com> * Apply prettier formatting Auto-fix from `prettier --write` on the two files touched by the previous commit; no logic changes. Co-authored-by: Isaac Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com> --------- Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
1 parent 7e04c7c commit fab2288

10 files changed

Lines changed: 366 additions & 28 deletions

lib/DBSQLClient.ts

Lines changed: 82 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -298,9 +298,9 @@ export default class DBSQLClient extends EventEmitter implements IDBSQLClient, I
298298
/**
299299
* Extract the numeric workspace ID for telemetry.
300300
*
301-
* The only reliable carrier in the connection params today is the `?o=N`
302-
* query parameter on `httpPath` — Databricks SQL warehouses are typically
303-
* connected to via paths like `/sql/1.0/warehouses/<id>?o=12345678901234`.
301+
* Two URL shapes carry the workspace ID today:
302+
* - Warehouse, query form: `/sql/1.0/warehouses/<id>?o=<wsId>`
303+
* - All-purpose cluster, path form: `sql/protocolv1/o/<wsId>/<cluster-id>`
304304
*
305305
* Host-based extraction was tried previously but produced confidently-wrong
306306
* values:
@@ -313,21 +313,84 @@ export default class DBSQLClient extends EventEmitter implements IDBSQLClient, I
313313
* Returns `undefined` when no workspace ID can be derived. Server-side
314314
* attribution is better off seeing a missing field than a wrong value.
315315
*/
316-
private extractWorkspaceId(): string | undefined {
317-
const { httpPath } = this;
316+
private static extractWorkspaceId(httpPath: string | undefined): string | undefined {
318317
if (!httpPath) {
319318
return undefined;
320319
}
321320
const queryIdx = httpPath.indexOf('?');
322-
if (queryIdx < 0) {
323-
return undefined;
321+
// Warehouse form: `?o=<digits>` in the query string.
322+
if (queryIdx >= 0) {
323+
const query = httpPath.slice(queryIdx + 1);
324+
// Match `o=<digits>` as the first param, an inner `&o=<digits>`, etc.
325+
// Workspace IDs are decimal integers; reject anything else so a stray
326+
// `o=tenant_42` doesn't ship as a workspace ID.
327+
const queryMatch = query.match(/(?:^|&)o=(\d+)(?:&|$)/);
328+
if (queryMatch) {
329+
return queryMatch[1];
330+
}
331+
}
332+
// All-purpose cluster form: `/o/<digits>/<cluster-id>` as a path segment.
333+
const pathOnly = queryIdx >= 0 ? httpPath.slice(0, queryIdx) : httpPath;
334+
const pathMatch = pathOnly.match(/(?:^|\/)o\/(\d+)(?:\/|$)/);
335+
return pathMatch ? pathMatch[1] : undefined;
336+
}
337+
338+
// Detects an `o=<value>` or `/o/<value>` where `<value>` is present but
339+
// non-numeric, so the caller can warn instead of silently dropping a
340+
// malformed workspace param.
341+
private static hasMalformedOrgParam(httpPath: string | undefined): boolean {
342+
if (!httpPath) {
343+
return false;
344+
}
345+
const queryIdx = httpPath.indexOf('?');
346+
if (queryIdx >= 0) {
347+
const query = httpPath.slice(queryIdx + 1);
348+
const hasOrg = /(?:^|&)o=/.test(query);
349+
const hasNumericOrg = /(?:^|&)o=\d+(?:&|$)/.test(query);
350+
if (hasOrg && !hasNumericOrg) {
351+
return true;
352+
}
324353
}
325-
const query = httpPath.slice(queryIdx + 1);
326-
// Match `o=<digits>` as the first param, an inner `&o=<digits>`, etc.
327-
// Workspace IDs are decimal integers; reject anything else so a stray
328-
// `o=tenant_42` doesn't ship as a workspace ID.
329-
const match = query.match(/(?:^|&)o=(\d+)(?:&|$)/);
330-
return match ? match[1] : undefined;
354+
const pathOnly = queryIdx >= 0 ? httpPath.slice(0, queryIdx) : httpPath;
355+
const hasPathOrg = /(?:^|\/)o\/[^/]+/.test(pathOnly);
356+
const hasNumericPathOrg = /(?:^|\/)o\/\d+(?:\/|$)/.test(pathOnly);
357+
return hasPathOrg && !hasNumericPathOrg;
358+
}
359+
360+
/**
361+
* Build the customHeaders map applied to telemetry POSTs and feature-flag
362+
* GETs (SPOG / Single Panel of Glass support). When `httpPath` carries a
363+
* workspace ID — either as a `?o=<wsId>` query (warehouse) or a
364+
* `/o/<wsId>/<cluster-id>` path segment (all-purpose cluster) — endpoints
365+
* that don't include the workspace in their URL path need it conveyed via
366+
* the `x-databricks-org-id` header instead. A user-supplied value in
367+
* `userHeaders` (case-insensitively keyed) wins over the parsed value.
368+
*
369+
* `httpPath` is passed explicitly (rather than read off `this.httpPath`) so
370+
* the SPOG-routing dependency is visible in the signature — a future
371+
* refactor that reorders connect() can't silently break injection.
372+
*/
373+
private buildCustomHeaders(
374+
httpPath: string | undefined,
375+
userHeaders: Record<string, string> | undefined,
376+
): Record<string, string> | undefined {
377+
const merged: Record<string, string> = { ...(userHeaders ?? {}) };
378+
const hasOrgIdAlready = Object.keys(merged).some((k) => k.toLowerCase() === 'x-databricks-org-id');
379+
if (hasOrgIdAlready) {
380+
this.logger.log(LogLevel.debug, 'SPOG: x-databricks-org-id supplied by caller; not extracting from httpPath');
381+
} else {
382+
const orgId = DBSQLClient.extractWorkspaceId(httpPath);
383+
if (orgId) {
384+
merged['x-databricks-org-id'] = orgId;
385+
this.logger.log(LogLevel.debug, `SPOG: injecting x-databricks-org-id=${orgId} (extracted from httpPath)`);
386+
} else if (DBSQLClient.hasMalformedOrgParam(httpPath)) {
387+
this.logger.log(
388+
LogLevel.warn,
389+
'SPOG: httpPath contains non-numeric workspace ID; x-databricks-org-id not injected',
390+
);
391+
}
392+
}
393+
return Object.keys(merged).length > 0 ? merged : undefined;
331394
}
332395

333396
/**
@@ -561,6 +624,11 @@ export default class DBSQLClient extends EventEmitter implements IDBSQLClient, I
561624
this.config.userAgentEntry = options.userAgentEntry;
562625
}
563626

627+
// SPOG: parse `?o=<workspaceId>` out of httpPath and stash it as
628+
// `x-databricks-org-id` for the telemetry + feature-flag clients, which
629+
// hit endpoints that don't carry the workspace in their URL path.
630+
this.config.customHeaders = this.buildCustomHeaders(options.path, options.customHeaders);
631+
564632
this.authProvider = this.createAuthProvider(options, authProvider);
565633

566634
this.connectionProvider = this.createConnectionProvider(options);
@@ -677,7 +745,7 @@ export default class DBSQLClient extends EventEmitter implements IDBSQLClient, I
677745
safeEmit(this, (emitter) => {
678746
if (!this.host) return;
679747
const latencyMs = Date.now() - startTime;
680-
const workspaceId = this.extractWorkspaceId();
748+
const workspaceId = DBSQLClient.extractWorkspaceId(this.httpPath);
681749
const driverConfig = this.driverConfigShipped ? undefined : this.buildDriverConfiguration();
682750
if (driverConfig) {
683751
this.driverConfigShipped = true;

lib/contracts/IClientContext.ts

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -52,6 +52,17 @@ export interface ClientConfig {
5252
*/
5353
telemetryFlushOnExit?: boolean;
5454
userAgentEntry?: string;
55+
56+
/**
57+
* Extra HTTP headers attached to driver-owned out-of-band requests
58+
* (telemetry, feature flags). Populated by `DBSQLClient.connect()` from
59+
* `ConnectionOptions.customHeaders` plus an `x-databricks-org-id` header
60+
* derived from the `?o=` query parameter on `httpPath` when present, to
61+
* support SPOG (Single Panel of Glass) account-level routing on endpoints
62+
* that don't carry `?o=` in their URL path. NOT applied to Thrift or
63+
* OAuth/OIDC requests.
64+
*/
65+
customHeaders?: Record<string, string>;
5566
}
5667

5768
export default interface IClientContext {

lib/contracts/IDBSQLClient.ts

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,17 @@ export type ConnectionOptions = {
5555
proxy?: ProxyOptions;
5656
enableMetricViewMetadata?: boolean;
5757

58+
/**
59+
* Extra HTTP headers attached to driver-owned out-of-band requests
60+
* (telemetry POSTs and feature-flag GETs). Not applied to the primary
61+
* Thrift transport or to OAuth/OIDC token requests.
62+
*
63+
* When `path` contains `?o=<workspaceId>` (SPOG account-level routing),
64+
* the driver automatically injects an `x-databricks-org-id` header unless
65+
* one is already present in this map.
66+
*/
67+
customHeaders?: Record<string, string>;
68+
5869
/**
5970
* Whether the driver emits telemetry events (connection / statement /
6071
* cloud-fetch / error). Defaults to `true`.

lib/telemetry/DatabricksTelemetryExporter.ts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -249,6 +249,7 @@ export default class DatabricksTelemetryExporter {
249249
let headers: Record<string, string> = {
250250
'Content-Type': 'application/json',
251251
'User-Agent': userAgent,
252+
...(config.customHeaders ?? {}),
252253
};
253254

254255
if (authenticatedExport) {

lib/telemetry/FeatureFlagCache.ts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -140,6 +140,7 @@ export default class FeatureFlagCache {
140140
const headers: Record<string, string> = {
141141
'Content-Type': 'application/json',
142142
'User-Agent': this.userAgent,
143+
...(this.context.getConfig().customHeaders ?? {}),
143144
...(await this.getAuthHeaders()),
144145
};
145146

lib/telemetry/TelemetryClientProvider.ts

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -133,7 +133,11 @@ class TelemetryClientProvider {
133133
return;
134134
}
135135

136-
holder.client.unregisterContext(context);
136+
// Skip unregister on the last release so close()'s final flush can still
137+
// resolve auth/connection providers from the FIFO snapshot.
138+
if (holder.refCount > 1) {
139+
holder.client.unregisterContext(context);
140+
}
137141
holder.refCount -= 1;
138142
logger.log(LogLevel.debug, `TelemetryClient reference count for ${host}: ${holder.refCount}`);
139143

tests/unit/DBSQLClient.test.ts

Lines changed: 132 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -103,6 +103,18 @@ describe('DBSQLClient.connect', () => {
103103

104104
logSpy.restore();
105105
});
106+
107+
it('populates config.customHeaders with org-id parsed from ?o= (SPOG)', async () => {
108+
const client = new DBSQLClient();
109+
await client.connect({ ...connectOptions, path: '/sql/1.0/warehouses/abc?o=12345678901234' });
110+
expect(client.getConfig().customHeaders).to.deep.equal({ 'x-databricks-org-id': '12345678901234' });
111+
});
112+
113+
it('leaves config.customHeaders undefined when path has no ?o= and none supplied', async () => {
114+
const client = new DBSQLClient();
115+
await client.connect({ ...connectOptions, path: '/sql/1.0/warehouses/abc' });
116+
expect(client.getConfig().customHeaders).to.be.undefined;
117+
});
106118
});
107119

108120
describe('DBSQLClient.openSession', () => {
@@ -755,33 +767,140 @@ describe('DBSQLClient telemetry paths', () => {
755767

756768
describe('extractWorkspaceId', () => {
757769
it('returns the numeric o= param from httpPath', () => {
758-
const client = new DBSQLClient();
759-
(client as any).httpPath = '/sql/1.0/warehouses/abc?o=12345678901234';
760-
const id = (client as any).extractWorkspaceId();
770+
const id = (DBSQLClient as any).extractWorkspaceId('/sql/1.0/warehouses/abc?o=12345678901234');
761771
expect(id).to.equal('12345678901234');
762772
});
763773

764774
it('returns undefined when no query string', () => {
765-
const client = new DBSQLClient();
766-
(client as any).httpPath = '/sql/1.0/warehouses/abc';
767-
expect((client as any).extractWorkspaceId()).to.be.undefined;
775+
expect((DBSQLClient as any).extractWorkspaceId('/sql/1.0/warehouses/abc')).to.be.undefined;
768776
});
769777

770778
it('returns undefined when o= is not numeric', () => {
771-
const client = new DBSQLClient();
772-
(client as any).httpPath = '/sql/1.0/warehouses/abc?o=tenant_xyz';
773-
expect((client as any).extractWorkspaceId()).to.be.undefined;
779+
expect((DBSQLClient as any).extractWorkspaceId('/sql/1.0/warehouses/abc?o=tenant_xyz')).to.be.undefined;
774780
});
775781

776782
it('handles o= as a non-first param', () => {
777-
const client = new DBSQLClient();
778-
(client as any).httpPath = '/sql/1.0/warehouses/abc?foo=bar&o=42&baz=qux';
779-
expect((client as any).extractWorkspaceId()).to.equal('42');
783+
expect((DBSQLClient as any).extractWorkspaceId('/sql/1.0/warehouses/abc?foo=bar&o=42&baz=qux')).to.equal('42');
780784
});
781785

782786
it('returns undefined when httpPath is unset', () => {
787+
expect((DBSQLClient as any).extractWorkspaceId(undefined)).to.be.undefined;
788+
});
789+
790+
it('returns the numeric workspace id from all-purpose cluster path form', () => {
791+
expect((DBSQLClient as any).extractWorkspaceId('sql/protocolv1/o/99999999999999/0101-000000-aaaaaaaa')).to.equal(
792+
'99999999999999',
793+
);
794+
});
795+
796+
it('returns the numeric workspace id from all-purpose cluster path with leading slash', () => {
797+
expect((DBSQLClient as any).extractWorkspaceId('/sql/protocolv1/o/12345/0101-000000-aaaaaaaa')).to.equal('12345');
798+
});
799+
800+
it('returns undefined when all-purpose cluster path has non-numeric workspace segment', () => {
801+
expect((DBSQLClient as any).extractWorkspaceId('sql/protocolv1/o/tenant_xyz/0101-000000-aaaaaaaa')).to.be
802+
.undefined;
803+
});
804+
805+
it('prefers ?o= query form over /o/ path form when both are present', () => {
806+
expect((DBSQLClient as any).extractWorkspaceId('sql/protocolv1/o/111/cluster?o=222')).to.equal('222');
807+
});
808+
});
809+
810+
describe('buildCustomHeaders (SPOG)', () => {
811+
it('injects x-databricks-org-id from ?o= in httpPath', () => {
783812
const client = new DBSQLClient();
784-
expect((client as any).extractWorkspaceId()).to.be.undefined;
813+
const headers = (client as any).buildCustomHeaders('/sql/1.0/warehouses/abc?o=12345678901234', undefined);
814+
expect(headers).to.deep.equal({ 'x-databricks-org-id': '12345678901234' });
815+
});
816+
817+
it('returns undefined when no ?o= and no user-supplied customHeaders', () => {
818+
const client = new DBSQLClient();
819+
const headers = (client as any).buildCustomHeaders('/sql/1.0/warehouses/abc', undefined);
820+
expect(headers).to.be.undefined;
821+
});
822+
823+
it('preserves user-supplied customHeaders alongside parsed org-id', () => {
824+
const client = new DBSQLClient();
825+
const headers = (client as any).buildCustomHeaders('/sql/1.0/warehouses/abc?o=42', { 'x-trace-id': 'tid-001' });
826+
expect(headers).to.deep.equal({ 'x-trace-id': 'tid-001', 'x-databricks-org-id': '42' });
827+
});
828+
829+
it('user-supplied x-databricks-org-id wins over ?o= parsed value (case-insensitive)', () => {
830+
const client = new DBSQLClient();
831+
const headers = (client as any).buildCustomHeaders('/sql/1.0/warehouses/abc?o=42', {
832+
'X-Databricks-Org-Id': '999',
833+
});
834+
expect(headers).to.deep.equal({ 'X-Databricks-Org-Id': '999' });
835+
});
836+
837+
it('does not inject org-id when ?o= value is non-numeric', () => {
838+
const client = new DBSQLClient();
839+
const headers = (client as any).buildCustomHeaders('/sql/1.0/warehouses/abc?o=tenant_xyz', undefined);
840+
expect(headers).to.be.undefined;
841+
});
842+
843+
it('injects x-databricks-org-id from all-purpose cluster path form', () => {
844+
const client = new DBSQLClient();
845+
const headers = (client as any).buildCustomHeaders(
846+
'sql/protocolv1/o/99999999999999/0101-000000-aaaaaaaa',
847+
undefined,
848+
);
849+
expect(headers).to.deep.equal({ 'x-databricks-org-id': '99999999999999' });
850+
});
851+
852+
it('logs a warning when workspace ID segment is non-numeric (path form)', () => {
853+
const client = new DBSQLClient();
854+
const logSpy = sinon.spy((client as any).logger, 'log');
855+
try {
856+
(client as any).buildCustomHeaders('sql/protocolv1/o/tenant_xyz/cluster', undefined);
857+
const warnCalls = logSpy.getCalls().filter((c) => c.args[0] === LogLevel.warn);
858+
expect(warnCalls).to.have.lengthOf(1);
859+
expect(warnCalls[0].args[1]).to.match(/non-numeric workspace ID/);
860+
} finally {
861+
logSpy.restore();
862+
}
863+
});
864+
865+
it('logs a warning when ?o= is present but non-numeric', () => {
866+
const client = new DBSQLClient();
867+
const logSpy = sinon.spy((client as any).logger, 'log');
868+
try {
869+
(client as any).buildCustomHeaders('/sql/1.0/warehouses/abc?o=tenant_xyz', undefined);
870+
const warnCalls = logSpy.getCalls().filter((c) => c.args[0] === LogLevel.warn);
871+
expect(warnCalls).to.have.lengthOf(1);
872+
expect(warnCalls[0].args[1]).to.match(/non-numeric workspace ID/);
873+
} finally {
874+
logSpy.restore();
875+
}
876+
});
877+
878+
it('logs a debug line when injecting org-id from httpPath', () => {
879+
const client = new DBSQLClient();
880+
const logSpy = sinon.spy((client as any).logger, 'log');
881+
try {
882+
(client as any).buildCustomHeaders('/sql/1.0/warehouses/abc?o=42', undefined);
883+
const injectLog = logSpy
884+
.getCalls()
885+
.find((c) => c.args[0] === LogLevel.debug && /injecting x-databricks-org-id=42/.test(String(c.args[1])));
886+
expect(injectLog, 'expected SPOG inject debug log').to.exist;
887+
} finally {
888+
logSpy.restore();
889+
}
890+
});
891+
892+
it('logs a debug line when caller supplies x-databricks-org-id', () => {
893+
const client = new DBSQLClient();
894+
const logSpy = sinon.spy((client as any).logger, 'log');
895+
try {
896+
(client as any).buildCustomHeaders('/sql/1.0/warehouses/abc?o=42', { 'x-databricks-org-id': '999' });
897+
const callerLog = logSpy
898+
.getCalls()
899+
.find((c) => c.args[0] === LogLevel.debug && /supplied by caller/.test(String(c.args[1])));
900+
expect(callerLog, 'expected SPOG caller-supplied debug log').to.exist;
901+
} finally {
902+
logSpy.restore();
903+
}
785904
});
786905
});
787906

0 commit comments

Comments
 (0)