Commit ca55f0c

docs: add telemetry design, test summary, and user guide
Rebases the docs PR onto main now that #327 has landed. Drops all code changes from the diff; keeps only the new docs (docs/TELEMETRY.md, spec/telemetry-design.md, spec/telemetry-test-completion-summary.md) and swaps README's telemetry section to a short pointer to docs/TELEMETRY.md.

Co-authored-by: Isaac
1 parent 5f1728a commit ca55f0c

4 files changed

Lines changed: 3947 additions & 82 deletions

README.md

Lines changed: 44 additions & 82 deletions
@@ -53,88 +53,50 @@ client
5353

5454
## Telemetry
5555

56-
Starting with version 1.13, the driver collects telemetry — connection,
57-
statement, and CloudFetch chunk metrics, plus error events with redacted
58-
stack traces — to help Databricks improve driver performance and
59-
reliability. **Telemetry is enabled by default and gated by a server-side
60-
feature flag**: events are emitted only when the workspace's feature flag
61-
is on. No SQL text, parameter values, or row data are ever included.
62-
63-
### What's collected
64-
65-
- Connection lifecycle (`CREATE_SESSION`, `DELETE_SESSION`) with latency.
66-
- Statement lifecycle (`STATEMENT_START`, `STATEMENT_COMPLETE`) with
67-
execution latency, operation type, and result format.
68-
- CloudFetch chunk timings and byte counts.
69-
- Error events with redacted stack traces (Bearer/JWT tokens, OAuth
70-
secrets, home-directory paths, and Databricks PATs are stripped before
71-
emission).
72-
73-
See `TelemetryEvent` and `TelemetryMetric` in the package exports for the
74-
exact payload shapes.
75-
76-
### Multi-tenant SaaS deployments — read this before enabling telemetry
77-
78-
The telemetry layer shares one per-host `TelemetryClient` across every
79-
`DBSQLClient` connected to the same Databricks workspace host. The
80-
authenticated export path uses the **first-registered** client's auth
81-
provider, User-Agent, and `telemetryAuthenticatedExport` value — these
82-
fields are snapshotted at the host singleton and are **not** per-tenant.
83-
84-
If you are operating a SaaS layer that fronts multiple tenants against the
85-
same Databricks workspace host with a shared driver process, telemetry from
86-
tenant B's queries can be POSTed under tenant A's auth headers, with
87-
tenant A's `userAgentEntry`. A tenant B that explicitly set
88-
`telemetryAuthenticatedExport: false` will still ride tenant A's
89-
authenticated pipeline.
90-
91-
> **Recommendation for multi-tenant deployments**: set
92-
> `telemetryEnabled: false` on all `DBSQLClient` instances, or partition
93-
> by Databricks workspace host so each tenant owns its own
94-
> `TelemetryClient`. Subsequent registrants with diverging auth/UA values
95-
> emit a warn-level log so the leak is at least visible.
96-
97-
### Opting out
98-
99-
Three independent ways to disable telemetry, in order of precedence:
100-
101-
1. **Environment variable** — set `DATABRICKS_TELEMETRY_DISABLED` to one
102-
of `1`, `true`, `yes`, or `on` (case-insensitive). Other values
103-
(empty, `0`, `false`, `off`, `no`) are ignored, leaving the runtime
104-
config in charge.
105-
2. **Programmatic** — pass `telemetryEnabled: false` to `connect()`:
106-
```javascript
107-
await client.connect({
108-
host,
109-
path,
110-
token,
111-
telemetryEnabled: false,
112-
});
113-
```
114-
3. **Server-side** — Databricks-managed feature flag; if disabled for
115-
your workspace, the driver does not emit telemetry regardless of
116-
client config.
117-
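The precedence described in the three opt-out routes above can be sketched as a small decision function. This is an illustration only: `envDisablesTelemetry` and `isTelemetryActive` are hypothetical helper names, not driver exports, and the sketch assumes the enabled-by-default behavior this section describes.

```javascript
// Illustrative sketch; these helpers are hypothetical and not part of the driver.
// Values that the section says switch telemetry off (compared case-insensitively).
const DISABLING_VALUES = new Set(['1', 'true', 'yes', 'on']);

// Returns true only when DATABRICKS_TELEMETRY_DISABLED holds a disabling value.
// Anything else (unset, empty, '0', 'false', 'off', 'no') is ignored.
function envDisablesTelemetry(env = process.env) {
  const raw = env.DATABRICKS_TELEMETRY_DISABLED;
  return typeof raw === 'string' && DISABLING_VALUES.has(raw.toLowerCase());
}

// Combines the three opt-out routes in the stated order of precedence.
// `serverFlagOn` stands in for the Databricks-managed feature flag (assumed input).
function isTelemetryActive({
  env = process.env,
  telemetryEnabled = true,
  serverFlagOn = false,
} = {}) {
  if (envDisablesTelemetry(env)) return false; // 1. environment variable
  if (telemetryEnabled === false) return false; // 2. programmatic option
  return serverFlagOn; // 3. server-side feature flag
}
```

Because every route can only disable telemetry, events are emitted only when none of the three opts out.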
118-
### Tuning
119-
120-
If you keep telemetry on, the following knobs are available on
121-
`ConnectionOptions` (see JSDoc on `IDBSQLClient.ts` for defaults and
122-
units):
123-
124-
- `telemetryAuthenticatedExport` — set to `false` to ship reduced
125-
payloads (no statement/session correlation IDs, generic User-Agent)
126-
via the unauthenticated endpoint.
127-
- `telemetryBatchSize`, `telemetryFlushIntervalMs`, `telemetryMaxRetries`
128-
— batching and retry tuning.
129-
- `telemetryCircuitBreakerThreshold`, `telemetryCircuitBreakerTimeout`
130-
circuit-breaker tuning for the export endpoint.
131-
- `telemetryCloseTimeoutMs` — bound on `await client.close()` waiting for
132-
the final flush.
133-
134-
> **Note for short-lived processes**: always `await client.close()`
135-
> before `process.exit(0)` so the final batch is flushed. Without an
136-
> explicit close, the periodic flush timer is `unref()`'d to avoid
137-
> holding the event loop open, so any unflushed events are dropped.
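To see why the explicit close matters, here is a minimal sketch of a buffered exporter whose flush timer is `unref()`'d. `BufferedExporter` is a hypothetical stand-in written for this illustration; it is not the driver's telemetry class, and the real implementation may differ.

```javascript
// Hypothetical stand-in for a batching exporter; not the driver's real class.
class BufferedExporter {
  constructor(flushIntervalMs, sink) {
    this.pending = [];
    this.sink = sink;
    this.timer = setInterval(() => this.flush(), flushIntervalMs);
    // unref() lets the process exit even though the timer is still scheduled,
    // so buffered events are lost unless something flushes them first.
    this.timer.unref();
  }

  emit(event) {
    this.pending.push(event);
  }

  flush() {
    if (this.pending.length === 0) return;
    // splice(0) empties the buffer and hands the removed events to the sink.
    this.sink(this.pending.splice(0));
  }

  // Explicit close: stop the timer and deliver whatever is still buffered.
  async close() {
    clearInterval(this.timer);
    this.flush();
  }
}

const delivered = [];
const exporter = new BufferedExporter(60000, (batch) => delivered.push(...batch));
exporter.emit({ name: 'STATEMENT_COMPLETE' });
// At this point nothing has been delivered; exiting now would drop the event.
exporter.close().then(() => {
  // The buffered event has been handed to the sink, so it is safe to exit.
});
```

The same ordering applies to the driver: `await client.close()` first, then exit.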
56+
The Databricks SQL Driver for Node.js includes an **opt-in telemetry system** that collects driver usage metrics and performance data to help improve the driver. Telemetry is **disabled by default** and follows a **privacy-first design**.
57+
58+
### Key Features
59+
60+
- **Privacy-first**: No SQL queries, results, or sensitive data is ever collected
61+
- **Opt-in**: Controlled by server-side feature flag (disabled by default)
62+
- **Non-blocking**: All telemetry operations are asynchronous and never impact your queries
63+
- **Resilient**: Circuit breaker protection prevents telemetry failures from affecting your application
64+
65+
### What Data is Collected?
66+
67+
When enabled, the driver collects:
68+
69+
- ✅ Driver version and configuration settings
70+
- ✅ Query performance metrics (latency, chunk counts, bytes downloaded)
71+
- ✅ Error types and status codes
72+
- ✅ Feature usage (CloudFetch, Arrow format, compression)
73+
74+
**Never collected**:
75+
76+
- ❌ SQL query text
77+
- ❌ Query results or data values
78+
- ❌ Table/column names or schema information
79+
- ❌ User credentials or personal information
80+
81+
### Configuration
82+
83+
To enable or disable telemetry explicitly:
84+
85+
```javascript
86+
const client = new DBSQLClient({
87+
telemetryEnabled: true, // Enable telemetry (default: false)
88+
});
89+
90+
// Or override per connection:
91+
await client.connect({
92+
host: '********.databricks.com',
93+
path: '/sql/2.0/warehouses/****************',
94+
token: 'dapi********************************',
95+
telemetryEnabled: false, // Disable for this connection
96+
});
97+
```
98+
99+
For detailed documentation including configuration options, event types, troubleshooting, and privacy details, see [docs/TELEMETRY.md](docs/TELEMETRY.md).
138100

139101
## Run Tests
140102
