feat(appkit): send internal telemetry via AppkitLog schema #332
base: main
Changes from all commits
59458ca
4e720ed
af6f6f3
e45773b
b207f28
986c42f
f7ac315
651377d
688059e
eab7c4b
08378b9
f955015
6e80df3
fb4e05a
db00fcd
9547764
ce2e36e
a9f6a18
be24186
6894085
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,3 +11,5 @@ coverage | |
| .turbo | ||
|
|
||
| .databricks | ||
|
|
||
| .superset/config.json | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,6 +5,8 @@ env: | |
| valueFrom: genie-space | ||
| - name: DATABRICKS_SERVING_ENDPOINT_NAME | ||
| valueFrom: serving-endpoint | ||
| - name: DATABRICKS_JOB_ID | ||
| valueFrom: job | ||
|
Comment on lines +8 to +9
Member
As long as we don't have databricks.yml it won't help as we need to set the resource manually 🤔 Maybe it's better to skip it?
Contributor
Author
My deployed |
||
| # Files plugin manifest declares a static DATABRICKS_VOLUME_FILES | ||
| # requirement; keep it bound so appkit's runtime validation passes | ||
| # even though the policy harness below uses its own keys. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,50 @@ | ||
| --- | ||
| sidebar_position: 99 | ||
| --- | ||
|
|
||
| # Privacy | ||
|
|
||
| AppKit sends a small amount of anonymized usage telemetry to Databricks | ||
| so the team can understand how the SDK is used and prioritize | ||
| improvements. This page documents exactly what is sent, when, and how | ||
| to turn it off. | ||
|
|
||
| ## What we collect | ||
|
|
||
| Every event is a single record with three top-level fields: | ||
|
|
||
| | Field | Type | Source | | ||
| | ---------------- | ------ | ----------------------------------- | | ||
| | `event_name` | enum | One of `APP_STARTUP`, `HEARTBEAT`, `REQUEST_METRICS` | | ||
| | `app_id` | string | The app's OAuth client UUID (`DATABRICKS_CLIENT_ID`) | | ||
| | `appkit_version` | string | The AppKit SDK version | | ||
|
|
||
| Each event also carries one of three event-specific bodies: | ||
|
|
||
| - **`APP_STARTUP`** — emitted once when `createApp` finishes booting. | ||
| Empty body. | ||
| - **`HEARTBEAT`** — emitted every five minutes from a running app. | ||
| Empty body. | ||
| - **`REQUEST_METRICS`** — emitted once per minute, one record per HTTP | ||
| endpoint that received traffic in the window. Each record contains: | ||
| - `endpoint` — the route template (e.g. `GET /api/genie/:space_id/messages`), | ||
| never the raw request URL or any user-provided values. | ||
| - `request_count` | ||
| - `request_latency_ms_avg` | ||
| - `response_count_http4xx` | ||
| - `response_count_http5xx` | ||
|
|
||
| ## How to opt out | ||
|
|
||
| Set either of the following environment variables: | ||
|
|
||
| ```sh | ||
| # AppKit-specific kill switch | ||
| DISABLE_APPKIT_INTERNAL_TELEMETRY=true | ||
|
|
||
| # Cross-tool standard (https://consoledonottrack.com) | ||
| DO_NOT_TRACK=1 | ||
| ``` | ||
|
|
||
| Either fully disables the reporter — no events are emitted and no | ||
| network calls are made. |
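Reviewer note: the reporter implementation is not part of the hunks shown here, so the following is only a minimal sketch of how the per-endpoint `REQUEST_METRICS` window described above could be aggregated. `RequestMetricsWindow`, `record`, and `flush` are hypothetical names chosen for illustration; only the output field names come from the privacy page.

```typescript
// Hypothetical aggregator. Class and method names are assumptions,
// not the actual AppKit reporter. Output fields mirror the privacy page.
interface RequestMetricsRecord {
  endpoint: string;
  request_count: number;
  request_latency_ms_avg: number;
  response_count_http4xx: number;
  response_count_http5xx: number;
}

interface Bucket {
  count: number;
  totalLatencyMs: number;
  http4xx: number;
  http5xx: number;
}

class RequestMetricsWindow {
  private buckets = new Map<string, Bucket>();

  // `routeTemplate` is the route pattern, e.g. "GET /api/genie/:space_id/messages",
  // never the raw request URL.
  record(routeTemplate: string, statusCode: number, latencyMs: number): void {
    const b = this.buckets.get(routeTemplate) ?? {
      count: 0,
      totalLatencyMs: 0,
      http4xx: 0,
      http5xx: 0,
    };
    b.count += 1;
    b.totalLatencyMs += latencyMs;
    if (statusCode >= 400 && statusCode < 500) b.http4xx += 1;
    else if (statusCode >= 500 && statusCode < 600) b.http5xx += 1;
    this.buckets.set(routeTemplate, b);
  }

  // Returns one record per endpoint that received traffic, then resets the window.
  flush(): RequestMetricsRecord[] {
    const records = Array.from(this.buckets, ([endpoint, b]) => ({
      endpoint,
      request_count: b.count,
      request_latency_ms_avg: b.totalLatencyMs / b.count,
      response_count_http4xx: b.http4xx,
      response_count_http5xx: b.http5xx,
    }));
    this.buckets.clear();
    return records;
  }
}
```

Keying the map on the route template rather than the request path is what keeps user-provided values such as a concrete `space_id` out of the emitted records.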
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,8 +8,13 @@ import type { | |
| PluginData, | ||
| PluginMap, | ||
| } from "shared"; | ||
| import { version as productVersion } from "../../package.json"; | ||
| import { CacheManager } from "../cache"; | ||
| import { ServiceContext } from "../context"; | ||
| import { | ||
| isInternalTelemetryEnabled, | ||
| TelemetryReporter, | ||
| } from "../internal-telemetry"; | ||
| import { createLogger } from "../logging/logger"; | ||
| import { ResourceRegistry, ResourceType } from "../registry"; | ||
| import type { TelemetryConfig } from "../telemetry"; | ||
|
|
@@ -171,6 +176,7 @@ export class AppKit<TPlugins extends InputPluginMap> { | |
| cache?: CacheConfig; | ||
| client?: WorkspaceClient; | ||
| onPluginsReady?: (appkit: PluginMap<T>) => void | Promise<void>; | ||
| disableInternalTelemetry?: boolean; | ||
|
Member
I'd vote for removing that. IMO the environmental variable is enough 👍
Contributor
Author
This was the proposal in the design doc. Since we're obtaining some data from customer apps, we wanted to be as transparent as possible about it. |
||
| } = {}, | ||
| ): Promise<PluginMap<T>> { | ||
| // Initialize core services | ||
|
|
@@ -212,6 +218,10 @@ export class AppKit<TPlugins extends InputPluginMap> { | |
| logger.debug("onPluginsReady hook completed"); | ||
| } | ||
|
|
||
| if (isInternalTelemetryEnabled(config)) { | ||
| AppKit.bootstrapInternalTelemetry(); | ||
| } | ||
|
|
||
| const serverPlugin = instance.#pluginInstances.server; | ||
| if (serverPlugin && typeof (serverPlugin as any).start === "function") { | ||
| await (serverPlugin as any).start(); | ||
|
|
@@ -220,6 +230,18 @@ export class AppKit<TPlugins extends InputPluginMap> { | |
| return handle; | ||
| } | ||
|
|
||
| private static bootstrapInternalTelemetry(): void { | ||
| const serviceCtx = ServiceContext.get(); | ||
| const reporter = TelemetryReporter.initialize({ | ||
| workspaceId: serviceCtx.workspaceId, | ||
| client: serviceCtx.client, | ||
| appId: process.env.DATABRICKS_CLIENT_ID || "", | ||
| appkitVersion: productVersion, | ||
| }); | ||
| reporter.start(); | ||
| reporter.sendStartup().catch(() => {}); | ||
|
Collaborator
we are calling this (which is asynchronous) and basically ignoring the promise, is this on purpose?
Contributor
Author
Yes, we don't want to cause delays or raise errors because of internal telemetry. |
||
| } | ||
|
|
||
| private static preparePlugins( | ||
| plugins: PluginData<PluginConstructor, unknown, string>[], | ||
| ) { | ||
|
|
@@ -279,6 +301,7 @@ export async function createApp< | |
| cache?: CacheConfig; | ||
| client?: WorkspaceClient; | ||
| onPluginsReady?: (appkit: PluginMap<T>) => void | Promise<void>; | ||
| disableInternalTelemetry?: boolean; | ||
| } = {}, | ||
| ): Promise<PluginMap<T>> { | ||
| return AppKit._createApp(config); | ||
|
|
||
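Reviewer note on the `sendStartup()` thread above: the fire-and-forget pattern can be sketched in isolation as follows. `FakeReporter`, `bootstrapTelemetry`, and `debugLog` are hypothetical stand-ins, not names from this PR. The sketch also records the swallowed error in a debug buffer, which is a suggestion rather than what the diff does (the diff uses an empty `catch` handler).

```typescript
// Hypothetical stand-in for the reporter; it always fails, to exercise the error path.
class FakeReporter {
  async sendStartup(): Promise<void> {
    throw new Error("telemetry endpoint unreachable");
  }
}

// Stand-in for a debug-level logger (assumption: the real code could use its own logger).
const debugLog: string[] = [];

function bootstrapTelemetry(reporter: FakeReporter): void {
  // Deliberately not awaited: app startup must not wait on telemetry.
  reporter.sendStartup().catch((err: unknown) => {
    // Swallow the failure so it never surfaces to the app, but keep a trace.
    debugLog.push(err instanceof Error ? err.message : String(err));
  });
}
```

Calling `bootstrapTelemetry(new FakeReporter())` returns immediately and does not throw; the rejection is handled later, so it does not become an unhandled rejection either.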
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| // IMPORTANT: keep this file in sync with the AppkitLog proto schema served by | ||
| // the Databricks client telemetry endpoint. Field names use proto JSON | ||
| // conventions (snake_case) so the wire format matches the backend. | ||
|
|
||
| export type AppkitEventName = | ||
| | "APPKIT_EVENT_NAME_UNSPECIFIED" | ||
| | "APP_STARTUP" | ||
| | "HEARTBEAT" | ||
| | "REQUEST_METRICS"; | ||
|
|
||
| export type AppStartupEvent = Record<string, never>; | ||
|
|
||
| export type HeartbeatEvent = Record<string, never>; | ||
|
|
||
| export interface RequestMetricsEvent { | ||
| endpoint?: string; | ||
| request_count?: number; | ||
| request_latency_ms_avg?: number; | ||
| response_count_http4xx?: number; | ||
| response_count_http5xx?: number; | ||
| } | ||
|
|
||
| export interface AppkitLog { | ||
|
|
||
| event_name: AppkitEventName; | ||
| app_id?: string; | ||
| appkit_version?: string; | ||
| app_startup_event?: AppStartupEvent; | ||
| heartbeat_event?: HeartbeatEvent; | ||
| request_metrics_event?: RequestMetricsEvent; | ||
| } | ||
|
|
||
| interface AppkitLogEnvelope { | ||
| frontend_log_event_id: string; | ||
| inferred_timestamp_millis: number; | ||
| entry: { appkit_log: AppkitLog }; | ||
| } | ||
|
|
||
| interface TelemetryPayload { | ||
| uploadTime: number; | ||
| items: never[]; | ||
| protoLogs: string[]; | ||
| } | ||
|
|
||
| export function wrapAppkitLog(log: AppkitLog): AppkitLogEnvelope { | ||
| return { | ||
| frontend_log_event_id: `appkit-${log.event_name.toLowerCase()}-${crypto.randomUUID()}`, | ||
| inferred_timestamp_millis: Date.now(), | ||
| entry: { appkit_log: log }, | ||
| }; | ||
| } | ||
|
|
||
| export function buildAppkitPayload(logs: AppkitLog[]): TelemetryPayload { | ||
| return { | ||
| uploadTime: Date.now(), | ||
| items: [], | ||
| protoLogs: logs.map((log) => JSON.stringify(wrapAppkitLog(log))), | ||
| }; | ||
| } | ||
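Reviewer note: to make the wire format concrete, here is a self-contained sketch that condenses the helpers from the file above and builds a payload from two events. The `app_id` and `appkit_version` values are made up for illustration, and `randomUUID` is imported from `node:crypto` here, whereas the diff relies on the global `crypto` object.

```typescript
import { randomUUID } from "node:crypto";

// Condensed copies of the types and helpers from the diff above.
type AppkitEventName =
  | "APPKIT_EVENT_NAME_UNSPECIFIED"
  | "APP_STARTUP"
  | "HEARTBEAT"
  | "REQUEST_METRICS";

interface AppkitLog {
  event_name: AppkitEventName;
  app_id?: string;
  appkit_version?: string;
  app_startup_event?: Record<string, never>;
  heartbeat_event?: Record<string, never>;
  request_metrics_event?: {
    endpoint?: string;
    request_count?: number;
    request_latency_ms_avg?: number;
    response_count_http4xx?: number;
    response_count_http5xx?: number;
  };
}

function wrapAppkitLog(log: AppkitLog) {
  return {
    frontend_log_event_id: `appkit-${log.event_name.toLowerCase()}-${randomUUID()}`,
    inferred_timestamp_millis: Date.now(),
    entry: { appkit_log: log },
  };
}

function buildAppkitPayload(logs: AppkitLog[]) {
  return {
    uploadTime: Date.now(),
    items: [] as never[],
    protoLogs: logs.map((log) => JSON.stringify(wrapAppkitLog(log))),
  };
}

// Illustrative values only; none of these come from a real app.
const payload = buildAppkitPayload([
  {
    event_name: "APP_STARTUP",
    app_id: "00000000-0000-0000-0000-000000000000",
    appkit_version: "0.0.0",
    app_startup_event: {},
  },
  {
    event_name: "REQUEST_METRICS",
    app_id: "00000000-0000-0000-0000-000000000000",
    appkit_version: "0.0.0",
    request_metrics_event: {
      endpoint: "GET /api/genie/:space_id/messages",
      request_count: 12,
      request_latency_ms_avg: 48.5,
      response_count_http4xx: 1,
      response_count_http5xx: 0,
    },
  },
]);

// Each protoLogs entry is itself a JSON string, so it must be parsed again to inspect it.
const first = JSON.parse(payload.protoLogs[0]);
console.log(first.entry.appkit_log.event_name); // prints "APP_STARTUP"
```

Note that `protoLogs` holds pre-serialized strings, so the body is double-encoded once the payload itself is serialized for the POST.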
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| /** | ||
| * Checks whether internal telemetry is enabled. | ||
| * Shared across all telemetry event types (startup, heartbeat, metrics, etc.). | ||
| */ | ||
| export function isInternalTelemetryEnabled(opts?: { | ||
| disableInternalTelemetry?: boolean; | ||
| }): boolean { | ||
| if (opts?.disableInternalTelemetry) return false; | ||
| if (process.env.DISABLE_APPKIT_INTERNAL_TELEMETRY === "true") return false; | ||
| // Honor the cross-tool DO_NOT_TRACK convention (https://consoledonottrack.com). | ||
| if (process.env.DO_NOT_TRACK === "1") return false; | ||
| return true; | ||
| } |
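Reviewer note: the checks above are strict string comparisons, which is worth keeping in mind when documenting the opt-out. The sketch below is a variant of `isInternalTelemetryEnabled` with the environment passed in as a parameter; that injection and the name `isEnabled` are assumptions made here for testability, since the original reads `process.env` directly.

```typescript
// Variant of the function in the diff with `env` injected (an assumption for testability).
type Env = Record<string, string | undefined>;

function isEnabled(
  opts: { disableInternalTelemetry?: boolean } | undefined,
  env: Env,
): boolean {
  if (opts?.disableInternalTelemetry) return false;
  if (env.DISABLE_APPKIT_INTERNAL_TELEMETRY === "true") return false;
  if (env.DO_NOT_TRACK === "1") return false;
  return true;
}

// Each row: [description, result]
const optOutCases: Array<[string, boolean]> = [
  ["no opt-out set", isEnabled(undefined, {})],
  ["createApp option", isEnabled({ disableInternalTelemetry: true }, {})],
  ["DISABLE_APPKIT_INTERNAL_TELEMETRY=true", isEnabled({}, { DISABLE_APPKIT_INTERNAL_TELEMETRY: "true" })],
  ["DISABLE_APPKIT_INTERNAL_TELEMETRY=TRUE", isEnabled({}, { DISABLE_APPKIT_INTERNAL_TELEMETRY: "TRUE" })],
  ["DO_NOT_TRACK=1", isEnabled({}, { DO_NOT_TRACK: "1" })],
  ["DO_NOT_TRACK=true", isEnabled({}, { DO_NOT_TRACK: "true" })],
];

for (const [label, enabled] of optOutCases) {
  console.log(`${label}: telemetry ${enabled ? "enabled" : "disabled"}`);
}
```

With this logic, `DISABLE_APPKIT_INTERNAL_TELEMETRY=TRUE` and `DO_NOT_TRACK=true` leave telemetry enabled, because only the exact values `"true"` and `"1"` respectively are matched.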
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| // Internal telemetry: APP_STARTUP, HEARTBEAT, and REQUEST_METRICS events | ||
| // POSTed to /telemetry-ext so the Databricks team can prioritize SDK work. | ||
| // Disable with disableInternalTelemetry: true on createApp, | ||
| // DISABLE_APPKIT_INTERNAL_TELEMETRY=true, or DO_NOT_TRACK=1. | ||
| // Full data inventory: docs/docs/privacy.mdx. | ||
|
|
||
| export { isInternalTelemetryEnabled } from "./config"; | ||
| export { TelemetryReporter } from "./reporter"; |