Self-hosted v4 runner pods leak process.env to stdout and to the webapp debug-log endpoint #3566

@brentshulman-silkline

Update: original report was incomplete. Two additional leak sites confirmed from production logs that do dump the full project env. See follow-up comment for the corrected scope and code references. Treat as security advisory.

Summary

On every runner pod startup, the managed run controller emits three distinct log entries whose payloads include sensitive material:

  1. "Creating run controller" (packages/cli-v3/src/entryPoints/managed/controller.ts:84) — spreads ...env.raw (parsed RunnerEnv schema) into the log properties.
  2. "started attempt" (packages/cli-v3/src/entryPoints/managed/execution.ts:448) — logs { start: start.data }, where start.data is the startRunAttempt API response including the project envVars map (all user-provided env vars passed to the task — API keys, DB creds, JWT signing certs, etc.).
  3. "initializing task run process" (packages/cli-v3/src/executions/taskRunProcess.ts:152) — logs { env: fullEnv, ... } where fullEnv = { ...$env, NODE_OPTIONS, PATH, ... } and $env is the same project env from the IPC message. Same secrets dumped a second time.

All three entries go to the runner pod's stdout (operator-side log aggregation: CloudWatch / Loki / Datadog / …). Sites #1 and #2 are additionally sent to the webapp's debug-log HTTP endpoint, which the webapp persists and renders in the dashboard run timeline.

Setting DEBUG=false, LOG_LEVEL=info, VERBOSE=false, or PRETTY_LOGS=false does not silence any of these. managed-run-controller.ts:7 hardcodes logger.loggerLevel = "debug" so site #3 always fires regardless of operator config. Sites #1 and #2 go through ManagedRunLogger.sendDebugLog, whose local sink uses SimpleStructuredLogger.log (i.e. LogLevel.log = 0 — lowest in the enum; the guard if (this.level < LogLevel.log) return; cannot suppress it).
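To illustrate why no level setting can suppress the `log()` path, here is a simplified model of the guard described above. It is a sketch, not the actual SimpleStructuredLogger source: only `log = 0` and the guard expression come from this report, while the other level values and the `MiniLogger` class are assumptions made for illustration.

```typescript
// Simplified model of the level guard; NOT the real SimpleStructuredLogger.
// Only `log: 0` is taken from the report; the remaining values are assumed.
const LogLevel = { log: 0, error: 1, warn: 2, info: 3, debug: 4 } as const;

class MiniLogger {
  lines: string[] = [];
  level: number;

  constructor(level: number) {
    this.level = level;
  }

  log(message: string): void {
    // The configured level is never below 0, so this return is unreachable.
    if (this.level < LogLevel.log) return;
    this.lines.push(message);
  }

  debug(message: string): void {
    // Ordinary level check: suppressed unless level >= debug.
    if (this.level < LogLevel.debug) return;
    this.lines.push(message);
  }
}

// Even at the most restrictive level, log() still emits.
const quiet = new MiniLogger(LogLevel.log);
quiet.debug("suppressed");
quiet.log("still emitted");
```

In this model, `quiet.lines` ends up holding only the `log()` message, which mirrors the behaviour reported for sites #1 and #2.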

Confirmed on chart 4.4.5 / app image v4.4.5. The code paths exist on main at the time of writing.

Impact

For any self-hosted v4 installation that runs tasks needing any kind of API key (LLM provider, error tracker, third-party APIs, database, JWT signing, etc.), the full set of those secrets is written:

  • To operator-side log storage on every runner pod startup.
  • To the webapp's debug-log endpoint, which persists them and shows them in the dashboard run timeline (site #2).

In my own production-gov logs, I can see real values for Auth0 client secret + management API client secret + signing secret, Hasura admin secret, Hasura JWT RS512 signing certificate, AWS access key + secret, OpenAI / Sendgrid / Datadog / Sentry / Apollo / Merge / Kernel / Cerebras / OpenExchangeRates API keys, the Trigger DB password embedded in TRIGGER_DATABASE_URL, and per-run TRIGGER_JWT.

Source

Site #1: packages/cli-v3/src/entryPoints/managed/controller.ts:77-87

const properties = {
  ...env.raw,                              // parsed RunnerEnv — bounded by schema
  TRIGGER_POD_SCHEDULED_AT_MS: env.TRIGGER_POD_SCHEDULED_AT_MS.toISOString(),
  TRIGGER_DEQUEUED_AT_MS: env.TRIGGER_DEQUEUED_AT_MS.toISOString(),
};

this.sendDebugLog({
  runId: env.TRIGGER_RUN_ID,
  message: "Creating run controller",
  properties,
});

Site #2: packages/cli-v3/src/entryPoints/managed/execution.ts:448

this.sendDebugLog("started attempt", { start: start.data });

start.data is the body of httpClient.startRunAttempt(), which includes envVars — the project env map passed to the task. Concretely: every env var configured on the Trigger project.

Site #3: packages/cli-v3/src/executions/taskRunProcess.ts:152

const fullEnv = {
  ...$env,
  OTEL_IMPORT_HOOK_INCLUDES: ...,
  NODE_OPTIONS: ...,
  PATH: process.env.PATH,
  TRIGGER_PROCESS_FORK_START_TIME: ...,
  TRIGGER_WARM_START: ...,
  TRIGGERDOTDEV: "1",
};

logger.debug(`initializing task run process`, {
  env: fullEnv,
  path: workerManifest.workerEntryPoint,
  cwd,
});

$env is the project env passed in from the IPC EXECUTE_TASK_RUN message — same secret material as site #2.

ManagedRunLogger.sendDebugLog (the routing for sites #1 and #2)

packages/cli-v3/src/entryPoints/managed/logger.ts:38-66:

if (print) {
  this.logger.log(message, mergedProperties);   // stdout via SimpleStructuredLogger
}
this.httpClient.sendDebugLog(runId, {           // POSTed to webapp
  message,
  time: date ?? new Date(),
  properties: flattenedProperties,
});

The webapp endpoint persists the properties and surfaces them in the dashboard run timeline.

Suggested fix

Three changes, one per site:

  1. controller.ts — drop ...env.raw from the properties spread, or replace with an explicit allowlist (e.g. the supervisor's RunnerEnv schema is a fine starting point if you want to keep some diagnostics).
  2. execution.ts:448 — log a redacted summary of start.data (attempt number, snapshot id, run id, machine preset) rather than the raw API response body. At minimum, strip envVars before logging.
  3. taskRunProcess.ts:152 — drop the env: fullEnv field from the log entry, or redact (e.g. log only the keys of fullEnv, never values).
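The three changes above could be sketched roughly as follows. All helper names (`pickAllowed`, `withoutEnvVars`, `envKeys`) are hypothetical and do not exist in the codebase, and the shape of `startData` is an assumption for illustration; only the `envVars` field name comes from this report.

```typescript
// Hypothetical helpers; none of these names exist in the trigger.dev codebase.
type Env = Record<string, string | undefined>;

// Site #1: copy only explicitly allowlisted keys into the log properties.
function pickAllowed(env: Env, allowlist: readonly string[]): Env {
  const out: Env = {};
  for (const key of allowlist) {
    if (Object.prototype.hasOwnProperty.call(env, key)) {
      out[key] = env[key];
    }
  }
  return out;
}

// Site #2: drop envVars from a shallow copy of the response before logging.
function withoutEnvVars(data: Record<string, unknown>): Record<string, unknown> {
  const copy = { ...data };
  delete copy["envVars"];
  return copy;
}

// Site #3: log the variable names only, never the values.
function envKeys(env: Env): string[] {
  return Object.keys(env).sort();
}

// Illustrative values only; the real response shape is not shown in this report.
const startData = { run: { id: "run_123" }, envVars: { API_KEY: "secret" } };
const safe = withoutEnvVars(startData);
const picked = pickAllowed({ A: "1", SECRET: "x" }, ["A", "MISSING"]);
const names = envKeys({ B: "1", A: "2" });
```

Note that `withoutEnvVars` only strips the top-level field. If secrets can appear elsewhere in the response, an explicit summary of known-safe fields (as suggested in item 2) would be the more conservative choice.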

Bonus: the ManagedRunLogger.sendDebugLog HTTP sink ignores the print flag and always forwards to the webapp endpoint. Worth considering whether print: false (or a new persist: false) should also gate the HTTP sink, so operators have a way to disable persistence of incidentally-sensitive properties without losing local diagnostics.
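A minimal sketch of what gating both sinks might look like. This is not the actual ManagedRunLogger: the `persist` flag is the proposed addition, and the `Sinks` interface, the option names other than `print`, and the defaults of `true` are assumptions.

```typescript
// Hypothetical sketch; not the real ManagedRunLogger.sendDebugLog.
type Properties = Record<string, unknown>;

interface DebugLogOptions {
  message: string;
  properties?: Properties;
  print?: boolean;   // gates the local stdout sink (default assumed true)
  persist?: boolean; // proposed: gates the webapp HTTP sink (default assumed true)
}

interface Sinks {
  stdout: (message: string, properties: Properties) => void;
  http: (message: string, properties: Properties) => void;
}

function sendDebugLog(sinks: Sinks, opts: DebugLogOptions): void {
  const { message, properties = {}, print = true, persist = true } = opts;
  if (print) sinks.stdout(message, properties);
  // Unlike the current behaviour described above, the HTTP sink is gated too.
  if (persist) sinks.http(message, properties);
}

// Usage: keep local diagnostics but skip persistence for this entry.
const calls: string[] = [];
const sinks: Sinks = {
  stdout: (message) => { calls.push("stdout:" + message); },
  http: (message) => { calls.push("http:" + message); },
};
sendDebugLog(sinks, { message: "started attempt", persist: false });
```

Defaulting `persist` to `true` would keep existing call sites unchanged, so the flag could be adopted per call site.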

Happy to open a PR. Filing this as an issue first so you can decide on scope.
