sql: add stable jobs introspection surfaces in information_schema#170864

Open
dt wants to merge 5 commits into cockroachdb:master from dt:info-schema-jobs-views

@dt dt commented May 24, 2026

Adds four customer-facing surfaces under information_schema.crdb_* for programmatic visibility into the jobs system. The existing surfaces (crdb_internal.jobs, SHOW JOBS) are either internal (now blocked for ordinary sessions by unsafesql.CheckInternalsAccess) or lossy by design (description truncated to 70 chars, statement conflated with description, no message history, no progress history).

The four customer-facing surfaces:

  • information_schema.crdb_jobs — per-job metadata from system.jobs without the truncation or column omission applied by SHOW JOBS. The column set is deliberately scoped to fields with stable customer meaning; internal execution detail (claim_session_id, claim_instance_id), dead bookkeeping (num_runs, last_run), and creator coupling (created_by_*) are excluded.
  • information_schema.crdb_jobs_with_progress — extends crdb_jobs with progress_fraction, resolved (raw HLC), status_message, last_updated.
  • information_schema.crdb_job_messages(job_id) — SRF over system.job_message. Function form forces a per-job filter at the call site rather than letting a view accidentally fan out across all jobs.
  • information_schema.crdb_job_progress_history(job_id) — SRF over system.job_progress_history, same shape as messages.

Each surface has a logictest that exercises it with SET allow_unsafe_internals = false to lock in the post-unsafesql access path.
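
As a rough illustration of how a client might use the first surface, the sketch below lists recently failed jobs with their untruncated descriptions. The column names are the ones listed for crdb_jobs in the commit message; the 'failed' state value and the 24-hour window are assumptions chosen for the example, not something this PR specifies.

```sql
-- Mirror the logictest setup: run without access to unsafe internals.
SET allow_unsafe_internals = false;

-- List jobs that failed in the last day, with full description and error text.
-- 'failed' is assumed to be one of the lifecycle values reported in `state`.
SELECT job_id, job_type, owner, description, error
  FROM information_schema.crdb_jobs
 WHERE state = 'failed'
   AND finished > now() - INTERVAL '24 hours'
 ORDER BY finished DESC;
```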

The view bodies call crdb_internal.can_view_job(owner) for row-level visibility, which required the first commit on the branch: the optbuilder's system-view path was already skipping the SELECT privilege check on underlying tables but was not skipping the unsafe-builtin guard on builtins referenced in the body. The two gates are now symmetric for system views.

Commit map (each is independently reviewable):

  1. optbuilder: skip the unsafe-builtin guard inside system view bodies (plumbing prerequisite)
  2. crdb_jobs view
  3. crdb_jobs_with_progress view
  4. crdb_job_messages SRF
  5. crdb_job_progress_history SRF

The resolved columns in crdb_jobs_with_progress and crdb_job_progress_history (surfaces 2 and 4 in the list above, not commits 2 and 4) are exposed as raw HLC DECIMAL (matching system.job_progress.resolved) rather than a lossy TIMESTAMPTZ conversion. Customers who want a wall-clock value apply hlc_to_timestamp(resolved) at the call site; this preserves the HLC logical component for callers feeding the value into AS OF SYSTEM TIME or changefeed cursors.
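
To make the trade-off concrete, here is a minimal two-step sketch of how a caller might carry the raw HLC into AS OF SYSTEM TIME. The job ID, the table name my_table, and the HLC literal are all placeholders invented for the example. The same decimal string could presumably be supplied as a changefeed cursor, which is not shown here.

```sql
-- Step 1: read the raw HLC for one job (12345 is a placeholder job ID).
SELECT resolved,                                  -- full HLC, logical component included
       hlc_to_timestamp(resolved) AS resolved_at  -- wall-clock value, for display only
  FROM information_schema.crdb_jobs_with_progress
 WHERE job_id = 12345;

-- Step 2: pass the decimal from step 1 as a literal. AS OF SYSTEM TIME needs a
-- constant expression, so the client substitutes the value; this one is made up.
SELECT count(*)
  FROM my_table  -- hypothetical table
    AS OF SYSTEM TIME '1779631200000000000.0000000002';
```

Converting to TIMESTAMPTZ before step 2 would discard the digits after the decimal point, which is the loss the raw-HLC choice avoids.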

Epic: none

@dt dt requested a review from a team as a code owner May 24, 2026 14:05
@dt dt requested review from msbutler and shghasemi and removed request for a team May 24, 2026 14:05

@dt dt requested a review from kevin-v-ngo May 24, 2026 14:06
dt added 3 commits May 24, 2026 14:14
The optbuilder's system-view path already skipped the SELECT
privilege check on tables referenced by a system view's body, on the
grounds that system views are authored by the node user and form
part of the supported customer surface. The unsafe-internals guard
applied to builtins, however, was not skipped: a system view body
could reference system tables (via the privilege bypass above) but
could not invoke a crdb_internal builtin in its WHERE or projection.
The asymmetry forced one-off aliases of each crdb_internal builtin
under information_schema for any view that needed one.

Mirror the privilege-check skip for skipUnsafeInternalsCheck inside
the system-view branch so the two gates are symmetric inside the
definer context. The skip is restricted to actual system views
(view.IsSystemView()), not the cluster-setting bypass
(skipUnderlyingPrivilegeChecks), so user-created views built under
the legacy setting do not gain access to unsafe builtins.

No behavior change for ordinary user-created views or direct user
queries against crdb_internal builtins; only the in-system-view-
body path is affected.

The change has no direct test of its own; downstream commits adding
information_schema views whose bodies invoke crdb_internal.can_view_job
exercise the path under SET allow_unsafe_internals = false.
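
A sketch of the behavior this commit is aiming for, as seen from an ordinary session. It assumes the information_schema.crdb_jobs view from the later commits is present and that can_view_job is among the guarded builtins; the 'root' argument is a placeholder, and the exact error text is not given in this PR.

```sql
SET allow_unsafe_internals = false;

-- Direct use of the builtin stays behind the unsafe-internals guard,
-- so this statement is expected to be rejected for an ordinary session.
SELECT crdb_internal.can_view_job('root');

-- The system view calls the same builtin inside its body. With the guard
-- skipped in the system-view branch, this is expected to succeed.
SELECT count(*) FROM information_schema.crdb_jobs;
```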

Epic: none

Release note: None

Add a stable, customer-facing view exposing per-job metadata from
system.jobs without the truncation or column omission that SHOW JOBS
applies for human readability. The new view returns the full
description un-truncated, preserves NULL in the error column rather
than coalescing to empty string, and renames the lifecycle column
from `status` to `state` so the term "status" can be reserved for the
granular phase string surfaced in subsequent commits.

The column set is intentionally scoped to fields that have meaning
for an external consumer:

  job_id, job_type, owner, description, created, finished, state, error

Internal execution detail (claim_session_id, claim_instance_id),
retry bookkeeping (num_runs, last_run), and creator coupling
(created_by_type, created_by_id) are deliberately excluded so the
stable contract does not depend on those implementation choices.

Row-level access mirrors crdb_internal.jobs by calling
crdb_internal.can_view_job(owner) in the view body. This relies on
the preceding commit's symmetry fix in optbuilder which permits
system-view bodies to invoke crdb_internal builtins (the table-
privilege bypass was already in place; the builtin guard was not,
forcing one-off aliases without that fix).

This is the first of four commits adding stable visibility into the
jobs system; subsequent commits add a `_with_progress` join view and
two SRF builtins for per-job message and progress history.

Epic: none

Release note (sql change): Added `information_schema.crdb_jobs`, a
stable view exposing per-job metadata from the jobs system intended
for programmatic use. Unlike SHOW JOBS, the description column is
returned in full and the error column distinguishes NULL (no error)
from the empty string. The lifecycle column is named `state`.

Extend the prior crdb_jobs view with the current progress reading
and user-visible status message, materialized via LEFT OUTER JOINs
on system.job_progress and system.job_status. The new columns are:

  progress_fraction, resolved, status_message, last_updated

resolved is exposed as the raw HLC decimal, exactly as stored in
system.job_progress.resolved. An earlier prototype converted it to
TIMESTAMPTZ via hlc_to_timestamp; that was rejected because the
conversion drops the HLC's logical component, and customers who feed
this value into AS OF SYSTEM TIME or use it to restart a changefeed
cursor need the full HLC. Callers who want a wall-clock value can
apply hlc_to_timestamp at the call site:

  SELECT job_id, hlc_to_timestamp(resolved) AS resolved_at, ...
  FROM information_schema.crdb_jobs_with_progress;

last_updated is greatest(p.written, s.written), collapsing the
progress and status write timestamps into a single column so the
stable contract does not commit to the internal split between those
two tables.

status_message is the granular per-job phase string ("backfilling",
"waiting for gc", ...). It is distinct from `state`, which is the
lifecycle enum (running, paused, succeeded, ...).
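
A monitoring query along the following lines shows how the two columns might be read side by side. It uses only the columns named above. Because the view is described as using LEFT OUTER JOINs, the progress and status columns are expected to be NULL for jobs with nothing recorded yet.

```sql
-- Show running jobs with their lifecycle state, phase string, and progress.
SELECT job_id,
       job_type,
       state,              -- lifecycle enum, e.g. running
       status_message,     -- granular phase string; may be NULL
       progress_fraction,  -- may be NULL if no progress row exists
       last_updated
  FROM information_schema.crdb_jobs_with_progress
 WHERE state = 'running'
 ORDER BY last_updated DESC;
```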

The view is kept separate from crdb_jobs rather than synthesized as
a projection over it because the optimizer cannot elide the LEFT
OUTER JOINs (see the TODO at the existing crdb_internal.jobs view).
Splitting the views avoids forcing the joins on every crdb_jobs
query.

Epic: none

Release note (sql change): Added
`information_schema.crdb_jobs_with_progress`, a stable view
exposing per-job metadata together with the current progress
fraction, resolved HLC, status message, and last-updated timestamp.
The resolved column is exposed as the raw HLC; apply
hlc_to_timestamp at the call site if a wall-clock value is needed.
@dt dt force-pushed the info-schema-jobs-views branch from 70b4dbf to 74a469b Compare May 24, 2026 14:22
@dt dt requested a review from a team as a code owner May 24, 2026 14:22
@dt dt requested review from ZhouXing19 and removed request for a team May 24, 2026 14:22
dt added 2 commits May 24, 2026 14:34
Add a set-returning function exposing the per-job message history
recorded in system.job_message:

  information_schema.crdb_job_messages(job_id INT)
    RETURNS SETOF (recorded TIMESTAMPTZ, kind STRING, message STRING)

Modeled as a function rather than a view because the message log is
expected to be queried for one job at a time. A view would let a
caller `SELECT * FROM ...` with no filter and materialize every
message across every job, while the function signature forces the
caller to commit to a specific job_id at the call site.

The generator first looks up the job's owner via an internal SQL
query and applies the same `HasViewAccessToJob` check used by
crdb_internal.can_view_job. A non-existent or invisible job_id
returns zero rows rather than an error, matching the row-filtering
semantics of information_schema.crdb_jobs.

Both internal queries (owner lookup and the message fetch) run with
NodeUserSessionDataOverride so non-admin callers can still drive the
function for jobs they own.
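
For illustration, a call might look like the sketch below. The job IDs are placeholders; the output columns are the ones in the signature above.

```sql
-- Message history for a single job (12345 is a placeholder job ID).
SELECT recorded, kind, message
  FROM information_schema.crdb_job_messages(12345)
 ORDER BY recorded;

-- An unknown or invisible job ID is described as yielding zero rows rather
-- than an error, so a caller can probe without special error handling.
SELECT count(*) AS n
  FROM information_schema.crdb_job_messages(-1);
```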

Epic: none

Release note (sql change): Added
`information_schema.crdb_job_messages(job_id)`, a set-returning
function that returns the message history for one job (recorded,
kind, message). The argument is mandatory; an unknown or invisible
job_id returns zero rows.

Add a set-returning function exposing the per-job progress
trajectory recorded in system.job_progress_history:

  information_schema.crdb_job_progress_history(job_id INT)
    RETURNS SETOF (
      recorded          TIMESTAMPTZ,
      progress_fraction FLOAT,
      resolved          DECIMAL
    )

Same shape and access semantics as crdb_job_messages: function form
forces a single-job filter at the call site rather than letting a
view accidentally materialize ~1000 history entries per job times
the number of jobs in the cluster; the visibility check matches
crdb_internal.can_view_job; a non-existent or invisible job_id
returns zero rows.

resolved is exposed as the raw HLC decimal stored in
system.job_progress_history; the prototype that converted it to
TIMESTAMPTZ via hlc_to_timestamp was rejected because the
conversion drops the HLC's logical component, and customers
consuming this surface to record a "last seen" cursor for AS OF
SYSTEM TIME or to restart a changefeed need the full HLC. The
conversion is a one-liner at the call site:

  SELECT recorded, progress_fraction, hlc_to_timestamp(resolved)
    FROM information_schema.crdb_job_progress_history($1);
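
Beyond the wall-clock conversion, one plausible use of the history is to see how much progress each recorded sample added. The following sketch assumes a placeholder job ID and uses the standard lag window function over the columns in the signature above.

```sql
-- Per-sample change in progress for one job (12345 is a placeholder job ID).
SELECT recorded,
       progress_fraction,
       progress_fraction
         - lag(progress_fraction) OVER (ORDER BY recorded) AS fraction_delta
  FROM information_schema.crdb_job_progress_history(12345)
 ORDER BY recorded;
```

The first row has no predecessor, so its fraction_delta is NULL.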

Epic: none

Release note (sql change): Added
`information_schema.crdb_job_progress_history(job_id)`, a
set-returning function that returns the full progress trajectory of
one job (recorded, progress_fraction, resolved). resolved is the
raw HLC decimal; apply hlc_to_timestamp at the call site for a
wall-clock value. The argument is mandatory; an unknown or
invisible job_id returns zero rows.
@dt dt force-pushed the info-schema-jobs-views branch from 74a469b to 32140f6 Compare May 24, 2026 14:37