Operator-action panels: live blast radius via Business Forms#4675
Draft
Operator-action panels: live blast radius via Business Forms#4675
Conversation
Contributor
Observability diff (vs staging)Diff truncated (104648 bytes; limit 60000). Full diff: https://github.com/cardstack/boxel/actions/runs/25457607630 diff --git a/tmp/remote-canon.iNLy7o/dashboards/boxel-status/boxel-jobs.json b/tmp/committed-canon.7HzRmO/dashboards/boxel-status/boxel-jobs.json
index 6a40566..3d08e23 100644
--- a/tmp/remote-canon.iNLy7o/dashboards/boxel-status/boxel-jobs.json
+++ b/tmp/committed-canon.7HzRmO/dashboards/boxel-status/boxel-jobs.json
@@ -34,101 +34,223 @@
"type": "grafana-postgresql-datasource",
"uid": "cef5v5sl9k7i8f"
},
- "description": "Operator actions: trigger a reindex via the realm-server. Each button POSTs with an Authorization: Bearer header (token substituted into a hidden constant template variable at apply time from SSM, CS-10929) and shows a confirmation dialog. Single-realm reindex targets the realm in the variable picker above; full reindex hits every realm and is the more disruptive option.",
+ "description": "Operator actions: trigger a reindex via the realm-server. Live blast-radius (pending / in-flight / oldest pending) is fetched from boxel_index/jobs every refresh; the reindex buttons disable themselves while an indexing job is already in flight for the selected realm. Each click POSTs with `Authorization: Bearer ${grafana_secret}` (substituted from SSM at apply time, CS-10929).",
"fieldConfig": {
- "defaults": {
- "actions": [
- {
- "confirmation": "Reindex ${full_index_realm}?",
- "fetch": {
- "body": "",
- "headers": [
- [
- "Authorization",
- "Bearer ${grafana_secret}"
- ]
- ],
- "method": "POST",
- "queryParams": [
- [
- "realm",
- "${full_index_realm}"
- ]
- ],
- "url": "${realm_server}_grafana-reindex"
- },
- "oneClick": false,
- "title": "Reindex ${full_index_realm}",
- "type": "fetch"
- },
- {
- "confirmation": "Reindex ALL realms? This kicks off an indexing job for every realm on the server and can take a long time.",
- "fetch": {
- "body": "",
- "headers": [
- [
- "Authorization",
- "Bearer ${grafana_secret}"
- ]
- ],
- "method": "POST",
- "queryParams": [],
- "url": "${realm_server}_grafana-full-reindex"
- },
- "oneClick": false,
- "title": "Reindex ALL realms",
- "type": "fetch"
- }
- ],
- "color": {
- "mode": "thresholds"
- },
- "mappings": [
- {
- "options": {
- "from": 0,
- "result": {
- "index": 0,
- "text": "Reindex"
- },
- "to": 9999999999999
- },
- "type": "range"
- }
- ],
- "thresholds": {
- "mode": "absolute",
- "steps": [
- {
- "color": "blue"
- }
- ]
- }
- },
+ "defaults": {},
"overrides": []
},
"gridPos": {
- "h": 4,
+ "h": 11,
"w": 24,
"x": 0,
"y": 0
},
"id": 4,
"options": {
- "colorMode": "value",
- "graphMode": "none",
- "justifyMode": "center",
- "orientation": "horizontal",
- "reduceOptions": {
- "calcs": [
- "lastNotNull"
+ "buttonGroup": {
+ "orientation": "center",
+ "size": "md"
+ },
+ "confirmModal": {
+ "body": "Please confirm the action.",
+ "cancel": "Cancel",
+ "columns": {
+ "include": [
+ "name",
+ "newValue"
+ ],
+ "name": "Field",
+ "newValue": "Value",
+ "oldValue": "Previous"
+ },
+ "confirm": "Confirm",
+ "elementDisplayMode": "modified",
+ "title": "Confirm operator action"
+ },
+ "elementValueChanged": "if (context.element.id === 'realm_picker' && context.element.value) {\n context.grafana.locationService.partial({ 'var-full_index_realm': context.element.value }, true);\n}\n",
+ "elements": [
+ {
+ "id": "realm_picker",
+ "labelWidth": 28,
+ "options": [],
+ "optionsSource": "Query",
+ "queryField": {
+ "refId": "A",
+ "value": "realm"
+ },
+ "queryOptions": {
+ "label": "label",
+ "source": "B",
+ "value": "value"
+ },
+ "section": "current",
+ "title": "Realm",
+ "tooltip": "Pick the realm to operate on. Selection is mirrored into the URL (?var-full_index_realm=…) so links to this dashboard preselect a realm.",
+ "type": "select",
+ "value": ""
+ },
+ {
+ "fieldName": "pending",
+ "id": "pending",
+ "labelWidth": 28,
+ "queryField": {
+ "refId": "A",
+ "value": "pending"
+ },
+ "section": "current",
+ "title": "Pending jobs (this realm)",
+ "tooltip": "Indexing jobs queued for the selected realm with no live worker reservation.",
+ "type": "disabled"
+ },
+ {
+ "fieldName": "in_flight",
+ "id": "in_flight",
+ "labelWidth": 28,
+ "queryField": {
+ "refId": "A",
+ "value": "in_flight"
+ },
+ "section": "current",
+ "title": "In-flight (this realm)",
+ "tooltip": "Indexing jobs currently held by a worker for the selected realm. While > 0, the per-realm reindex button is disabled.",
+ "type": "disabled"
+ },
+ {
+ "fieldName": "pending_full_reindex",
+ "id": "pending_full_reindex",
+ "labelWidth": 28,
+ "queryField": {
+ "refId": "A",
+ "value": "pending_full_reindex"
+ },
+ "section": "current",
+ "title": "Pending full-reindex",
+ "tooltip": "Number of `full-reindex` orchestration jobs currently queued or running. While > 0, the \"Reindex ALL realms\" button is disabled to prevent stacking duplicate full reindexes.",
+ "type": "disabled"
+ },
+ {
+ "fieldName": "oldest_pending_human",
+ "id": "oldest_pending_human",
+ "labelWidth": 28,
+ "queryField": {
+ "refId": "A",
+ "value": "oldest_pending_human"
+ },
+ "section": "current",
+ "title": "Oldest pending",
+ "tooltip": "Age of the oldest pending indexing job for the selected realm. Sustained age usually means workers are saturated or stuck.",
+ "type": "disabled"
+ },
+ {
+ "fieldName": "last_reindex_status",
+ "id": "last_reindex_status",
+ "labelWidth": 28,
+ "queryField": {
+ "refId": "A",
+ "value": "last_reindex_status"
+ },
+ "rows": 2,
+ "section": "current",
+ "title": "Last reindex (this realm)",
+ "tooltip": "Most recent from-scratch-index for the selected realm. Reads from the jobs / job_progress tables (CS-10930).",
+ "type": "disabledTextarea"
+ },
+ {
+ "buttonLabel": "Reindex ${full_index_realm:text}",
+ "customCode": "const realm = '${full_index_realm}';\nconst inFlight = Number((context.panel.elements.find(function(e){return e.id==='in_flight';})||{}).value || 0);\nconst pending = Number((context.panel.elements.find(function(e){return e.id==='pending';})||{}).value || 0);\nconst oldest = (context.panel.elements.find(function(e){return e.id==='oldest_pending_human';})||{}).value || 'n/a';\nif (inFlight > 0) {\n context.grafana.notifyWarning(['Reindex blocked', 'An indexing job is already in flight for this realm. Wait for it to finish before triggering a new one.']);\n return;\n}\nif (!window.confirm('Reindex ' + realm + '?\\n\\nBlast radius:\\n pending: ' + pending + '\\n oldest pending: ' + oldest + '\\n\\nThis will queue a from-scratch index for the selected realm only.')) { return; }\ntry {\n const r = await fetch('${realm_server}_grafana-reindex?realm=' + encodeURIComponent(realm), { method: 'POST', headers: { 'Authorization': 'Bearer ${grafana_secret}' } });\n if (r.ok) {\n context.grafana.notifySuccess(['Reindex queued', 'Started reindex of ' + realm]);\n if (typeof context.grafana.refresh === 'function') { context.grafana.refresh(); }\n } else {\n const txt = await r.text();\n context.grafana.notifyError(['Reindex failed', 'HTTP ' + r.status + ': ' + txt]);\n }\n} catch (err) {\n context.grafana.notifyError(['Reindex failed', String(err)]);\n}\n",
+ "disableIf": "return Number((context.panel.elements.find(function(e){return e.id==='in_flight';})||{}).value || 0) > 0;",
+ "id": "btn_reindex_realm",
+ "labelWidth": 28,
+ "section": "actions",
+ "show": "form",
+ "size": "md",
+ "title": "",
+ "tooltip": "POST /_grafana-reindex?realm=${full_index_realm}. Disabled while an indexing job is in flight for this realm.",
+ "type": "button",
+ "value": "",
+ "variant": "primary"
+ },
+ {
+ "buttonLabel": "Reindex ALL realms",
+ "customCode": "const pendingFull = Number((context.panel.elements.find(function(e){return e.id==='pending_full_reindex';})||{}).value || 0);\nif (pendingFull > 0) {\n context.grafana.notifyWarning(['Full reindex blocked', 'A full-reindex job is already pending or running. Wait for it to finish before triggering another.']);\n return;\n}\nif (!window.confirm('Reindex ALL realms?\\n\\nThis kicks off an indexing job for every realm on the server and can take a long time.')) { return; }\ntry {\n const r = await fetch('${realm_server}_grafana-full-reindex', { method: 'POST', headers: { 'Authorization': 'Bearer ${grafana_secret}' } });\n if (r.ok) {\n context.grafana.notifySuccess(['Full reindex queued', 'Started reindex of all realms.']);\n if (typeof context.grafana.refresh === 'function') { context.grafana.refresh(); }\n } else {\n const txt = await r.text();\n context.grafana.notifyError(['Full reindex failed', 'HTTP ' + r.status + ': ' + txt]);\n }\n} catch (err) {\n context.grafana.notifyError(['Full reindex failed', String(err)]);\n}\n",
+ "disableIf": "return Number((context.panel.elements.find(function(e){return e.id==='pending_full_reindex';})||{}).value || 0) > 0;",
+ "id": "btn_reindex_all",
+ "labelWidth": 28,
+ "section": "actions",
+ "show": "form",
+ "size": "md",
+ "title": "",
+ "tooltip": "POST /_grafana-full-reindex. Disabled while a `full-reindex` orchestration job is already pending or running. Long-running — every realm is reindexed.",
+ "type": "button",
+ "value": "",
+ "variant": "destructive"
+ }
+ ],
+ "initial": {
+ "code": "",
+ "contentType": "application/json",
+ "getPayload": "return {};",
+ "highlight": false,
+ "method": "query",
+ "payload": {}
+ },
+ "layout": {
+ "orientation": "horizontal",
+ "padding": 10,
+ "sectionVariant": "default",
+ "sections": [
+ {
+ "id": "current",
+ "name": "Current state"
+ },
+ {
+ "id": "actions",
+ "name": "Actions"
+ }
],
- "fields": "",
- "values": false
+ "variant": "split"
+ },
+ "reset": {
+ "backgroundColor": "purple",
+ "foregroundColor": "yellow",
+ "icon": "process",
+ "text": "Refresh",
+ "variant": "hidden"
+ },
+ "resetAction": {
+ "code": "",
+ "confirm": false,
+ "contentType": "application/json",
+ "getPayload": "return {};",
+ "method": "-",
+ "mode": "initial",
+ "payload": {}
+ },
+ "saveDefault": {
+ "icon": "save",
+ "text": "Save Default",
+ "variant": "hidden"
},
- "textMode": "name"
+ "submit": {
+ "backgroundColor": "purple",
+ "foregroundColor": "yellow",
+ "icon": "cloud-upload",
+ "text": "Submit",
+ "variant": "hidden"
+ },
+ "sync": false,
+ "update": {
+ "code": "",
+ "confirm": false,
+ "contentType": "application/json",
+ "getPayload": "return {};",
+ "method": "-",
+ "payload": {},
+ "payloadMode": "all"
+ },
+ "updateEnabled": "disabled"
},
- "pluginVersion": "12.4.3",
+ "pluginVersion": "6.2.0",
"targets": [
{
"datasource": {
@@ -138,41 +260,43 @@
"editorMode": "code",
"format": "table",
"rawQuery": true,
- "rawSql": "SELECT 1 AS click;",
- "refId": "A",
- "sql": {
- "columns": [
- {
- "parameters": [],
- "type": "function"
- }
- ],
- "groupBy": [
- {
- "property": {
- "type": "string"
- },
- "type": "groupBy"
- }
- ],
- "limit": 50
- }
+ "rawSql": "WITH realm_jobs AS (\n SELECT j.*\n FROM jobs j\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n AND COALESCE(j.args->>'realmURL','') = '${full_index_realm}'\n),\nrealm_pending AS (\n SELECT COUNT(*) AS n,\n MIN(j.created_at) AS oldest_created\n FROM realm_jobs j\n LEFT JOIN job_reservations jr ON j.id = jr.job_id\n AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.status = 'unfulfilled' AND jr.id IS NULL\n),\nrealm_in_flight AS (\n SELECT COUNT(*) AS n\n FROM realm_jobs j\n JOIN job_reservations jr ON j.id = jr.job_id\n AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.finished_at IS NULL\n),\npending_full_reindex AS (\n SELECT COUNT(*) AS n\n FROM jobs j\n WHERE j.job_type = 'full-reindex'\n AND j.finished_at IS NULL\n),\nlast_reindex AS (\n SELECT j.id, j.created_at AS started, j.finished_at AS finished,\n j.status,\n COALESCE(jp.files_completed, 0) AS files_completed,\n COALESCE(jp.total_files, 0) AS total_files\n FROM realm_jobs j\n LEFT JOIN job_progress jp ON jp.job_id = j.id\n WHERE j.job_type = 'from-scratch-index'\n ORDER BY j.created_at DESC\n LIMIT 1\n)\nSELECT\n '${full_index_realm}' AS realm,\n COALESCE((SELECT n FROM realm_pending), 0) AS pending,\n COALESCE((SELECT n FROM realm_in_flight), 0) AS in_flight,\n COALESCE((SELECT n FROM pending_full_reindex), 0) AS pending_full_reindex,\n CASE\n WHEN (SELECT oldest_created FROM realm_pending) IS NULL THEN '—'\n ELSE TO_CHAR(NOW() - (SELECT oldest_created FROM realm_pending), 'HH24:MI:SS')\n || ' (since ' || TO_CHAR((SELECT oldest_created FROM realm_pending) AT TIME ZONE 'UTC', 'YYYY-MM-DD HH24:MI:SS') || ' UTC)'\n END AS oldest_pending_human,\n COALESCE(\n (SELECT\n CASE\n WHEN finished IS NULL AND started IS NOT NULL THEN\n 'running — ' || files_completed || '/' || NULLIF(total_files,0) || ' files, started ' || TO_CHAR(started AT TIME ZONE 'UTC', 'YYYY-MM-DD HH24:MI:SS') || ' UTC'\n WHEN finished IS NOT NULL THEN\n INITCAP(COALESCE(status::text,'finished')) || ' at ' || TO_CHAR(finished AT TIME ZONE 'UTC', 'YYYY-MM-DD HH24:MI:SS') || ' UTC'\n ELSE COALESCE(status::text,'unknown')\n END\n FROM last_reindex),\n 'never'\n ) AS last_reindex_status;",
+ "refId": "A"
+ },
+ {
+ "datasource": {
+ "type": "grafana-postgresql-datasource",
+ "uid": "cef5v5sl9k7i8f"
+ },
+ "editorMode": "code",
+ "format": "table",
+ "rawQuery": true,
+ "rawSql": "SELECT REGEXP_REPLACE(url, '^https?://', '') AS label, url AS value FROM realm_registry WHERE kind IN ('bootstrap', 'source') ORDER BY 1;",
+ "refId": "B"
}
],
"title": "Operator Actions",
- "type": "stat"
+ "type": "volkovlabs-form-panel"
},
{
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "cef5v5sl9k7i8f"
},
- "description": "Indexing jobs waiting for a worker across all realm-server tasks (CS-10930). Reconciles with `SELECT count(*) FROM jobs WHERE status='unfulfilled' AND job_type IN ('from-scratch-index','incremental-index')` minus those with an active reservation.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
+ "custom": {
+ "align": "left",
+ "cellOptions": {
+ "type": "auto"
+ },
+ "filterable": true,
+ "inspect": false,
+ "minWidth": 150
+ },
"mappings": [],
"thresholds": {
"mode": "absolute",
@@ -180,42 +304,99 @@
{
"color": "green"
},
- {
- "color": "yellow",
- "value": 50
- },
{
"color": "red",
- "value": 200
+ "value": 80
}
]
- },
- "unit": "short"
+ }
},
- "overrides": []
+ "overrides": [
+ {
+ "matcher": {
+ "id": "byName",
+ "options": "job_id"
+ },
+ "properties": [
+ {
+ "id": "actions",
+ "value": [
+ {
+ "confirmation": "Delete waiting job ${__value.raw}? This marks it as completed without running it.",
+ "fetch": {
+ "body": "",
+ "headers": [
+ [
+ "Authorization",
+ "Bearer ${grafana_secret}"
+ ]
+ ],
+ "method": "POST",
+ "queryParams": [
+ [
+ "job_id",
+ "${__value.raw}"
+ ]
+ ],
+ "url": "${realm_server}_grafana-complete-job"
+ },
+ "oneClick": false,
+ "title": "Delete job ${__value.raw}",
+ "type": "fetch"
+ }
+ ]
+ },
+ {
+ "id": "mappings",
+ "value": [
+ {
+ "options": {
+ "from": 0,
+ "result": {
+ "color": "red",
+ "index": 0,
+ "text": "Delete"
+ },
+ "to": 9999999999999
+ },
+ "type": "range"
+ }
+ ]
+ },
+ {
+ "id": "displayName",
+ "value": "Action"
+ },
+ {
+ "id": "custom.filterable",
+ "value": false
+ }
+ ]
+ }
+ ]
},
"gridPos": {
- "h": 4,
- "w": 8,
+ "h": 9,
+ "w": 24,
"x": 0,
- "y": 4
+ "y": 40
},
- "id": 11,
+ "id": 2,
"options": {
- "colorMode": "value",
- "graphMode": "area",
- "justifyMode": "center",
- "orientation": "auto",
- "reduceOptions": {
- "calcs": [
- "lastNotNull"
- ],
+ "cellHeight": "sm",
+ "footer": {
+ "countRows": false,
+ "enablePagination": false,
"fields": "",
- "values": false
+ "reducer": [
+ "sum"
+ ],
+ "show": false
},
- "textMode": "auto"
+ "showHeader": true,
+ "sortBy": []
},
- "pluginVersion": "12.4.3",
+ "pluginVersion": "10.4.1",
"targets": [
{
"datasource": {
@@ -225,34 +406,376 @@
"editorMode": "code",
"format": "table",
"rawQuery": true,
- "rawSql": "SELECT COUNT(*) AS pending\n FROM jobs j\n LEFT JOIN job_reservations jr ON j.id = jr.job_id\n AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.status = 'unfulfilled'\n AND j.job_type IN ('from-scratch-index','incremental-index')\n AND jr.id IS NULL;",
- "refId": "A"
+ "rawSql": "SELECT \n j.id, \n j.priority, \n j.job_type, \n CASE \n WHEN j.concurrency_group ~ '^indexing:https://[^/]+/.+' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://[^/]+/', '') \n WHEN j.concurrency_group ~ '^indexing:https://[^/]+/?$' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://', '') \n ELSE j.concurrency_group \n END AS concurrency_group, \n j.status AS status, \n j.created_at AS created_at, \n\n\n -- Wait time in seconds\n CASE \n WHEN jr.created_at IS NOT NULL \n THEN EXTRACT(EPOCH FROM (jr.created_at - j.created_at))\n ELSE \n EXTRACT(EPOCH FROM (NOW() - j.created_at))\n END\n AS wait_seconds,\n j.id as job_id\n\nFROM \n jobs j\n \nLEFT JOIN \n job_reservations jr ON j.id = jr.job_id\n\nWHERE\njr.job_id IS NULL AND j.status = 'unfulfilled' \n \nORDER BY \n j.created_at ASC\nLIMIT 500;",
+ "refId": "A",
+ "sql": {
+ "columns": [
+ {
+ "parameters": [],
+ "type": "function"
+ }
+ ],
+ "groupBy": [
+ {
+ "property": {
+ "type": "string"
+ },
+ "type": "groupBy"
+ }
+ ],
+ "limit": 50
+ }
}
],
- "title": "Pending Indexing Jobs",
- "type": "stat"
+ "title": "Waiting Jobs",
+ "type": "table"
},
{
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "cef5v5sl9k7i8f"
},
- "description": "Indexing jobs currently held by a worker (live reservation, finished_at NULL).",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
+ "custom": {
+ "align": "left",
+ "cellOptions": {
+ "type": "auto"
+ },
+ "filterable": true,
+ "inspect": false,
+ "minWidth": 150
+ },
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
- "color": "blue"
+ "color": "green"
},
{
- "color": "green",
- "value": 1
+ "color": "red",
+ "value": 80
+ }
+ ]
+ }
+ },
+ "overrides": [
+ {
+ "matcher": {
+ "id": "byName",
+ "options": "worker_id"
+ },
+ "properties": [
+ {
+ "id": "links",
+ "value": [
+ {
+ "targetBlank": true,
+ "title": "View logs",
+ "url": "/d/fetquzizsej28b?${__url_time_range}&var-job_id=${__data.fields.id}.${__data.fields.reservation_id}&orgId=1&viewPanel=3"
+ }
+ ]
+ },
+ {
+ "id": "mappings",
+ "value": [
+ {
+ "options": {
+ "pattern": "^(.{6}).*$",
+ "result": {
+ "index": 0,
+ "text": "View logs ($1)"
+ }
+ },
+ "type": "regex"
+ }
+ ]
+ }
+ ]
+ },
+ {
+ "matcher": {
+ "id": "byName",
+ "options": "reservation_id"
+ },
+ "properties": [
+ {
+ "id": "actions",
+ "value": [
+ {
+ "confirmation": "Cancel running reservation ${__value.raw}? The worker will stop processing it.",
+ "fetch": {
+ "body": "",
+ "headers": [
+ [
+ "Authorization",
+ "Bearer ${grafana_secret}"
+ ]
+ ],
+ "method": "POST",
+ "queryParams": [
+ [
+ "reservation_id",
+ "${__value.raw}"
+ ]
+ ],
+ "url": "${realm_server}_grafana-complete-job"
+ },
+ "oneClick": false,
+ "title": "Delete reservation ${__value.raw}",
+ "type": "fetch"
+ }
+ ]
+ },
+ {
+ "id": "mappings",
+ "value": [
+ {
+ "options": {
+ "from": 0,
+ "result": {
+ "color": "red",
+ "index": 0,
+ "text": "Delete"
+ },
+ "to": 9999999999999
+ },
+ "type": "range"
+ }
+ ]
+ },
+ {
+ "id": "displayName",
+ "value": "Action"
+ },
+ {
+ "id": "custom.filterable",
+ "value": false
+ }
+ ]
+ }
+ ]
+ },
+ "gridPos": {
+ "h": 11,
+ "w": 24,
+ "x": 0,
+ "y": 49
+ },
+ "id": 1,
+ "options": {
+ "cellHeight": "sm",
+ "footer": {
+ "countRows": false,
+ "enablePagination": false,
+ "fields": "",
+ "reducer": [
+ "sum"
+ ],
+ "show": false
+ },
+ "showHeader": true,
+ "sortBy": []
+ },
+ "pluginVersion": "10.4.1",
+ "targets": [
+ {
+ "datasource": {
+ "type": "grafana-postgresql-datasource",
+ "uid": "cef5v5sl9k7i8f"
+ },
+ "editorMode": "code",
+ "format": "table",
+ "rawQuery": true,
+ "rawSql": "SELECT \n j.id,\n COALESCE(jrc.attempt, 0) AS attempt, \n j.priority, \n j.job_type, \n CASE \n WHEN j.concurrency_group ~ '^indexing:https://[^/]+/.+' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://[^/]+/', '') \n WHEN j.concurrency_group ~ '^indexing:https://[^/]+/?$' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://', '') \n ELSE j.concurrency_group \n END AS concurrency_group, \n j.status AS status, \n j.created_at AS created_at, \n\n\n -- Wait time in seconds\n CASE \n WHEN jr.created_at IS NOT NULL \n THEN EXTRACT(EPOCH FROM (jr.created_at - j.created_at))\n ELSE \n EXTRACT(EPOCH FROM (NOW() - j.created_at))\n END\n AS wait_seconds,\n\n jr.created_at AS started_at, \n\n\n -- Run time in seconds\n CASE \n WHEN jr.created_at IS NOT NULL THEN\n CASE \n WHEN j.finished_at IS NOT NULL \n THEN EXTRACT(EPOCH FROM (j.finished_at - jr.created_at))\n ELSE \n EXTRACT(EPOCH FROM (NOW() - jr.created_at))\n END\n ELSE NULL\n END\n AS run_seconds\n, jr.worker_id,\n jr.id as reservation_id \n\nFROM \n jobs j\nJOIN \n job_reservations jr ON j.id = jr.job_id AND jr.completed_at IS NULL AND jr.locked_until > NOW()\nLEFT JOIN \n (SELECT job_id, COUNT(*) AS attempt FROM job_reservations GROUP BY job_id) jrc ON j.id = jrc.job_id\nWHERE j.finished_at IS NULL\nORDER BY \n jr.created_at DESC\nLIMIT 500;",
+ "refId": "A",
+ "sql": {
+ "columns": [
+ {
+ "parameters": [],
+ "type": "function"
+ }
+ ],
+ "groupBy": [
+ {
+ "property": {
+ "type": "string"
+ },
+ "type": "groupBy"
+ }
+ ],
+ "limit": 50
+ }
+ }
+ ],
+ "title": "Running Jobs",
+ "type": "table"
+ },
+ {
+ "datasource": {
+ "type": "grafana-postgresql-datasource",
+ "uid": "cef5v5sl9k7i8f"
+ },
+ "fieldConfig": {
+ "defaults": {
+ "color": {
+ "mode": "thresholds"
+ },
+ "custom": {
+ "align": "left",
+ "cellOptions": {
+ "type": "auto"
+ },
+ "filterable": true,
+ "inspect": false,
+ "minWidth": 150
+ },
+ "mappings": [],
+ "thresholds": {
+ "mode": "absolute",
+ "steps": [
+ {
+ "color": "green"
+ },
+ {
+ "color": "red",
+ "value": 80
+ }
+ ]
+ }
+ },
+ "overrides": [
+ {
+ "matcher": {
+ "id": "byName",
+ "options": "worker_id"
+ },
+ "properties": [
+ {
+ "id": "links",
+ "value": [
+ {
+ "targetBlank": true,
+ "title": "View logs",
+ "url": "/d/fetquzizsej28b?${__url_time_range}&var-job_id=${__data.fields.id}.${__data.fields.reservation_id}&orgId=1&viewPanel=3\n\n\n"
+ }
+ ]
+ },
+ {
+ "id": "mappings",
+ "value": [
+ {
+ "options": {
+ "pattern": "^(.{6}).*$",
+ "result": {
+ "index": 0,
+ "text": "View logs ($1)"
+ }
+ },
+ "type": "regex"
+ }
+ ]
+ }
+ ]
+ },
+ {
+ "matcher": {
+ "id": "byName",
+ "options": "reservation_id"
+ },
+ "properties": [
+ {
+ "id": "custom.hidden",
+ "value": true
+ }
+ ]
+ }
+ ]
+ },
+ "gridPos": {
+ "h": 18,
+ "w": 24,
+ "x": 0,
+ "y": 60
+ },
+ "id": 3,
+ "options": {
+ "cellHeight": "sm",
+ "footer": {
+ "countRows": false,
+ "enablePagination": false,
+ "fields": "",
+ "reducer": [
+ "sum"
+ ],
+ "show": false
+ },
+ "showHeader": true,
+ "sortBy": []
+ },
+ "pluginVersion": "10.4.1",
+ "targets": [
+ {
+ "datasource": {
+ "type": "grafana-postgresql-datasource",
+ "uid": "cef5v5sl9k7i8f"
+ },
+ "editorMode": "code",
+ "format": "table",
+ "rawQuery": true,
+ "rawSql": "SELECT \n j.id, \n jr.id as reservation_id, \n ROW_NUMBER() OVER (PARTITION BY j.id ORDER BY jr.created_at) AS attempt, \n j.priority, \n j.job_type, \n CASE \n WHEN j.concurrency_group ~ '^indexing:https://[^/]+/.+' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://[^/]+/', '') \n WHEN j.concurrency_group ~ '^indexing:https://[^/]+/?$' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://', '') \n ELSE j.concurrency_group \n END AS concurrency_group, \n j.status AS status, \n j.created_at AS created_at, \n\n\n -- Wait time in seconds\n CASE \n WHEN jr.created_at IS NOT NULL \n THEN EXTRACT(EPOCH FROM (jr.created_at - j.created_at))\n ELSE \n EXTRACT(EPOCH FROM (NOW() - j.created_at))\n END\n AS wait_seconds,\n\n jr.created_at AS started_at, \n\n\n -- Run time in seconds\n CASE \n WHEN jr.created_at IS NOT NULL THEN\n CASE \n WHEN j.finished_at IS NOT NULL \n THEN EXTRACT(EPOCH FROM (j.finished_at - jr.created_at))\n ELSE \n EXTRACT(EPOCH FROM (NOW() - jr.created_at))\n END\n ELSE NULL\n END\n AS run_seconds,\n j.finished_at AS finished_at, \n jr.worker_id\n\nFROM \n jobs j\nLEFT JOIN \n job_reservations jr ON j.id = jr.job_id\nWHERE j.finished_at IS NOT NULL\nORDER BY \n j.finished_at DESC\nLIMIT 500;",
+ "refId": "A",
+ "sql": {
+ "columns": [
+ {
+ "parameters": [],
+ "type": "function"
+ }
+ ],
+ "groupBy": [
+ {
+ "property": {
+ "type": "string"
+ },
+ "type": "groupBy"
+ }
+ ],
+ "limit": 50
+ }
+ }
+ ],
+ "title": "Finished Jobs (limit 500)",
+ "type": "table"
+ },
+ {
+ "datasource": {
+ "type": "grafana-postgresql-datasource",
+ "uid": "cef5v5sl9k7i8f"
+ },
+ "description": "Indexing jobs waiting for a worker across all realm-server tasks (CS-10930). Reconciles with `SELECT count(*) FROM jobs WHERE status='unfulfilled' AND job_type IN ('from-scratch-index','incremental-index')` minus those with an active reservation.",
+ "fieldConfig": {
+ "defaults": {
+ "color": {
+ "mode": "thresholds"
+ },
+ "mappings": [],
+ "thresholds": {
+ "mode": "absolute",
+ "steps": [
+ {
+ "color": "green"
+ },
+ {
+ "color": "yellow",
+ "value": 50
+ },
+ {
+ "color": "red",
+ "value": 200
}
]
},
@@ -263,10 +786,10 @@
"gridPos": {
"h": 4,
"w": 8,
- "x": 8,
+ "x": 0,
"y": 4
},
- "id": 12,
+ "id": 11,
"options": {
"colorMode": "value",
"graphMode": "area",
@@ -291,11 +814,11 @@
"editorMode": "code",
"format": "table",
"rawQuery": true,
- "rawSql": "SELECT COUNT(*) AS in_flight\n FROM jobs j\n JOIN job_reservations jr ON j.id = jr.job_id\n AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.finished_at IS NULL\n AND j.job_type IN ('from-scratch-index','incremental-index');",
+ "rawSql": "SELECT COUNT(*) AS pending\n FROM jobs j\n LEFT JOIN job_reservations jr ON j.id = jr.job_id\n AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.status = 'unfulfilled'\n AND j.job_type IN ('from-scratch-index','incremental-index')\n AND jr.id IS NULL;",
"refId": "A"
}
],
- "title": "In-flight Indexing Jobs",
+ "title": "Pending Indexing Jobs",
"type": "stat"
},
{
@@ -303,7 +826,7 @@
"type": "grafana-postgresql-datasource",
"uid": "cef5v5sl9k7i8f"
},
- "description": "Age of the oldest pending indexing job. Red after 5 minutes — sustained values here usually mean workers are saturated or stuck.",
+ "description": "Indexing jobs currently held by a worker (live reservation, finished_at NULL).",
"fieldConfig": {
"defaults": {
"color": {
@@ -314,29 +837,25 @@
"mode": "absolute",
"steps": [
{
- "color": "green"
- },
- {
- "color": "yellow",
- "value": 60
+ "color": "blue"
},
{
- "color": "red",
- "value": 300
+ "color": "green",
+ "value": 1
}
]
},
- "unit": "s"
+ "unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 8,
- "x": 16,
+ "x": 8,
"y": 4
},
- "id": 13,
+ "id": 12,
"options": {
"colorMode": "value",
"graphMode": "area",
@@ -361,11 +880,11 @@
"editorMode": "code",
"format": "table",
"rawQuery": true,
- "rawSql": "SELECT EXTRACT(EPOCH FROM (NOW() - MIN(j.created_at))) AS oldest_pending_seconds\n FROM jobs j\n LEFT JOIN job_reservations jr ON j.id = jr.job_id\n AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.status = 'unfulfilled'\n AND j.job_type IN ('from-scratch-index','incremental-index')\n AND jr.id IS NULL;",
+ "rawSql": "SELECT COUNT(*) AS in_flight\n FROM jobs j\n JOIN job_reservations jr ON j.id = jr.job_id\n AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.finished_at IS NULL\n AND j.job_type IN ('from-scratch-index','incremental-index');",
"refId": "A"
}
],
- "title": "Oldest Pending",
+ "title": "In-flight Indexing Jobs",
"type": "stat"
},
{
@@ -373,57 +892,53 @@
"type": "grafana-postgresql-datasource",
"uid": "cef5v5sl9k7i8f"
},
- "description": "Indexing-job flow rates. Three series — `arrived` (jobs queued), `started` (a worker reserved them), `completed` (jobs.finished_at set). Bucketed by 30s. A persistent gap between arrived and completed indicates worker saturation.",
+ "description": "Age of the oldest pending indexing job. Red after 5 minutes — sustained values here usually mean workers are saturated or stuck.",
"fieldConfig": {
"defaults": {
"color": {
- "mode": "palette-classic"
- },
- "custom": {
- "axisGridShow": true,
- "axisLabel": "jobs / 30s",
- "axisPlacement": "auto",
- "drawStyle": "line",
- "fillOpacity": 10,
- "gradientMode": "none",
- "lineInterpolation": "smooth",
- "lineWidth": 1,
- "pointSize": 4,
- "scaleDistribution": {
- "type": "linear"
- },
- "showPoints": "never",
- "spanNulls": true,
- "stacking": {
- "mode": "none"
- }
+ "mode": "thresholds"
},
"mappings": [],
- "unit": "short"
+ "thresholds": {
+ "mode": "absolute",
+ "steps": [
+ {
+ "color": "green"
+ },
+ {
+ "color": "yellow",
+ "value": 60
+ },
+ {
+ "color": "red",
+ "value": 300
+ }
+ ]
+ },
+ "unit": "s"
},
"overrides": []
},
"gridPos": {
- "h": 8,
- "w": 24,
- "x": 0,
- "y": 8
+ "h": 4,
+ "w": 8,
+ "x": 16,
+ "y": 4
},
- "id": 14,
+ "id": 13,
"options": {
- "legend": {
+ "colorMode": "value",
+ "graphMode": "area",
+ "justifyMode": "center",
+ "orientation": "auto",
+ "reduceOptions": {
"calcs": [
- "mean",
"lastNotNull"
],
- "displayMode": "list",
- "placement": "bottom",
- "showLegend": true
+ "fields": "",
+ "values": false
},
- "tooltip": {
- "mode": "multi",
- "sort": "none"
- }
+ "textMode": "auto"
},
"pluginVersion": "12.4.3",
"targets": [
@@ -433,43 +948,21 @@
"uid": "cef5v5sl9k7i8f"
},
"editorMode": "code",
- "format": "time_series",
+ "format": "table",
"rawQuery": true,
- "rawSql": "SELECT $__timeGroupAlias(j.created_at, '30s') AS time,\n COUNT(*) AS arrived\n FROM jobs j\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n AND $__timeFilter(j.created_at)\n GROUP BY 1\n ORDER BY 1;",
+ "rawSql": "SELECT EXTRACT(EPOCH FROM (NOW() - MIN(j.created_at))) AS oldest_pending_seconds\n FROM jobs j\n LEFT JOIN job_reservations jr ON j.id = jr.job_id\n AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.status = 'unfulfilled'\n AND j.job_type IN ('from-scratch-index','incremental-index')\n AND jr.id IS NULL;",
"refId": "A"
- },
- {
- "datasource": {
- "type": "grafana-postgresql-datasource",
- "uid": "cef5v5sl9k7i8f"
- },
- "editorMode": "code",
- "format": "time_series",
- "rawQuery": true,
- "rawSql": "SELECT $__timeGroupAlias(jr.created_at, '30s') AS time,\n COUNT(*) AS started\n FROM job_reservations jr\n JOIN jobs j ON j.id = jr.job_id\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n AND $__timeFilter(jr.created_at)\n GROUP BY 1\n ORDER BY 1;",
- "refId": "B"
- },
- {
- "datasource": {
- "type": "grafana-postgresql-datasource",
- "uid": "cef5v5sl9k7i8f"
- },
- "editorMode": "code",
- "format": "time_series",
- "rawQuery": true,
- "rawSql": "SELECT $__timeGroupAlias(j.finished_at, '30s') AS time,\n COUNT(*) AS completed\n FROM jobs j\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n AND j.finished_at IS NOT NULL\n AND $__timeFilter(j.finished_at)\n GROUP BY 1\n ORDER BY 1;",
- "refId": "C"
}
],
- "title": "Indexing throughput (arrived / started / completed)",
- "type": "timeseries"
+ "title": "Oldest Pending",
+ "type": "stat"
},
{
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "cef5v5sl9k7i8f"
},
- "description": "Stocks of indexing jobs over time. `pending` = queued, no live reservation; `in_flight` = a worker has an open reservation; `completed` = cumulative finished. Bucketed at 1 minute over the panel's time range. Cost: O(buckets × indexing-jobs in window) per refresh — if this gets slow, switch to a snapshot/materialized table.",
+ "description": "Indexing-job flow rates. Three series — `arrived` (jobs queued), `started` (a worker reserved them), `completed` (jobs.finished_at set). Bucketed by 30s. A persistent gap between arrived and completed indicates worker saturation.",
"fieldConfig": {
"defaults": {
"color": {
@@ -477,7 +970,7 @@
},
"custom": {
"axisGridShow": true,
- "axisLabel": "jobs",
+ "axisLabel": "jobs / 30s",
"axisPlacement": "auto",
"drawStyle": "line",
"fillOpacity": 10,
@@ -503,13 +996,13 @@
"h": 8,
"w": 24,
"x": 0,
- "y": 16
+ "y": 8
},
- "id": 15,
+ "id": 14,
"options": {
"legend": {
"calcs": [
- "max",
+ "mean",
"lastNotNull"
],
"displayMode": "list",
@@ -531,316 +1024,91 @@
"editorMode": "code",
"format": "time_series",
"rawQuery": true,
- "rawSql": "WITH buckets AS (\n SELECT generate_series($__timeFrom()::timestamptz, $__timeTo()::timestamptz, '1 minute') AS bucket\n),\nindexing_jobs AS (\n SELECT j.id, j.created_at, j.finished_at,\n (SELECT MIN(jr.created_at) FROM job_reservations jr WHERE jr.job_id = j.id) AS first_started_at\n FROM jobs j\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n AND j.created_at <= $__timeTo()::timestamptz\n)\nSELECT b.bucket AS time,\n COUNT(*) FILTER (WHERE ij.created_at <= b.bucket\n AND (ij.first_started_at IS NULL OR ij.first_started_at > b.bucket)\n AND (ij.finished_at IS NULL OR ij.finished_at > b.bucket)) AS pending,\n COUNT(*) FILTER (WHERE ij.first_started_at IS NOT NULL\n AND ij.first_started_at <= b.bucket\n AND (ij.finished_at IS NULL OR ij.finished_at > b.bucket)) AS in_flight,\n COUNT(*) FILTER (WHERE ij.finished_at IS NOT NULL AND ij.finished_at <= b.bucket) AS completed\n FROM buckets b LEFT JOIN indexing_jobs ij ON TRUE\n GROUP BY b.bucket\n ORDER BY b.bucket;",
+ "rawSql": "SELECT $__timeGroupAlias(j.created_at, '30s') AS time,\n COUNT(*) AS arrived\n FROM jobs j\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n AND $__timeFilter(j.created_at)\n GROUP BY 1\n ORDER BY 1;",
"refId": "A"
- }
- ],
- "title": "Pending vs in-flight vs completed (over time)",
- "type": "timeseries"
- },
- {
- "datasource": {
- "type": "grafana-postgresql-datasource",
- "uid": "cef5v5sl9k7i8f"
- },
- "description": "One row per indexing job currently held by a worker (across all realm-server / worker tasks). `progress` and `current files` come from `job_progress` (CS-10930), populated by each realm-server's IndexingEventSink write-through. Click the realm cell to drill into the live activity feed.",
- "fieldConfig": {
- "defaults": {
- "color": {
- "mode": "thresholds"
- },
- "custom": {
- "align": "left",
- "cellOptions": {
- "type": "auto"
- },
- "filterable": true,
- "inspect": false,
- "minWidth": 100
- },
- "mappings": [],
- "thresholds": {
- "mode": "absolute",
- "steps": [
- {
- "color": "green"
- },
- {
- "color": "red",
- "value": 80
- }
- ]
- }
},
- "overrides": [
- {
- "matcher": {
- "id": "byName",
- "options": "percent"
- },
- "properties": [
- {
- "id": "custom.cellOptions",
- "value": {
- "mode": "gradient",
- "type": "gauge"
- }
- },
- {
- "id": "unit",
- "value": "percent"
- },
- {
- "id": "min",
- "value": 0
- },
- {
- "id": "max",
- "value": 100
- }
- ]
- },
- {
- "matcher": {
- "id": "byName",
- "options": "elapsed_seconds"
- },
- "properties": [
- {
- "id": "unit",
- "value": "s"
- }
- ]
- },
- {
- "matcher": {
- "id": "byName",
- "options": "realm_url"
- },
- "properties": [
- {
- "id": "custom.hidden",
- "value": true
- }
- ]
- },
- {
- "matcher": {
- "id": "byName",
- "options": "reservation_id"
- },
- "properties": [
- {
- "id": "custom.hidden",
- "value": true
- }
- ]
- },
- {
- "matcher": {
- "id": "byName",
- "options": "realm"
- },
- "properties": [
- {
- "id": "links",
- "value": [
- {
- "targetBlank": true,
- "title": "View activity feed",
- "url": "/d/fetquzizsej28b?${__url_time_range}&var-realm_url=${__data.fields.realm_url:queryparam}&var-job_id=${__data.fields.job_id}.${__data.fields.reservation_id}&orgId=1&viewPanel=11"
- }
- ]
- }
- ]
+ {
+ "datasource": {
+ "type": "grafana-postgresql-datasource",
+ "uid": "cef5v5sl9k7i8f"
},
- {
- "matcher": {
- "id": "byName",
- "options": "worker_id"
- },
- "properties": [
- {
- "id": "links",
- "value": [
- {
- "targetBlank": true,
- "title": "View logs",
- "url": "/d/fetquzizsej28b?${__url_time_range}&var-job_id=${__data.fields.job_id}.${__data.fields.reservation_id}&orgId=1&viewPanel=3"
- }
- ]
- },
- {
- "id": "mappings",
- "value": [
- {
- "options": {
- "pattern": "^(.{6}).*$",
- "result": {
- "index": 0,
- "text": "View logs ($1)"
- }
- },
- "type": "regex"
- }
- ]
- }
- ]
- }
- ]
- },
- "gridPos": {
- "h": 10,
- "w": 24,
- "x": 0,
- "y": 24
- },
- "id": 16,
- "options": {
- "cellHeight": "sm",
- "footer": {
- "countRows": false,
- "enablePagination": false,
- "fields": "",
- "reducer": [
- "sum"
- ],
- "show": false
+ "editorMode": "code",
+ "format": "time_series",
+ "rawQuery": true,
+ "rawSql": "SELECT $__timeGroupAlias(jr.created_at, '30s') AS time,\n COUNT(*) AS started\n FROM job_reservations jr\n JOIN jobs j ON j.id = jr.job_id\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n AND $__timeFilter(jr.created_at)\n GROUP BY 1\n ORDER BY 1;",
+ "refId": "B"
},
- "showHeader": true,
- "sortBy": []
- },
- "pluginVersion": "12.4.3",
- "targets": [
{
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "cef5v5sl9k7i8f"
},
"editorMode": "code",
- "format": "table",
+ "format": "time_series",
"rawQuery": true,
- "rawSql": "SELECT\n j.id AS job_id,\n RTRIM(REGEXP_REPLACE(j.concurrency_group, '^indexing:https?://[^/]+/', ''), '/') AS realm,\n COALESCE(j.args->>'realmURL','') AS realm_url,\n j.job_type,\n COALESCE(jp.files_completed, 0) AS files_completed,\n COALESCE(jp.total_files, 0) AS total_files,\n CASE WHEN COALESCE(jp.total_files, 0) > 0\n THEN (jp.files_completed::float / jp.total_files) * 100\n ELSE 0\n END AS percent,\n EXTRACT(EPOCH FROM (NOW() - jr.created_at)) AS elapsed_seconds,\n jr.created_at AS started_at,\n jr.worker_id,\n jr.id AS reservation_id\n FROM jobs j\n JOIN job_reservations jr ON jr.job_id = j.id\n AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n LEFT JOIN job_progress jp ON jp.job_id = j.id\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n AND j.finished_at IS NULL\n ORDER BY jr.created_at DESC;",
- "refId": "A"
+ "rawSql": "SELECT $__timeGroupAlias(j.finished_at, '30s') AS time,\n COUNT(*) AS completed\n FROM jobs j\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n AND j.finished_at IS NOT NULL\n AND $__timeFilter(j.finished_at)\n GROUP BY 1\n ORDER BY 1;",
+ "refId": "C"
}
],
- "title": "Active Indexing",
- "type": "table"
+ "title": "Indexing throughput (arrived / started / completed)",
+ "type": "timeseries"
},
{
"datasource": {
"type": "grafana-postgresql-datasource",
"uid": "cef5v5sl9k7i8f"
},
- "description": "Per-realm aggregate of indexing-job state. `oldest_pending_seconds` red after 5 min flags realms whose backlog isn't draining.",
+ "description": "Stocks of indexing jobs over time. `pending` = queued, no live reservation; `in_flight` = a worker has an open reservation; `completed` = cumulative finished. Bucketed at 1 minute over the panel's time range. Cost: O(buckets × indexing-jobs in window) per refresh — if this gets slow, switch to a snapshot/materialized table.",
"fieldConfig": {
"defaults": {
"color": {
- "mode": "thresholds"
+ "mode": "palette-classic"
},
"custom": {
- "align": "left",
- "cellOptions": {
- "type": "auto"
+ "axisGridShow": true,
+ "axisLabel": "jobs",
+ "axisPlacement": "auto",
+ "drawStyle": "line",
+ "fillOpacity": 10,
+ "gradientMode": "none",
+ "lineInterpolation": "smooth",
+ "lineWidth": 1,
+ "pointSize": 4,
+ "scaleDistribution": {
+ "type": "linear"
},
- "filterable": true,
- "inspect": false,
- "minWidth": 100
+ "showPoints": "never",
+ "spanNulls": true,
+ "stacking": {
+ "mode": "none"
+ }
},
"mappings": [],
- "thresholds": {
- "mode": "absolute",
- "steps": [
- {
- "color": "green"
- }
- ]
- }
+ "unit": "short"
},
- "overrides": [
- {
- "matcher": {
- "id": "byName",
- "options": "realm_url"
- },
- "properties": [
- {
- "id": "custom.hidden",
- "value": true
- }
- ]
- },
- {
- "matcher": {
- "id": "byName",
- "options": "oldest_pending_seconds"
- },
- "properties": [
- {
- "id": "unit",
- "value": "s"
- },
- {
- "id": "thresholds",
- "value": {
- "mode": "absolute",
- "steps": [
- {
- "color": "green"
- },
- {
- "color": "yellow",
- "value": 60
- },
- {
- "color": "red",
- "value": 300
- }
- ]
- }
- }
- ]
- },
- {
- "matcher": {
- "id": "byName",
- "options": "realm"
- },
- "properties": [
- {
- "id": "links",
- "value": [
- {
- "targetBlank": true,
- "title": "View activity feed",
- "url": "/d/fetquzizsej28b?${__url_time_range}&var-realm_url=${__data.fie |
Builds on CS-10929 (which replaced the toolbar-link placeholders with
Grafana-native `actions[]` button panels). Native actions can fire a
POST with a confirmation, but they can't show live blast radius in the
confirm message, can't disable themselves on data-driven conditions,
and can't render last-run state inline. This PR moves the two
operator-action panels (boxel-jobs / user-credits) to the Volkov Labs
Business Forms panel (`volkovlabs-form-panel`, signed) and uses its
`disableIf` / `customCode` / `disabled`-element machinery to deliver
the richer UX. The realm-permissions Grant Permission panel is left
as-is — no backend endpoint yet, deliberately deferred.
Boxel Jobs > Operator Actions
- One row per stat: selected realm, pending (this realm), in-flight
(this realm), in-flight (all realms), oldest pending, last reindex
status. Pulled live every refresh from `jobs` / `job_reservations`
/ `job_progress` (the same tables the rest of the dashboard reads).
- "Reindex ${full_index_realm}" button: `disableIf` returns true when
the realm has any in-flight indexing job. Tooltip explains why.
- "Reindex ALL realms" button: `disableIf` while any indexing job is
in flight server-wide; rendered as `destructive` variant.
- Each `customCode` runs a precondition check, browser-confirms with
the live counts in the message, fetches the existing
`_grafana-reindex` / `_grafana-full-reindex` endpoints with the
same `Authorization: Bearer ${grafana_secret}` header CS-10929
introduced, and dispatches notifySuccess/notifyError + a dashboard
refresh on completion.
User Credits > Add Extra Credit
- Read-only display of remaining plan allowance, remaining extra
credit, last extra-credit grant.
- Three preset buttons (+1,000 / +10,000 / +100,000) plus a custom-
amount number input + "Add custom amount" button. Custom button
`disableIf` while the amount is ≤ 0.
- All four buttons POST `/_grafana-add-credit?user=${matrix_user_id}
&credit=<n>` with the bearer header.
Per-row Delete buttons in the Waiting Jobs / Running Jobs tables stay
on the native actions[] mechanism — no behavioral gap there, no
benefit from migrating them.
Local Grafana picks up the plugin via `GF_INSTALL_PLUGINS:
volkovlabs-form-panel` added to `packages/observability/docker-compose.yml`.
Staging and production self-host Grafana need the matching install via
cardstack/infra Terraform — flagged in the PR description.
Plugin caveat
- The Business Forms repo was archived 2025-09-26 at v6.2.0. It
remains signed and functional on Grafana 12.x but is no longer
receiving updates. Pinned plugin version: 6.2.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nding Two follow-ups discovered while live-testing the panel against a running Grafana + boxel-pg: 1. `COALESCE(status, 'finished' | 'unknown')` blew up with `invalid input value for enum job_statuses` because `jobs.status` is an enum and the literal gets coerced to it. Cast to text first. 2. The Volkov Forms plugin runs its own datasource query for `initial` binding and ignores the panel's `targets[]` for that purpose. The disabled/textbox elements only populate when the SQL lives in `options.initial.payload.rawSql`, with `payloadMode: "custom"`. With that wired up, all six "Current state" fields (selected realm, pending, in-flight per-realm + global, oldest pending, last reindex status) populate live on every refresh as designed. Verified visually. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
UX follow-up after live-testing the panel against running Grafana. Five
discrete improvements; one minor bug fix.
Bug fix
- Initial form binding switched from `datasource` mode to `query` mode.
In datasource mode the plugin runs its own SQL once on panel mount and
never re-runs on variable change, so changing the realm picker left
the "Current state" fields stale (showing the previous realm's data).
Query mode binds elements from `data.series` (the panel's own
targets[]), which Grafana refreshes on every variable change. As a
side effect, the redundant SQL from `initial.payload` is dropped —
the panel's `targets[]` is the single source of truth.
UX
- Picker moved into the panel. The dashboard-level `Realm to Full Index`
variable is hidden (`hide: 2`) and a Volkov select element at the top
of the form replaces it. `elementValueChanged` writes the picked value
back to the URL via `locationService.partial({...}, true)`, so links
to this dashboard with `?var-full_index_realm=…` still preselect a
realm and the variable still drives panel SQL substitution.
- Picker source switched from `boxel_index` (only realms with at least
one indexed card — 5 entries locally) to `realm_registry` filtered
to `kind IN ('bootstrap','source')` (system mounts + user realms —
17 entries locally). Published snapshots excluded since they aren't
typically reindexed. The hidden dashboard variable's definition was
updated in lockstep.
- Realm cell is now a real select rather than a "Selected realm:"
read-only echo of the picker.
- Button labels use `${full_index_realm:text}` so they read e.g.
"Reindex skills/" rather than the full URL.
- Element labels widened (`labelWidth: 22 → 28`) to stop mid-word
wrapping ("Pending jobs (this realm)" was breaking on "(this");
redundant button-element titles ("Reindex selected realm",
"Reindex all realms") removed since the buttons label themselves.
- Last-reindex string switched from `UPPER(status)` to `INITCAP(status)`
so finished jobs read "Resolved at …" rather than the harsh-looking
all-caps "RESOLVED at …". `'none'` for empty oldest-pending replaced
with em dash.
- Panel grid height trimmed from 14 → 11 rows; with the form layout
laid out properly the previous height left a strip of dead space.
Disable rules
- "Reindex ALL realms" disable rule narrowed: was disabled while ANY
indexing job was in flight server-wide (overly cautious — every
per-realm reindex blocked the full reindex trigger). Now disabled
only while a `full-reindex` orchestration job itself is pending or
running (`COUNT FROM jobs WHERE job_type='full-reindex' AND
finished_at IS NULL > 0`). The orchestration job lifecycle is short
— it fans out to per-realm jobs and finishes — so the button
unblocks quickly while the actual reindex work continues.
- "Reindex selected realm" disable rule unchanged (still blocks while
this realm has an in-flight indexing job).
SQL
- Targeted minor fix: `COALESCE(status::text, 'finished'|'unknown')`
with explicit `::text` cast. Without it the literal gets coerced to
the `job_statuses` enum and the query errors with `invalid input
value for enum job_statuses`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1cb24af to
51a21d8
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Builds on PR #4663 / CS-10929 (which replaced the toolbar-link placeholders with Grafana-native
actions[]button panels — auth + confirmations already in place). Native actions can't show live data in the confirmation, can't disable themselves on data-driven conditions, and can't render last-run state inline. This PR moves the two relevant operator-action panels to the Business Forms panel (volkovlabs-form-panel) and uses itsdisableIf/customCode/disabled-element machinery to deliver the richer UX.Per the redirect in scoping: no new endpoints, no audit table, no
triggered_byattribution, and therealm-permissionsGrant Permission panel is left as-is (no backend yet — deliberately deferred).What you get
Boxel Jobs › Operator Actions
jobs/job_reservations/job_progress— the same tables the rest of the dashboard already reads.Reindex ${full_index_realm}button:disableIfwhile this realm has an in-flight indexing job, with a tooltip explaining why.Reindex ALL realmsbutton:disableIfwhile any indexing job is in flight server-wide;destructivevariant styling.customCodeprecondition-checks, then browser-confirms with the live counts in the message ("Reindex http://…/experiments/? Blast radius: pending 12, oldest pending 02:14:33"), POSTs the existing_grafana-reindex/_grafana-full-reindexendpoints with the sameAuthorization: Bearer ${grafana_secret}header CS-10929 introduced, and notifies + refreshes on completion.User Credits › Add Extra Credit
+1,000,+10,000,+100,000) plus a custom-amount number input +Add custom amountbutton. Custom buttondisableIfwhile amount ≤ 0._grafana-add-credit?user=${matrix_user_id}&credit=<n>with the bearer header.Out of scope (unchanged): per-row Delete actions in the Waiting Jobs / Running Jobs tables stay on the native
actions[]mechanism — no UX gap there, no benefit from migrating them.Plugin install
GF_INSTALL_PLUGINS: volkovlabs-form-paneladded topackages/observability/docker-compose.yml.docker compose down && docker compose up -dafter pulling will install it.Plugin maintenance
Originally a Volkov Labs plugin; as of early May 2026 the Business Suite (Forms / Text / Charts / Calendar / Table / Variable / Input / Media / News / Links) has been picked up and is now maintained by Grafana Labs at
github.com/grafana/business-forms(and siblings). The plugin id is preserved (volkovlabs-form-panel) so the dashboard JSON in this PR is forward-compatible — no migration needed when Grafana cuts a release. Recent commit history on the new repo shows active engineering (security fixes, CI/CD standardization, release prep). The earlier "archived upstream" caveat from when this PR was first drafted no longer applies.Why draft
docker compose up -dand click through both panels (idle realm vs. in-flight realm; positive vs. zero credit input). I've verified live with screenshots in the discussion thread; reviewer should still poke at it.Test plan
cd packages/observability && docker compose up -d, open Grafana at :3001, confirm the plugin loads with no console errors and both Operator Actions panels render with live blast radius.0in custom amount, confirm "Add custom amount" disables.bash packages/observability/scripts/check-no-secrets.shclean.🤖 Generated with Claude Code