feat(mcp): rewrite dashboard authoring prompts; expose filters on save tool by alex-fedotyev · Pull Request #2264 · hyperdxio/hyperdx

alex-fedotyev · 2026-05-12T22:28:19Z

Summary

I rewrote the three MCP dashboard authoring prompts and added filters to hyperdx_save_dashboard's input schema. The goal: when an agent calls the MCP server to build a dashboard, it picks the right pattern by intent, follows the renderer's actual constraints, and produces a JSON shape that renders correctly on the first try.

What changed in the prompts:

create_dashboard now leads with a ten-rule design checklist (RED columns with aliases, per-series numberFormat for durations, groupByColumnsOnLeft for inventory tables, dashboard-level filters instead of per-tile where literals, one-metric-per-tile for metric sources, containers and tabs for grouping). The wall-of-JSON canonical example is gone; the four dashboard_examples patterns carry the concrete shapes. The prompt is shorter and the rules are easier for the model to scan.
dashboard_examples is now four verified patterns (service_inventory, service_detail, log_analytics, backend_dependencies) plus the existing infrastructure_sql. Each non-SQL example leads with a "When to use" header and a "Why this shape" note so the model picks by intent, not by surface keyword match. I built and rendered every example on a live dev stack before landing the prompt; that surfaced two design issues now captured as rules:
- chart-level numberFormat on a table that mixes counts and durations formats every value as a duration. Per-series numberFormat is the fix.
- metric tiles take exactly one select item. The renderer destructures select[0] and ignores the rest; multi-metric authoring needs one tile per metric.
query_guide gains a DASHBOARD FILTERS section that documents the filters: [{ type, name, expression, sourceId, where?, whereLanguage? }] shape, a NUMBER FORMAT section that explains the per-series vs. chart-level distinction, and a PER-TILE TYPE CONSTRAINTS note about the metric one-select rule.

What changed in hyperdx_save_dashboard:

New filters input field on the tool's inputSchema. Reuses externalDashboardFilterSchemaWithId from @/utils/zod so the MCP and REST surfaces stay in lockstep and the existing convertExternalFiltersToInternal helper handles the conversion without any translation layer. Same body schema (createDashboardBodySchema / updateDashboardBodySchema) that the v2 REST handler already uses.

Voice pass: every prompt string is now em-dash-free, with a snapshot test guarding regressions.

Drill-down behavior (onClick from a service-inventory row into the service-detail dashboard) is a separate follow-up PR; landing the prompt rewrite first so review surface stays focused.

Test plan

yarn workspace @hyperdx/api jest mcp/__tests__/dashboards (51 passed, 0 failed)
yarn workspace @hyperdx/api ci:lint (lint + tsc + openapi lint, clean)
Filter round-trip: save dashboard with two filters via MCP, get back, update with one renamed filter, fetch again, assert shape
Filter rejection: save with bogus sourceId returns 4xx
Backward compat: save dashboard without filters returns filters: []
Snapshot tests: design checklist present, four patterns listed, each example carries "When to use", zero em-dashes in any prompt string
Built every example dashboard on a live dev stack (slot 10) and confirmed every tile renders with realistic synthetic data: 90K traces across 8 services with realistic StatusMessage values, 17K logs across 8 services with severity distribution
Em-dash-free prose verified by prose-lint.py on every changed file

…ilters on save tool The `create_dashboard` prompt now leads with a ten-rule design checklist (RED columns with aliases, per-series numberFormat for durations, groupByColumnsOnLeft for inventory tables, dashboard-level filters instead of per-tile where literals, one-metric-per-tile for metric sources, containers and tabs for grouping). The wall-of-JSON canonical example is gone; the four dashboard_examples patterns carry the concrete shapes. The dashboard_examples set is replaced with four verified patterns (service_inventory, service_detail, log_analytics, backend_dependencies) plus the existing infrastructure_sql. Each non-SQL example leads with a "When to use" header and a "Why this shape" note so the model picks by intent, not by surface keyword match. Examples were built and rendered on a live dev stack before landing, which surfaced two design issues now captured as rules in the prompt: - chart-level numberFormat on a table that mixes counts and durations formats every value as a duration; per-series numberFormat is the fix. - metric tiles take exactly one select item (renderer destructures select[0] and ignores the rest). The query_guide prompt gains a DASHBOARD FILTERS section that documents the filters: [{ type, name, expression, sourceId, where?, whereLanguage? }] shape, a NUMBER FORMAT section that explains the per-series vs. chart-level distinction, and a PER-TILE TYPE CONSTRAINTS note about the metric one-select rule. hyperdx_save_dashboard now accepts `filters` on its input schema, reusing externalDashboardFilterSchemaWithId so the MCP and REST surfaces stay in lockstep and the existing convertExternalFiltersToInternal helper handles the conversion without translation. Filters round-trip through create, get, and update. Voice pass: every prompt string is now em-dash-free, with a snapshot test guarding regressions. Drill-down behavior (onClick from a service inventory row into the service detail dashboard) is a separate follow-up PR; landing the prompt rewrite first.

changeset-bot · 2026-05-12T22:28:23Z

🦋 Changeset detected

Latest commit: cd37010

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages

Name	Type
@hyperdx/api	Patch
@hyperdx/app	Patch
@hyperdx/otel-collector	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

vercel · 2026-05-12T22:28:24Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
hyperdx-oss	Ready	Preview, Comment	May 12, 2026 11:36pm

github-actions · 2026-05-12T22:28:50Z

🟡 Tier 3 — Standard

Introduces new logic, modifies core functionality, or touches areas with non-trivial risk.

Why this tier:

Diff size: 988 production lines changed (Tier 2 max: < 250)

Review process: Full human review — logic, architecture, edge cases.
SLA: First-pass feedback within 1 business day.

Stats

Production files changed: 5
Production lines changed: 988 (+ 474 in test files, excluded from tier calculation)
Branch: alex/HDX-mcp-prompts-overhaul
Author: alex-fedotyev

To override this classification, remove the review/tier-3 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

github-actions · 2026-05-12T22:31:11Z

PR Review

✅ No critical issues found.

Nice surface coverage: the create/update filter-id normalization (stripFilterIds for create, assignFilterIds for update) is the right shape to bridge the mcpFiltersParam.partial({ id: true }) input schema with the stricter createDashboardBodySchema / updateDashboardBodySchema, and both branches are exercised by tests. getMissingSources already takes filters, so source-existence is gated correctly. The having clause on mcpTableTileSchema was a real silent-strip risk in the prior round — good catch and matching round-trip test.

Minor (non-blocking) observations:

assignFilterIds populates an id for brand-new filters, but convertExternalFiltersToInternal will then immediately generate yet another fresh id (since the assigned id is not in existingFilterIds). Harmless waste, but the assigned id is never the persisted one — fine to leave.
mcpFiltersParam uses .partial({ id: true }) over a .strict() schema. Zod preserves strictness through .partial, so unknown keys still reject — confirmed safe.
Prompt voice and em-dash guards look thorough; buildSourceSummary regression coverage is welcome.

github-actions · 2026-05-12T22:37:37Z

E2E Test Results

✅ All tests passed • 176 passed • 3 skipped • 1176s

Status	Count
✅ Passed	176
❌ Failed	0
⚠️ Flaky	3
⏭️ Skipped	3

Tests ran across 4 shards in parallel.

View full report →

The buildSourceSummary helper is the source-list block that buildCreateDashboardPrompt prepends to the prompt body. It still carried em-dashes from before the voice pass, so the assembled `create_dashboard` prompt that the MCP transport returns to a client contained eight em-dashes even though every builder in content.ts is clean. Caught by an end-to-end check that fetched the prompt through the live MCP transport and counted em-dashes in the response payload.

github-actions · 2026-05-12T22:48:30Z

Deep Review

🟡 P2 -- recommended

packages/api/src/mcp/__tests__/dashboards.test.ts:1448 -- The new filter test block exercises round-trip and "add new filter on update," but never asserts the contract for omitting filters on update (preserve) or sending filters: [] on update (wipe), and updateDashboard at saveDashboard.ts:453 silently gates that behavior on filters !== undefined.
- Fix: Add two update-path tests against a seeded dashboard -- one that omits filters and asserts the persisted filters are unchanged, one that sends filters: [] and asserts the persisted array is empty.
- _{ce-testing-reviewer, ce-api-contract-reviewer}

🔵 P3 nitpicks (8)

packages/api/src/mcp/tools/dashboards/saveDashboard.ts:124 -- On the update path, an LLM that resends an existing filter without its id triggers assignFilterIds to mint a fresh ObjectId, which convertExternalFiltersToInternal then mints over again, so the persisted filter ends up with a brand-new id and the prior filter id is silently lost; no consumer keys persistent state by filter id today, so the failure is latent rather than active.
- Fix: Either document this "no id means new filter" semantic on mcpFiltersParam and add a dedicated test, or change assignFilterIds to fall back to content-matching against existingFilterIds so a re-sent filter without an id preserves the original id.
- _{ce-correctness-reviewer, ce-maintainability-reviewer, ce-api-contract-reviewer}
packages/api/src/mcp/tools/dashboards/schemas.ts:468 -- mcpFiltersParam's description lists the filter shape as { type, name, expression, sourceId, where?, whereLanguage? } with no mention of id, so an LLM has no signal about copy-back behavior between create (id stripped) and update (id preserved when provided, re-minted when missing).
- Fix: Extend the description to say id is optional, ignored on create, and required-for-preservation on update so callers know to round-trip it from hyperdx_get_dashboard.
- _{ce-api-contract-reviewer}
packages/api/src/mcp/tools/dashboards/saveDashboard.ts:45 -- The tool description doesn't say that on update, omitting filters preserves existing filters while filters: [] wipes them; the asymmetric behavior is only encoded in the filters !== undefined guard at saveDashboard.ts:453.
- Fix: Add a one-line note to the hyperdx_save_dashboard description (or mcpFiltersParam description) clarifying the omit-to-preserve / empty-array-to-wipe contract.
- _{ce-api-contract-reviewer}
packages/api/src/mcp/__tests__/dashboards.test.ts:1449 -- The new having-clause test only exercises the create path, so a future regression where updateDashboardBodySchema drops having from the table-tile config wouldn't be caught.
- Fix: Extend the test to also update the saved dashboard with a different having value and assert it persists.
- _{ce-testing-reviewer}
packages/api/src/mcp/__tests__/dashboards.test.ts:1322 -- The "reject filter with missing source" test runs against a create call (no id), so the equivalent check on the update path (saveDashboard.ts:358, which calls the same getMissingSources(sources, tilesWithId, filters) helper) is not directly covered.
- Fix: Add a parallel test that creates a dashboard, then issues an update with a filter whose sourceId doesn't exist, and asserts the same Could not find source IDs envelope.
- _{ce-testing-reviewer}
packages/api/src/mcp/tools/dashboards/saveDashboard.ts:119 -- stripFilterIds and assignFilterIds cast each filter to ExternalDashboardFilterWithId before destructuring id, but the actual input type is the union ExternalDashboardFilter | ExternalDashboardFilterWithId; the cast is fine today because runtime fields overlap, but it sidesteps the type system and would let a future schema split go unnoticed.
- Fix: Drop the union and type the parameter as (ExternalDashboardFilter & { id?: string })[] (or rely on the inferred zod schema type) so the destructure is type-checked rather than cast-through.
- _{ce-maintainability-reviewer}
packages/api/src/mcp/tools/dashboards/saveDashboard.ts:102 -- The Create helper section divider sits directly above stripFilterIds and assignFilterIds, but only stripFilterIds is used by create; assignFilterIds is used exclusively by update, which is mildly confusing when scanning by section.
- Fix: Move both helpers above the Create helper divider into a shared Filter normalization section, or move assignFilterIds next to updateDashboard.
- _{ce-maintainability-reviewer}
packages/api/src/mcp/tools/dashboards/saveDashboard.ts:132 -- assignFilterIds treats an empty-string id as "missing" via id.length > 0, but this branch has no test; a future refactor that flips the check to id != null would silently start trying to persist a filter with id: ''.
- Fix: Add a small update-path test that sends a filter with id: '' and asserts the persisted filter has a valid, non-empty id.
- _{ce-testing-reviewer}

Reviewers (5): correctness, testing, maintainability, kieran-typescript, api-contract.

Testing gaps:

Update-path filter behavior when filters is omitted (preserve) vs [] (wipe).
Update-path having round-trip and update-path filter sourceId rejection.
assignFilterIds empty-string id branch and the re-mint-on-no-id case for an existing filter.

…rompts claude-review flagged: - `service_detail` example uses `having: "StatusMessage != ''"` on a table tile, but mcpTableTileSchema.config does not include `having`. Zod's `.strip()` silently drops the field, so an LLM following the example through MCP would save a table that includes empty-message rows. Added `having: z.string().max(10000).optional()` mirroring externalDashboardTableChartConfigSchema and a round-trip test. - Heatmap tile's numberFormat describe still referenced `output: "time"` while the rest of the file uses `output: "duration"`. Aligned to "duration" for consistency with the prompt guidance. deep-review additionally flagged: - query_guide told the model that metric tiles take "1 select item with metricName + metricType", but mcpTileSelectItemSchema does not expose those fields, so the keys are silently stripped. Replaced the metric authoring guidance with an explicit "not currently exposed via the MCP schema; use raw SQL for infrastructure metrics" note. Same edit on the design-checklist rule 9 (replaced with a "replace, not merge" update-semantic note that addresses a separate review concern about partial-payload data loss on update). - mcpFiltersParam advertised id as optional via .partial({ id: true }), but createDashboardBodySchema rejects id (strict, no id) and updateDashboardBodySchema requires id. So an LLM copy-pasting a filter from get-dashboard into create, or adding a brand-new filter on update without an id, would hit a confusing strict-validation rejection. saveDashboard.ts now normalizes per flow: stripFilterIds() on create, assignFilterIds() on update. - Em-dash regression check did not cover buildQueryGuidePrompt (largest prompt, most likely to regress) or buildSourceSummary (helpers.ts carried em-dashes through the initial PR until fix `3539649e`). Added assertions for both. - Bad-sourceId test asserted only `text.toContain('source')`, which matches almost any error string. Tightened to assert "Could not find source IDs" plus the literal bad id. - Numbered-rule check used `prompt.toContain('${i}.')` which would match substrings like "1.2s" or "0.000000001". Anchored the check to line start (`/^${i}\\. /m`) and scoped it to the DESIGN CHECKLIST section. - DASHBOARD FILTERS and NUMBER FORMAT body content was not asserted beyond the heading. Now asserts substantive content (QUERY_EXPRESSION / expression / sourceId for filters; factor: 0.000000001 / duration / per-series for number format). - Filter round-trip did not cover the freshly-generated-id branch of convertExternalFiltersToInternal at the MCP layer. Added a test that ships a new no-id filter on update and asserts saveDashboard assigns a fresh id.

github-actions Bot added the review/tier-3 Standard — full human review required label May 12, 2026

vercel Bot deployed to Preview May 12, 2026 22:32 View deployment

vercel Bot deployed to Preview May 12, 2026 22:41 View deployment

vercel Bot deployed to Preview May 12, 2026 23:36 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(mcp): rewrite dashboard authoring prompts; expose filters on save tool#2264

feat(mcp): rewrite dashboard authoring prompts; expose filters on save tool#2264
alex-fedotyev wants to merge 3 commits into
mainfrom
alex/HDX-mcp-prompts-overhaul

alex-fedotyev commented May 12, 2026

Uh oh!

changeset-bot Bot commented May 12, 2026 •

edited

Loading

Uh oh!

vercel Bot commented May 12, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 12, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 12, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 12, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 12, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

alex-fedotyev commented May 12, 2026

Summary

Test plan

Uh oh!

changeset-bot Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

vercel Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🟡 Tier 3 — Standard

Uh oh!

github-actions Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review

Uh oh!

github-actions Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

E2E Test Results

Uh oh!

github-actions Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deep Review

🟡 P2 -- recommended

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

changeset-bot Bot commented May 12, 2026 •

edited

Loading

vercel Bot commented May 12, 2026 •

edited

Loading

github-actions Bot commented May 12, 2026 •

edited

Loading

github-actions Bot commented May 12, 2026 •

edited

Loading

github-actions Bot commented May 12, 2026 •

edited

Loading

github-actions Bot commented May 12, 2026 •

edited

Loading