feat(llm-obs): add spans analytics command with multi-dimension group-by #290
mbldatadog wants to merge 4 commits into datadog-labs:main
Adds `pup llm-obs spans analytics` to support aggregating LLM Obs span data grouped by one or more dimensions simultaneously, e.g. span_name + error type + error message, or model name + service tier + error type. Calls the llm-obs-query-rewriter timeseries endpoint (same backend as the LLM Obs Analytics tab in the UI).

- New CLI: `pup llm-obs spans analytics --group-by --compute --limit`
- Supports --query, --from, --to, --ml-app filters
- 16 new tests (unit + integration, including 401/403/500 failure cases)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
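To make the flag semantics concrete, here is a minimal Rust sketch of how a `--group-by` value could be turned into a list of dimensions. It assumes the dimensions arrive as one comma-separated string, which this PR does not confirm, and `parse_group_by` is a hypothetical helper rather than a function from the diff. An omitted flag yields an empty list, which a caller could treat as a request for a single global aggregate.

```rust
/// Split a `--group-by` value into dimension names.
/// Assumption: dimensions are comma-separated; the real command may instead
/// accept a repeated flag. Hypothetical helper, not taken from the PR diff.
fn parse_group_by(raw: Option<&str>) -> Vec<String> {
    raw.unwrap_or("")
        .split(',')
        .map(str::trim)
        // Drop empty segments so "a,,b" and a trailing comma are tolerated.
        .filter(|dim| !dim.is_empty())
        .map(String::from)
        .collect()
}
```

For example, `parse_group_by(Some("span_name, @meta.error.type"))` would give two dimensions, while `parse_group_by(None)` gives none.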
Warning
Auth issue / endpoint TBD: when testing, [...]. The expected output shape in the PR description is representative but constructed, not from a live call. The first open question covers this; flagging here so it's easy to find.
The fix would be to work with @srosenthal-dd / AAA to ensure the OAuth scope exists and is public.
I would say we should maintain the behavior we have in the LLM Obs UI, where group-by is not required. When it is omitted, it looks like just a global count (of either traces or spans, depending on what you have selected) is returned.
For field name conventions, I think we just use what we would use for querying (which I think is Additional "Features"
Validation
Hi from AAA! It looks like the LLM Observability scopes are set up correctly, but a few of the backend APIs (https://github.com/DataDog/dd-source/blob/main/domains/ml-observability/apps/apis/llm-obs-query-rewriter/main.go) are using the BuiltInFeatures permission, which isn't exposed yet for most customers as a configurable scope. Do you know why? Could the APIs be updated to use LlmObservabilityRead or LlmObservabilityWrite? Happy to help if that's at all unclear! My main mission right now is to make API auth "just work" everywhere.
Ah thanks @srosenthal-dd. Yep, I wanted to make sure we had the OAuth scopes properly plumbed through everywhere before shipping this. Will follow up with you, thanks much!
Replace hand-rolled comfy_table rendering with formatter::output so that spans analytics respects the -o flag (json, yaml, table, csv, tsv) like every other command. Buckets are flattened into a list of row objects before being passed to the formatter.

Add unit tests covering single/multi-facet, no-facet aggregate, empty buckets, missing buckets key, and null fallbacks.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
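The bucket flattening described in this commit could look roughly like the Rust sketch below. The `Bucket` struct, the `flatten_buckets` name, the `"value"` column, and the `"-"` placeholder for nulls are all assumptions made for illustration; the actual response shape and fallback text are not shown in this PR page.

```rust
/// Hypothetical bucket shape: one entry per group-by facet (None when the
/// backend returned null) plus the computed aggregate.
struct Bucket {
    by: Vec<Option<String>>,
    value: Option<f64>,
}

/// Flatten buckets into one row per bucket. Each row lists the facet columns
/// in the order they were requested, followed by a "value" column.
/// Missing or null cells fall back to "-" (the placeholder is an assumption).
fn flatten_buckets(facets: &[&str], buckets: &[Bucket]) -> Vec<Vec<(String, String)>> {
    buckets
        .iter()
        .map(|bucket| {
            let mut row = Vec::with_capacity(facets.len() + 1);
            for (i, facet) in facets.iter().enumerate() {
                let cell = bucket
                    .by
                    .get(i)
                    .cloned()
                    .flatten()
                    .unwrap_or_else(|| "-".to_string());
                row.push((facet.to_string(), cell));
            }
            let value = bucket
                .value
                .map(|v| v.to_string())
                .unwrap_or_else(|| "-".to_string());
            row.push(("value".to_string(), value));
            row
        })
        .collect()
}
```

With no facets, each bucket collapses to a single `value` cell, which matches the no-facet aggregate case the tests mention. Rows are kept as ordered pairs so that a table or CSV formatter can preserve column order.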
Okay @srosenthal-dd, I've got a PR up to try and plumb the correct scopes through. I need to take a closer look at it tomorrow and will follow up with you then!
Hey @vpatel22, it turns out this will require deeper auth surgery on our backend than my initial shot. I'm working through it with @srosenthal-dd on my end and will let you know where things net out on Tuesday. It also turns out that Greg on our end is already building a similar feature into the DD UI, so I'll poke him next week and see what the plans are for that.
Note
RFC — please do not merge. This is a draft for team/customer feedback before we consider it ready for production.
Summary
Adds `pup llm-obs spans analytics` to support aggregating LLM Obs span data grouped by multiple dimensions simultaneously, e.g. `span_name + @meta.error.type + @meta.error.message`, or `@meta.model_name + @meta.error.type + service_tier`.
Changes
- `src/main.rs`: new `Analytics` variant in `LlmObsSpansActions` with flags: `--group-by`, `--compute`, `--limit`, `--query`, `--from`, `--to`, `--ml-app`
- `src/commands/llm_obs.rs`: new `spans_analytics()` function; calls `POST /api/unstable/llm-obs-query-rewriter/timeseries`; renders results as a table
- `src/test_commands.rs`: 6 integration tests (200, 401, 403, 500, invalid `--from`, no auth) + 10 unit tests for the pure helper logic
Usage
Open questions for feedback
1. `llm-obs-query-rewriter/timeseries` returns a 403 for Bearer token auth (requires session/cookie) and wraps the response in `{"eventQueryResponse": "<escaped proto JSON>"}`; both are bad for a CLI. What's the right endpoint to call? Is there an MCP-layer equivalent, or should we call the EvP analytics endpoint directly?
2. Should `--group-by` be required, or silently no-op when omitted (returns a single global count)?
3. Field name conventions (`@meta.error.type` vs `error_type`)?

🤖 Generated with Claude Code
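To illustrate why the `{"eventQueryResponse": "<escaped proto JSON>"}` wrapper is awkward for a CLI, the following std-only Rust sketch shows the extra decoding pass it forces on the client. Both function names are hypothetical, the envelope is assumed to carry that single string field, and the inner payload used in the example is made up. In the real command a JSON library would presumably handle both passes; this hand-rolled version only makes the double encoding visible.

```rust
/// Undo JSON string escaping on the contents of a JSON string literal.
/// Returns None on a malformed escape.
/// Simplification: UTF-16 surrogate pairs are rejected, not combined.
fn unescape_json_string(escaped: &str) -> Option<String> {
    let mut out = String::with_capacity(escaped.len());
    let mut chars = escaped.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next()? {
            '"' => out.push('"'),
            '\\' => out.push('\\'),
            '/' => out.push('/'),
            'n' => out.push('\n'),
            't' => out.push('\t'),
            'r' => out.push('\r'),
            'b' => out.push('\u{0008}'),
            'f' => out.push('\u{000C}'),
            'u' => {
                let hex: String = chars.by_ref().take(4).collect();
                if hex.len() != 4 || !hex.chars().all(|h| h.is_ascii_hexdigit()) {
                    return None;
                }
                let code = u32::from_str_radix(&hex, 16).ok()?;
                out.push(char::from_u32(code)?);
            }
            _ => return None,
        }
    }
    Some(out)
}

/// Pull the inner payload out of `{"eventQueryResponse": "<escaped JSON>"}`.
/// Hypothetical helper: it looks for that key and reads the string after it.
fn extract_inner_payload(envelope: &str) -> Option<String> {
    let key = "\"eventQueryResponse\"";
    let after_key = &envelope[envelope.find(key)? + key.len()..];
    let after_colon = after_key.trim_start().strip_prefix(':')?.trim_start();
    let body = after_colon.strip_prefix('"')?;
    // Scan for the closing quote, skipping quotes preceded by a backslash.
    let mut escaped = false;
    for (idx, c) in body.char_indices() {
        match c {
            '\\' if !escaped => escaped = true,
            '"' if !escaped => return unescape_json_string(&body[..idx]),
            _ => escaped = false,
        }
    }
    None
}
```

The string returned by `extract_inner_payload` still has to be parsed as JSON a second time before any buckets can be read, which is the cost the first open question is pointing at.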