feat: add client label to InterceptionCount, PromptCount and TokenUseCount metrics#209
feat: add client label to InterceptionCount, PromptCount and TokenUseCount metrics#209
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
e928b2c to
24bcf05
Compare
ssncferreira
left a comment
There was a problem hiding this comment.
Overall LGTM 👍 just a comment about initiator_id, but it is not directly related to the changes in this PR.
metrics/metrics.go
Outdated
| // Interception-related metrics. | ||
|
|
||
| // Pessimistic cardinality: 2 providers, 5 models, 2 statuses, 2 routes, 3 methods = up to 120 PER INITIATOR. | ||
| // Pessimistic cardinality: 3 providers, 5 models, 2 statuses, 2 routes, 3 methods, 10 clients = up to 1800 PER INITIATOR. |
There was a problem hiding this comment.
Isn't it 3 routes? 👀
- Anthropic:
/v1/messages - OpenAI/Copilot:
/chat/completions,/responses
| // Pessimistic cardinality: 3 providers, 5 models, 2 statuses, 2 routes, 3 methods, 10 clients = up to 1800 PER INITIATOR. | |
| // Pessimistic cardinality: 3 providers, 5 models, 2 statuses, 3 routes, 3 methods, 10 clients = up to 1800 PER INITIATOR. |
| Help: "The count of intercepted requests.", | ||
| }, append(baseLabels, "status", "route", "method", "initiator_id")), | ||
| // Pessimistic cardinality: 2 providers, 5 models, 2 routes = up to 20. | ||
| }, append(baseLabels, "status", "route", "method", "initiator_id", "client")), |
There was a problem hiding this comment.
This is not directly related to the changes in this PR, but is using initiator_id as a label a good idea? cc @dannykopping
It's an unbounded value and the main factor of label cardinality (as the comments themselves mention). For 10k users, InterceptionCount alone could produce up to 18M time series. Maybe this is something that was already discussed, and we are still far from those numbers, but just pointing this out.
The Prometheus docs explicitly recommend against using user IDs as labels.
There was a problem hiding this comment.
We decided that the benefit of operators being able to quickly identify which initiator(s) were overloading the system outweighed the possible cardinality problem. If cardinality becomes a problem they can just drop that label.
| expectModel: "claude-sonnet-4-0", | ||
| expectRoute: "/v1/messages", | ||
| expectProvider: config.ProviderAnthropic, | ||
| expectClient: aibridge.ClientUnknown, |
There was a problem hiding this comment.
nit: could be worth adding a couple more test cases with different user agents (e.g. Copilot, Cursor) to cover label propagation beyond just Claude Code and Unknown.
b24c6fc to
0f25518
Compare
24bcf05 to
c92a71f
Compare
0f25518 to
b01456d
Compare
c92a71f to
59973a3
Compare

No description provided.