Added missing statsd mappings, updated mappings dimensions and metrics docs to reflect the emitted Coordinator stats #19392
FrankChen021 left a comment
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 3 |
| P3 | 0 |
| Total | 3 |
This is an automated review by Codex GPT-5
"segment/availableDeepStorageOnly/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
"segment/unneededEternityTombstone/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
"segment/clone/assigned/count" : { "dimensions" : ["dataSource"], "type" : "count" },
"segment/clone/dropped/count" : { "dimensions" : ["dataSource"], "type" : "count" },
[P2] Clone-drop metric drops the emitted server dimension
segment/clone/dropped/count is emitted from CloneHistoricals.dropSegmentFromTargetServer with only the server dimension, but this new StatsD mapping asks for dataSource. DimensionConverter only forwards dimensions present on the event, so StatsD/DogStatsD will emit this newly mapped metric without either dimension and aggregate clone drops across all target servers. Map server, or emit dataSource as well if datasource-level reporting is intended.
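The first option suggested above (mapping `server`) could look roughly like the entry below. This is a sketch of the reviewer's suggestion rather than a confirmed final change; the metric name and type are copied from the diff line under review, and the enclosing braces are added only to make the fragment valid JSON on its own.

```json
{
  "segment/clone/dropped/count" : { "dimensions" : ["server"], "type" : "count" }
}
```

If datasource-level reporting is wanted instead, the mapping alone is not enough: per the comment, the emitting code would also have to attach `dataSource` to the event.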
"compact/createJobs/time" : { "dimensions" : [], "type" : "timer" },
"compact/createJobs/count" : { "dimensions" : [], "type" : "count" },
"compact/runScheduler/time" : { "dimensions" : [], "type" : "timer" },
"compactTask/cancelled/count" : { "dimensions" : [], "type" : "count" },
[P2] Cancelled compaction tasks lose datasource labels
CompactSegments records compactTask/cancelled/count with RowKey.of(Dimension.DATASOURCE, dataSource), but the added StatsD mapping has an empty dimension list. This silently strips datasource from the emitted StatsD/DogStatsD metric, so cancellations for different datasources are collapsed into one series. Include dataSource in the mapping.
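A minimal sketch of the suggested fix, assuming nothing changes other than the dimension list; the metric name and type are taken from the diff line under review, and the enclosing braces only make the fragment standalone JSON.

```json
{
  "compactTask/cancelled/count" : { "dimensions" : ["dataSource"], "type" : "count" }
}
```

With `dataSource` listed, cancellations for different datasources would be reported as separate series instead of one combined count.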
"segment/balancer/compute/error" : { "dimensions" : ["tier", "dataSource", "description"], "type" : "count" },
"config/brokerSync/time" : { "dimensions" : [], "type" : "timer" },
[P2] Broker sync metrics collapse per-broker failures
BaseDynamicConfigSyncer emits config/brokerSync/time and config/brokerSync/error with configType and server, and config/brokerSync/total/time with configType, but the new mappings for these metrics use no dimensions. StatsD will therefore aggregate sync latency and errors across config types and brokers, hiding which broker/config is failing. Add the emitted dimensions to the mappings.
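For the one broker sync entry visible in the diff, the suggested change might look like the sketch below. The name and type come from the diff line under review; the dimension list is the one the comment says is emitted, and the enclosing braces only make the fragment standalone JSON.

```json
{
  "config/brokerSync/time" : { "dimensions" : ["configType", "server"], "type" : "timer" }
}
```

Following the same comment, `config/brokerSync/error` would list `configType` and `server`, and `config/brokerSync/total/time` would list only `configType`. Their mapping lines are not shown in this excerpt, so they are left out of the sketch.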
FrankChen021 left a comment
I reviewed the follow-up changes. All three prior statsd mapping threads appear resolved in the current code: clone-drop keeps the emitted server dimension, cancelled compaction tasks keep dataSource, and broker sync metrics include their emitted configType and server dimensions.
This is an automated review by Codex GPT-5
Description
Metrics published by the various Druid services are passed to the configured emitter, which sends them to an external observability system. The StatsD emitter sends them to a StatsD system. It uses a default mapping file (packaged with Druid) to map Druid metrics to StatsD metric types. If a metric has no mapping, it is not emitted.
A number of metrics are missing mappings or have incomplete or incorrect dimensions specified in their mappings. This is likely because the mappings have not been kept up to date with additions and changes to the metrics emitted by the code.
This change updates the mappings for Coordinator metrics and fixes the documentation where it deviates from the code.
Release note
Fixes Coordinator stats that were missing from the StatsD mappings and therefore were not published to StatsD.