add(datadog_cluster_agent): add autoscaling metrics#22984
add(datadog_cluster_agent): add autoscaling metrics#22984clamoriniere wants to merge 1 commit intomasterfrom
Conversation
Added new metrics related to autoscaling conditions and constraints for the DatadogPodAutoscaler. those metrics were introduced in the cluster-agent by DataDog/datadog-agent#47138
|
This PR does not modify any files shipped with the agent. To help streamline the release process, please consider adding the |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3ec397de36
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| datadog.cluster_agent.autoscaling.workload.horizontal_scaling.constraints.max_replicas,gauge,,,,Maximum number of replicas configured in the DatadogPodAutoscaler horizontal scaling constraints,0,datadog_cluster_agent,autoscaling horizontal scaling constraints max replicas,, | ||
| datadog.cluster_agent.autoscaling.workload.horizontal_scaling.constraints.min_replicas,gauge,,,,Minimum number of replicas configured in the DatadogPodAutoscaler horizontal scaling constraints,0,datadog_cluster_agent,autoscaling horizontal scaling constraints min replicas,, |
There was a problem hiding this comment.
Remove unsupported autoscaling metrics from metadata
The default Cluster Agent scrape config only emits metrics that appear in DEFAULT_METRICS (datadog_cluster_agent/datadog_checks/datadog_cluster_agent/check.py:26-44,100-105), but rows like horizontal_scaling.constraints.max_replicas/min_replicas here—and the other new received_recommendations_version, status.*, and vertical_scaling.constraints.* entries in this block—have no mapping there. After this change, the published metric catalog will claim these autoscaling metrics are supported even though the integration never submits them unless users add custom mappings by hand.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
I'm find with the fact the metrics are documented event if they aren't always available depending of the user configuration and environment
| datadog.cluster_agent.autoscaling.workload.horizontal_scaling_received_replicas,gauge,,,,Number of replicas recommended by the main horizontal scaling source,0,datadog_cluster_agent,autoscaling horizontal scaling received replicas,, | ||
| datadog.cluster_agent.autoscaling.workload.local.horizontal_scaling_recommended_replicas,gauge,,,,Number of replicas recommended by the local in-cluster fallback recommender,0,datadog_cluster_agent,autoscaling local horizontal scaling recommended replicas,, | ||
| datadog.cluster_agent.autoscaling.workload.local.horizontal_utilization_pct,gauge,,,,CPU utilization percentage computed by the local fallback recommender for horizontal scaling,0,datadog_cluster_agent,autoscaling local horizontal utilization pct,, | ||
| datadog.cluster_agent.autoscaling.workload.local_fallback_enabled,gauge,,,,1 if the local in-cluster fallback recommender is currently active for horizontal scaling 0 otherwise,-1,datadog_cluster_agent,autoscaling local fallback enabled,, |
There was a problem hiding this comment.
Rename this metric to the emitted local fallback name
Users will look up datadog.cluster_agent.autoscaling.workload.local_fallback_enabled from the metric catalog, but the check actually emits datadog.cluster_agent.autoscaling.workload.local.fallback_enabled (datadog_cluster_agent/datadog_checks/datadog_cluster_agent/check.py:42-44), and the bundled dashboard queries that dotted name (datadog_cluster_agent/assets/dashboards/datadog_cluster_agent_overview.json:2222). As written, this metadata entry documents a metric name that does not exist.
Useful? React with 👍 / 👎.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files🚀 New features to boost your workflow:
|
What does this PR do?
Added new metrics related to autoscaling conditions and constraints for the DatadogPodAutoscaler. those metrics were introduced in the cluster-agent by DataDog/datadog-agent#47138
Motivation
keep documentation up-to-date
Review checklist (to be filled by reviewers)
qa/skip-qalabel if the PR doesn't need to be tested during QA.backport/<branch-name>label to the PR and it will automatically open a backport PR once this one is merged