Context
Upon upgrading the component from v2.1.4 to v2.2.0, we noticed that none of our failed-job alerts were firing as intended. After some investigation, we found that labels are not scraped from Job objects by default when using the kube-state-metrics Helm chart to export metrics.
To have these labels scraped so that the failed-job alerts work correctly, we had to add label-scraping configuration here: https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-state-metrics/values.yaml#L141
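For reference, a minimal sketch of the kind of values we added, assuming the setting at the linked line is the chart's `metricLabelsAllowlist` value (which maps to kube-state-metrics' `--metric-labels-allowlist` flag); the exact resources and labels to allow depend on which labels your alert rules match on:

```yaml
# values.yaml override for the kube-state-metrics Helm chart (a sketch, not
# our exact config). Allowlisted labels are exported on the kube_job_labels
# metric as label_<name> labels.
metricLabelsAllowlist:
  # Expose all Kubernetes labels on Job objects. Wildcards need a
  # sufficiently recent kube-state-metrics release; otherwise list the
  # specific labels your rules need, e.g. jobs=[app,team].
  - jobs=[*]
```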
Alternatives
Since the default AlertManager rules rely on the relevant Job labels being scraped into Prometheus, I feel there should be documentation or a warning stating that the default AlertManager rules in component versions >= v2.2.0 require these Job labels to be scraped, as this does not happen by default in a typical kube-state-metrics stack installation.
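To illustrate the dependency, here is a hypothetical alerting rule in this style (the rule name, label names, and thresholds are illustrative assumptions, not the component's actual defaults): the `group_left` join against `kube_job_labels` produces no series, so the alert can never fire, unless kube-state-metrics actually exports the allowlisted Job labels.

```yaml
# Hypothetical PrometheusRule sketch: the alert only fires if the
# label_app label is present on kube_job_labels, which requires the
# allowlist configuration shown above.
groups:
  - name: failed-jobs
    rules:
      - alert: KubeJobFailed
        expr: |
          kube_job_status_failed{job="kube-state-metrics"} > 0
          * on (namespace, job_name) group_left (label_app)
          kube_job_labels{job="kube-state-metrics"}
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Job {{ $labels.job_name }} in {{ $labels.namespace }} has failed.
```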