Refine PrometheusFailedRate signal and incident metadata #5477

Draft
rhamitarora wants to merge 1 commit into Azure:main from rhamitarora:rhamitarora/ARO-25102-PrometheusFailedRate
rhamitarora commented Jun 2, 2026

Harden the PrometheusFailedRate rule with scoped and backward-compatible remote-write metrics, safer ratio math, and explicit correlation fields so incidents group by failing endpoint. Update alert tests and runbook URLs to match the new behavior and improve on-call triage.
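The "safer ratio math" and per-endpoint grouping described above can be illustrated with a minimal sketch. This is not the rule from the PR, whose expression is not shown here; it only assumes that a failure rate is computed as failed samples over total samples, that a zero denominator should yield no result rather than a division error, and that each endpoint is identified by a `(remote_name, url)` pair. The function names and that key are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical correlation key: (remote_name, url) of a remote-write endpoint.
Endpoint = Tuple[str, str]


def safe_ratio(failed: float, total: float) -> Optional[float]:
    """Return failed / total, or None when there is no traffic to judge."""
    if total <= 0:
        return None
    # Clamp so counter resets or skew cannot report more than 100% failure.
    return min(max(failed / total, 0.0), 1.0)


def failed_rate_by_endpoint(
    samples: Iterable[Tuple[Endpoint, float, float]],
) -> Dict[Endpoint, float]:
    """Sum (failed, total) per endpoint, then compute one ratio per endpoint."""
    failed: Dict[Endpoint, float] = defaultdict(float)
    total: Dict[Endpoint, float] = defaultdict(float)
    for endpoint, f, t in samples:
        failed[endpoint] += f
        total[endpoint] += t

    rates: Dict[Endpoint, float] = {}
    for endpoint in total:
        ratio = safe_ratio(failed[endpoint], total[endpoint])
        if ratio is not None:  # endpoints with no traffic are omitted
            rates[endpoint] = ratio
    return rates
```

Keying the result by endpoint means one failing destination produces one entry, which is the property that would let incidents group by failing endpoint rather than by scrape source.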

What

Why

Testing

Special notes for your reviewer

PR Checklist

  • PR is scoped to a single task (no mixed concerns)
  • Title follows Conventional Commits format
  • Summary explains the "Why" behind the change
  • Linked to relevant ticket/issue
  • Screenshots included (if graph/UI/metrics changes)
  • Self-reviewed the diff
  • CI/CD checks are passing (ignore Tide)
  • Draft PR used for WIP (if applicable)
  • Commit history is clean (rebased/squashed)
  • Tricky code blocks are commented
  • Specific reviewers tagged
  • All comment threads resolved before merge

Co-authored-by: Cursor <cursoragent@cursor.com>
openshift-ci Bot commented Jun 2, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rhamitarora
Once this PR has been reviewed and has the lgtm label, please assign mmazur for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot commented Jun 2, 2026

Hi @rhamitarora. Thanks for your PR.

I'm waiting for an Azure member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
