Add grace period to project stuck creating alert #547
Add a 2-minute grace period before the ProjectStuckCreatingSLOViolation alert fires. This prevents false pages when monitoring infrastructure has brief disruptions that cause incomplete query results.
🤖 Automatically added newlines to 1 file(s) Co-Authored-By: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
🤖 I automatically added missing newlines at the end of 1 file(s) in this PR. All files should now end with a newline character as per coding standards.
Summary
Adds a 2-minute grace period to the ProjectStuckCreatingSLOViolation alert rule. Previously, the alert fired immediately (for: 0s), which meant any brief gap in monitoring data would trigger a false page.
Context
On March 28, a short monitoring database disruption caused this alert to fire incorrectly. The 2-minute grace period gives the system time to recover from brief outages before paging on-call.
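For illustration, the change can be sketched as the fragment below, assuming the alert is defined in a Prometheus-style rule file. Only the alert name and the for value come from this PR; the group name, the expression, and the severity label are hypothetical placeholders, since the actual query is not shown here.

```yaml
groups:
  - name: project-slo-alerts            # hypothetical group name
    rules:
      - alert: ProjectStuckCreatingSLOViolation
        # Placeholder expression; the real query is not part of this PR text.
        expr: hypothetical_projects_stuck_creating > 0
        # The condition must hold continuously for 2 minutes before the alert
        # fires. With the previous value (0s) it fired on the first evaluation
        # where the expression returned a result.
        for: 2m
        labels:
          severity: page                # assumed label, for illustration only
```

With a non-zero for value, a Prometheus-style alert stays in the pending state while the condition holds, and clears without firing if the condition stops holding before the 2 minutes elapse.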
Tracking issue: https://github.com/datum-cloud/infra/issues/2059