infinite or negative infinite error budget

Some of our SLOs are showing infinite or negative infinite error budgets. More details are provided below.

## Setup

- **Sloth Version:** v0.12.0, running as a Kubernetes controller with these arguments:

  ```yaml
  args:
    - kubernetes-controller
    - --resync-interval=5m
    - --workers=5
    - --default-slo-period=28d
    - --logger=json
  ```

- **Kubernetes Version:** v1.33.1
- **vmalert Version:** v1.125.0 (using VictoriaMetrics)

### Negative Infinite Error Budget

After we recently changed the target objective, dashboards started showing a negative infinite error budget for this SLO. Over the past few days the issue resolved itself, and the dashboards now show percentage values as expected.

Below is the SLO spec for the affected service:

```yaml
---
apiVersion: sloth.slok.dev/v1
kind: PrometheusServiceLevel
metadata:
  name: vmcluster
  namespace: monitoring
  labels:
    team: diablo
spec:
  service: "vmcluster"
  labels:
    team: "diablo"
    namespace: "monitoring"
  slos:
    - name: "scrape-success"
      objective: 95.0
      description: "VictoriaMetrics SLI is the percentage of successful scrapes"
      sli:
        events:
          errorQuery: |
            sum(rate(vm_promscrape_scrapes_failed_total[{{.window}}]))
          totalQuery: |
            sum(rate(vm_promscrape_scrapes_total[{{.window}}]))
      alerting:
        name: SLOVMClusterScrapeFailure
        labels:
          team: diablo
        annotations:
          summary: "VictoriaMetrics scrapes are failing"
        pageAlert:
          labels:
            team: diablo
        ticketAlert:
          labels:
            team: diablo
```

https://github.com/user-attachments/assets/68348025-5b9d-4bda-ace4-fe5a13c53175

### Infinite Error Budget

We are unsure why this one is happening. The SLO for another service has been showing an infinite error budget for the past two weeks; before that, it displayed a numeric value.

<img width="1916" height="691" alt="Image" src="https://github.com/user-attachments/assets/6c35413a-3d02-46f5-b0b1-10d081e93157" />

I have checked all underlying recording rules by executing them in PromQL to see their evaluations, but I still can't pinpoint where things are going wrong.
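For reference, here is a minimal Python sketch of the arithmetic I would expect behind the error budget, to show which inputs can produce infinities. It rests on assumptions, not on anything confirmed from the Sloth source: that the burn rate is the error ratio divided by `1 - objective`, that the remaining budget is `1 - burn rate`, and that division follows IEEE 754 rules as in PromQL. The function names `promql_div` and `remaining_budget` are hypothetical.

```python
import math


def promql_div(num: float, den: float) -> float:
    """Divide with IEEE 754 semantics: x/0 is +/-Inf and 0/0 is NaN.

    Python raises ZeroDivisionError for float division by zero, so the
    zero-denominator case is handled explicitly here.
    """
    if den == 0:
        if num == 0 or math.isnan(num):
            return math.nan
        # The sign of the result follows the signs of both operands.
        sign = math.copysign(1.0, num) * math.copysign(1.0, den)
        return math.copysign(math.inf, sign)
    return num / den


def remaining_budget(error_ratio: float, objective_percent: float) -> float:
    """Remaining error budget as a ratio (assumed formula, see lead-in)."""
    budget = 1.0 - objective_percent / 100.0   # e.g. 95.0 -> about 0.05
    burn_rate = promql_div(error_ratio, budget)
    return 1.0 - burn_rate


if __name__ == "__main__":
    # Healthy case: 1% errors against a 95% objective.
    print(remaining_budget(0.01, 95.0))
    # An objective of 100% leaves a zero budget, so any error gives -Inf.
    print(remaining_budget(0.01, 100.0))
    # An error ratio that is itself +Inf (errors / zero total) also gives -Inf.
    print(remaining_budget(promql_div(5.0, 0.0), 95.0))
```

Under this model, a remaining budget of `-Inf` needs a burn rate of `+Inf`, and a remaining budget of `+Inf` needs a burn rate of `-Inf`. I could not find either condition in the recording rules I evaluated.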

Can you share some insights on why this is happening and how to prevent it in the future?
