otel_metrics task is too expensive — high DB CPU usage and long execution time #7475

@git-hyagi

The otel_metrics scheduled task (pulpcore.app.tasks.telemetry.otel_metrics) runs every 5 minutes and currently takes approximately 1 minute and 30 seconds to complete. During execution, the underlying query consumes ~3 vCPUs from the database.

Here is where we believe the issue is happening (pulpcore/app/tasks/telemetry.py:31-33):

  space_usage_per_domain = Artifact.objects.values("pulp_domain__name").annotate(
      total_size=Sum("size", default=0)
  )

  • A telemetry task that is meant to be lightweight is placing significant load on the database.
  • 3 vCPU consumption for a single periodic query reduces capacity available for actual content operations (sync, publish, etc.).
  • The task takes 1m30s out of every 5-minute cycle (30% duty cycle), meaning a worker is occupied with telemetry for a disproportionate amount of time.
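For readers less familiar with the ORM call above, here is a rough, self-contained sketch of the kind of aggregation it asks the database to perform. This is not the SQL Django actually emits for Pulp. The table and column names below (`domain`, `artifact`, `pulp_domain_id`) are simplified assumptions, and SQLite stands in for the production database purely so the example runs anywhere.

```python
import sqlite3

# Hypothetical, simplified schema. The real Pulp tables and columns differ;
# these names are assumptions made only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE domain (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE artifact (
        id INTEGER PRIMARY KEY,
        size INTEGER NOT NULL,
        pulp_domain_id INTEGER NOT NULL REFERENCES domain(id)
    );
    """
)
conn.executemany(
    "INSERT INTO domain (id, name) VALUES (?, ?)",
    [(1, "default"), (2, "team-a")],
)
conn.executemany(
    "INSERT INTO artifact (size, pulp_domain_id) VALUES (?, ?)",
    [(100, 1), (250, 1), (4096, 2)],
)

# Approximate shape of values("pulp_domain__name").annotate(Sum("size", default=0)):
# join artifacts to their domain, group by domain name, and sum the sizes.
rows = conn.execute(
    """
    SELECT d.name AS pulp_domain__name,
           COALESCE(SUM(a.size), 0) AS total_size
    FROM artifact AS a
    JOIN domain AS d ON d.id = a.pulp_domain_id
    GROUP BY d.name
    ORDER BY d.name
    """
).fetchall()

space_usage_per_domain = dict(rows)
print(space_usage_per_domain)
```

Because the query has no filter, an aggregation of this shape generally has to visit every artifact row on each run, which would be consistent with the CPU usage and run time reported above on a large artifact table. That is an inference from the query shape, not something confirmed with a query plan here.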
