[observability escalation] Smoke test workflows repeatedly exceed resource and control thresholds (Smoke Claude, Smoke Copilot) #23528

@github-actions

Two smoke test workflows crossed escalation threshold 3, threshold 4, or both across their last two runs in the Mar 16–30 window. Smoke tests should be lean, directed, and narrowly scoped, so sustained heavy resource profiles and write_heavy actuation are structurally suspect.

Workflows Requiring Attention

1. Smoke Claude — thresholds 3 (resource_heavy ×2) and 4 (poor_control ×2)

Both runs completed successfully but consumed ~1.09 M tokens each ($0.77–$0.90 per run) over 9 minutes, with 28–33 turns. The behavior fingerprint is exploratory, broad tools, write_heavy, heavy resource profile.

  • resource_heavy_for_domain:high — both runs (§23730204448, §23730326948)
  • poor_agentic_control:medium — both runs
  • partially_reducible:medium — both runs
  • Second run classified as changed, with turns_increase against a cohort baseline

Recommended action: Review the Smoke Claude prompt and task scope. A smoke test should validate basic engine connectivity, not perform full exploratory triage. Consider splitting into a lightweight connectivity check (lean + read) and a separate heavier integration test if broad exploration is genuinely needed.
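One way to keep a lightweight connectivity check honest is a per-run budget assertion. The sketch below is illustrative only: `RunStats`, `MAX_TURNS`, and `MAX_TOKENS` are hypothetical names and limits, not values taken from the workflow or from the Agentic Observability Kit, and the right limits would have to come from the workflow's own baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    """Per-run figures of the kind reported in this issue."""
    turns: int
    tokens: int


# Hypothetical budget for a lean smoke test; not taken from the workflow.
MAX_TURNS = 10
MAX_TOKENS = 200_000


def budget_violations(stats: RunStats) -> list[str]:
    """Return a human-readable entry for each budget the run exceeded."""
    violations = []
    if stats.turns > MAX_TURNS:
        violations.append(f"turns {stats.turns} > {MAX_TURNS}")
    if stats.tokens > MAX_TOKENS:
        violations.append(f"tokens {stats.tokens} > {MAX_TOKENS}")
    return violations


if __name__ == "__main__":
    # Lower-bound figures reported for Smoke Claude: 28 turns, ~1.09 M tokens.
    print(budget_violations(RunStats(turns=28, tokens=1_090_000)))
```

Under these assumed limits, both reported Smoke Claude runs would fail the check on turns and on tokens, which is the signal a split into a lean check and a heavier integration test is meant to surface.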

2. Smoke Copilot — threshold 3 (resource_heavy ×2), and poor_agentic_control:high on most recent run

Both runs show resource_heavy_for_domain:medium despite agentic_fraction=0 (no tracked Copilot turns), with write_heavy actuation and 6.8–7.0 min duration. The most recent run additionally carries poor_agentic_control:high — the highest control severity in the dataset.

  • resource_heavy_for_domain:medium — both runs (§23730204451, §23730326985)
  • poor_agentic_control:high — most recent run
  • Missing tool: Serena MCP server (activate_project, find_symbol) — most recent run

Recommended action: Investigate why the workflow carries a write-heavy actuation profile even with agentic_fraction=0. Review whether the Serena MCP server dependency is intentional and, if so, add a graceful guard when it is unavailable (the tool was absent from the Actions environment).
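The graceful guard could take roughly the following shape. This is a minimal sketch under stated assumptions: the tool names `activate_project` and `find_symbol` come from the missing-tool report above, while `missing_tools`, `run_symbol_step`, and the way the list of available tools is obtained are hypothetical and not part of Serena or the Actions environment.

```python
# Tool names reported as missing in the most recent Smoke Copilot run.
REQUIRED_SERENA_TOOLS = {"activate_project", "find_symbol"}


def missing_tools(available, required=REQUIRED_SERENA_TOOLS):
    """Return the required tool names that are not in `available`, sorted."""
    return sorted(set(required) - set(available))


def run_symbol_step(available):
    """Hypothetical step: skip cleanly instead of failing when tools are absent."""
    missing = missing_tools(available)
    if missing:
        return f"skipped: missing tools {', '.join(missing)}"
    # The real Serena-dependent work would go here.
    return "ran"


if __name__ == "__main__":
    print(run_symbol_step(["bash"]))
    print(run_symbol_step(["bash", "activate_project", "find_symbol"]))
```

Reporting the skip explicitly, rather than letting the agent retry or improvise around the absent server, keeps the run's outcome easy to read in later observability reports.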

Evidence Summary

| Workflow | Threshold crossed | Severity | Runs |
| --- | --- | --- | --- |
| Smoke Claude | resource_heavy_for_domain (×2) | high | 2/2 |
| Smoke Claude | poor_agentic_control (×2) | medium | 2/2 |
| Smoke Copilot | resource_heavy_for_domain (×2) | medium | 2/2 |
| Smoke Copilot | poor_agentic_control | high | 1/2 |
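
The tally behind the evidence summary can be sketched as follows. The mapping is an assumption inferred from this issue's wording (threshold 3 means resource_heavy_for_domain on at least two runs, threshold 4 means poor_agentic_control on at least two runs); the kit's actual threshold definitions are not given here, and `crossed_thresholds` is a hypothetical helper.

```python
from collections import Counter

# Assumed mapping, inferred from the issue text; not the kit's definition.
THRESHOLDS = {3: "resource_heavy_for_domain", 4: "poor_agentic_control"}
MIN_RUNS = 2  # a label must appear on at least this many runs


def crossed_thresholds(run_labels):
    """Return the threshold numbers whose label appears on >= MIN_RUNS runs."""
    # set(labels) ensures a label counts at most once per run.
    counts = Counter(label for labels in run_labels for label in set(labels))
    return [t for t, label in sorted(THRESHOLDS.items()) if counts[label] >= MIN_RUNS]


# Labels per run, as listed in the evidence summary.
RUNS = {
    "Smoke Claude": [
        ["resource_heavy_for_domain", "poor_agentic_control"],
        ["resource_heavy_for_domain", "poor_agentic_control"],
    ],
    "Smoke Copilot": [
        ["resource_heavy_for_domain"],
        ["resource_heavy_for_domain", "poor_agentic_control"],
    ],
}

if __name__ == "__main__":
    for workflow, labels in RUNS.items():
        print(workflow, crossed_thresholds(labels))
```

With the evidence above, this yields thresholds 3 and 4 for Smoke Claude and threshold 3 only for Smoke Copilot, since its poor_agentic_control label appears on one run out of two.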

Suggested Route

workflow:Smoke Claude and workflow:Smoke Copilot — workflow owners should review prompt scope and execution posture. No broader orchestration chain involved (both are standalone episodes, confidence=high).

Full detail is in the linked observability discussion for Mar 16–30, 2026.

Generated by Agentic Observability Kit
