Description
Two smoke test workflows have crossed escalation thresholds 3 and/or 4 across their last two runs in the Mar 16–30 window. Because these are smoke tests — which should be lean, directed, and narrowly scoped — sustained heavy resource profiles and write_heavy actuation are structurally suspect.
Workflows Requiring Attention
1. Smoke Claude — thresholds 3 (resource_heavy ×2) and 4 (poor_control ×2)
Both runs completed successfully but consumed ~1.09 M tokens each ($0.77–$0.90 per run) over 9 minutes, with 28–33 turns. The behavior fingerprint is exploratory, broad tools, write_heavy, heavy resource profile.
- `resource_heavy_for_domain:high`: both runs (§23730204448, §23730326948)
- `poor_agentic_control:medium`: both runs
- `partially_reducible:medium`: both runs
- Second run classified `changed` with `turns_increase` against a cohort baseline
Recommended action: Review the Smoke Claude prompt and task scope. A smoke test should validate basic engine connectivity, not perform full exploratory triage. Consider splitting into a lightweight connectivity check (lean + read) and a separate heavier integration test if broad exploration is genuinely needed.
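One way to keep a smoke test lean is to check each run against an explicit budget. The following is a minimal sketch only: the `SmokeBudget` class, the `budget_violations` function, and every limit value are hypothetical and are not taken from the workflow or from the observability kit. The sample figures in the usage note are the ones reported above for Smoke Claude.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmokeBudget:
    # All limits below are hypothetical placeholders, not project policy.
    max_tokens: int = 200_000
    max_turns: int = 10
    max_minutes: float = 3.0


def budget_violations(tokens, turns, minutes, budget=SmokeBudget()):
    """Return the names of the budget dimensions a run exceeded."""
    violations = []
    if tokens > budget.max_tokens:
        violations.append("tokens")
    if turns > budget.max_turns:
        violations.append("turns")
    if minutes > budget.max_minutes:
        violations.append("minutes")
    return violations
```

Under these assumed limits, `budget_violations(1_090_000, 28, 9.0)` flags all three dimensions for the runs described above, while a small connectivity check would return an empty list.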
2. Smoke Copilot — threshold 3 (resource_heavy ×2), and poor_agentic_control:high on most recent run
Both runs show resource_heavy_for_domain:medium despite agentic_fraction=0 (no tracked Copilot turns), with write_heavy actuation and 6.8–7.0 min duration. The most recent run additionally carries poor_agentic_control:high — the highest control severity in the dataset.
- `resource_heavy_for_domain:medium`: both runs (§23730204451, §23730326985)
- `poor_agentic_control:high`: most recent run
- Missing tool: Serena MCP server (`activate_project`, `find_symbol`), most recent run
Recommended action: Investigate why the workflow carries a write-heavy actuation profile even with agentic_fraction=0. Review whether the Serena MCP server dependency is intentional and, if so, add a graceful guard when it is unavailable (the tool was absent from the Actions environment).
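A graceful guard could take the shape below. This is a sketch under stated assumptions: it assumes the workflow can obtain a list of available tool names before the Serena-dependent step runs, and the function names `missing_tools` and `plan_serena_step` are hypothetical. Only the two tool names come from this report.

```python
# Tool names reported missing in the most recent Smoke Copilot run.
REQUIRED_SERENA_TOOLS = frozenset({"activate_project", "find_symbol"})


def missing_tools(available, required=REQUIRED_SERENA_TOOLS):
    """Return the required tool names absent from `available`, sorted."""
    return sorted(set(required) - set(available))


def plan_serena_step(available):
    """Decide whether to run or skip the Serena-dependent step.

    Hypothetical helper: returns "run" when every required tool is present,
    otherwise a skip message that names the missing tools.
    """
    missing = missing_tools(available)
    if missing:
        return f"skip: Serena tools unavailable ({', '.join(missing)})"
    return "run"
```

Skipping with an explicit message, rather than failing or silently continuing, keeps the smoke result interpretable when the server is absent from the Actions environment.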
Evidence Summary
| Workflow | Threshold crossed | Severity | Runs |
|---|---|---|---|
| Smoke Claude | resource_heavy_for_domain (×2) | high | 2/2 |
| Smoke Claude | poor_agentic_control (×2) | medium | 2/2 |
| Smoke Copilot | resource_heavy_for_domain (×2) | medium | 2/2 |
| Smoke Copilot | poor_agentic_control | high | 1/2 |
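The Runs column can be reproduced by tallying labels per run. The sketch below assumes that a label counts as repeated when it appears in at least two of the runs examined; the kit's actual threshold rules are not stated in this report. The `RUNS` structure and the function names are illustrative, and which Copilot run is the most recent is assumed from list order.

```python
from collections import Counter

# Labels per run, transcribed from the evidence above (severity as value).
# Assumption: the second entry for each workflow is the most recent run.
RUNS = {
    "Smoke Claude": [
        {"resource_heavy_for_domain": "high",
         "poor_agentic_control": "medium",
         "partially_reducible": "medium"},
        {"resource_heavy_for_domain": "high",
         "poor_agentic_control": "medium",
         "partially_reducible": "medium"},
    ],
    "Smoke Copilot": [
        {"resource_heavy_for_domain": "medium"},
        {"resource_heavy_for_domain": "medium",
         "poor_agentic_control": "high"},
    ],
}


def tally(runs):
    """Count how many runs carry each label."""
    counts = Counter()
    for labels in runs:
        counts.update(labels.keys())
    return counts


def repeated_labels(runs, min_hits=2):
    """Return labels seen in at least `min_hits` runs (assumed rule)."""
    return sorted(label for label, n in tally(runs).items() if n >= min_hits)
```

With this data, `poor_agentic_control` is repeated for Smoke Claude (2/2) but not for Smoke Copilot (1/2), which matches the table.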
Suggested Route
workflow:Smoke Claude and workflow:Smoke Copilot — workflow owners should review prompt scope and execution posture. No broader orchestration chain involved (both are standalone episodes, confidence=high).
Full detail is in the linked observability discussion for Mar 16–30, 2026.
Generated by Agentic Observability Kit