raphaeltm · simple-agent-manager · Jun 2, 2026
diff --git a/.claude/commands/workflow.md b/.claude/commands/workflow.md
@@ -35,22 +35,27 @@ The SAM control plane monitors ACP sessions for activity. If your session appear
    # Workflow State
 
    ## Goal
+
    <one-line summary>
 
    ## Subtasks
-   | # | Description | Task ID | Status | Branch | Notes |
-   |---|------------|---------|--------|--------|-------|
-   | 1 | ... | pending | ... | ... | ... |
-   | 2 | ... | pending | ... | ... | ... |
+
+   | #   | Description | Task ID | Status | Branch | Notes |
+   | --- | ----------- | ------- | ------ | ------ | ----- |
+   | 1   | ...         | pending | ...    | ...    | ...   |
+   | 2   | ...         | pending | ...    | ...    | ...   |
 
    ## Dependencies
+
    - Task 2 depends on Task 1
    - Tasks 3 and 4 can run in parallel
 
    ## Poll Count
+
    0
 
    ## Last Poll
+
    (not yet)
    ```
 
@@ -71,13 +76,16 @@ For each subtask that has no unmet dependencies:
 
 3. **Verify dispatch succeeded** — call `get_task_details` on the returned task ID within 10 seconds to confirm it was picked up. If it wasn't, retry once, then report the failure.
 
+   Before retrying the same prompt, inspect the failed task/session and check `list_tasks`/`list_project_agents` for active duplicates with the same title, branch, prompt, or PR. If a duplicate is already running, coordinate with it instead of creating another copy. Do not blindly redispatch after no-workspace/startup failures or transient provider failures.
+
 4. **Call `update_task_status`** after each dispatch: "Dispatched subtask N: <description>"
 
 ---
 
 ## Phase 3: Foreground Polling Loop (CRITICAL)
 
 This is the most important phase. You MUST poll actively to:
+
 - Keep the session alive (prevent timeout kills)
 - Detect subtask completion and trigger dependent work
 - Report progress to the user
@@ -101,7 +109,8 @@ REPEAT until all subtasks are complete or failed:
        - Call get_peer_agent_output(taskId) to review the result
     6. If any subtask failed:
        - Review the failure via get_task_details
-       - Decide: retry_subtask with adjusted description, or mark as failed
+       - Check for duplicate active work with the same prompt, branch, title, or PR
+       - Decide: retry with adjusted description only after diagnosing the failure, or mark as failed
        - Update .workflow-state.md
     7. If all subtasks are complete: exit loop
     8. If all remaining subtasks are failed and no retries are possible: exit loop
@@ -120,6 +129,7 @@ REPEAT until all subtasks are complete or failed:
 ### What to Do If Context Feels Fuzzy
 
 If after context compaction you're unsure what's happening:
+
 1. Read `.workflow-state.md` — it has the complete state
 2. Call `list_tasks` to see all your subtasks
 3. Call `get_task_details` for each active subtask
@@ -147,22 +157,26 @@ When all subtasks are complete (or all remaining ones have permanently failed):
 ## Handling Common Scenarios
 
 ### Subtask produces a PR that needs to merge before the next step
+
 - After the subtask completes, check if it created a PR via `get_task_details`
 - If the PR is merged, proceed with dependent subtasks
 - If the PR is open, note this in your status update — the dependent subtask should be dispatched to the PR's branch
 
 ### Subtask fails
+
 - Read the failure details via `get_task_details` and `get_peer_agent_output`
 - If it's a transient failure (timeout, resource issue), retry with `retry_subtask`
 - If it's a permanent failure (wrong approach, missing prerequisite), adjust the description and retry, or skip and note in the summary
 - Maximum 2 retries per subtask
 
 ### You're running out of time
+
 - Push all branches, update all task files
 - Call `update_task_status` with current state: what's done, what's in progress, what's remaining
 - Do NOT rush to merge incomplete work
 
 ### A subtask needs input from you
+
 - If a subtask calls `request_human_input`, you'll see a notification
 - Respond via `send_message_to_subtask` with the needed information
 - Resume your polling loop
@@ -174,12 +188,14 @@ When all subtasks are complete (or all remaining ones have permanently failed):
 User: "Refactor the auth middleware and update all routes that use it"
 
 Decomposition:
+
 1. Research current auth middleware usage (subtask)
 2. Implement new auth middleware (subtask, depends on 1)
 3. Update API routes to use new middleware (subtask, depends on 2)
 4. Update tests (subtask, depends on 2 and 3)
 
 Dispatch sequence:
+
 - Dispatch subtask 1 immediately
 - Poll every 300s until subtask 1 completes
 - Dispatch subtask 2 with subtask 1's output as context

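Review note: the duplicate check added to step 3 and to the polling loop can be sketched as below. This is a minimal illustration, not SAM's implementation. The `TaskRecord` shape, its field names, and the status strings in `ACTIVE_STATUSES` are assumptions made for the example. In practice the records would be built from whatever `list_tasks` and `list_project_agents` return.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical status names; the real SAM task statuses may differ.
ACTIVE_STATUSES = {"queued", "in_progress"}


@dataclass(frozen=True)
class TaskRecord:
    """Illustrative task shape; field names are assumptions, not the SAM schema."""
    task_id: str
    title: str
    prompt: str
    status: str
    branch: Optional[str] = None
    pr_url: Optional[str] = None


def find_active_duplicates(failed: TaskRecord, tasks: list[TaskRecord]) -> list[TaskRecord]:
    """Return active tasks that share a title, prompt, branch, or PR with the failed task."""
    duplicates = []
    for task in tasks:
        # Skip the failed task itself and anything that is no longer running.
        if task.task_id == failed.task_id or task.status not in ACTIVE_STATUSES:
            continue
        same_title = task.title == failed.title
        same_prompt = task.prompt == failed.prompt
        # Only compare branch and PR when the failed task actually has one.
        same_branch = failed.branch is not None and task.branch == failed.branch
        same_pr = failed.pr_url is not None and task.pr_url == failed.pr_url
        if same_title or same_prompt or same_branch or same_pr:
            duplicates.append(task)
    return duplicates
```

If the returned list is non-empty, the new wording says to coordinate with the running task rather than dispatch another copy.
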
diff --git a/.claude/rules/09-task-tracking.md b/.claude/rules/09-task-tracking.md
@@ -29,18 +29,18 @@ Findings that exist only in the Research section without a corresponding checkli
 
 Before moving ANY task from `tasks/active/` to `tasks/archive/`, you MUST run the `task-completion-validator` agent (`.claude/agents/task-completion-validator/`). This agent performs six cross-reference checks:
 
-| Check | What it catches |
-|-------|----------------|
-| **A: Research → Checklist** | Research findings that never became checklist items |
-| **B: Checklist → Diff** | Checklist items checked off but not actually in the code changes |
-| **C: Criteria → Tests** | Acceptance criteria with no test or manual verification |
-| **D: UI → Backend** | UI form fields that collect input but never send it to the API |
-| **E: Multi-Resource** | Selection functions that pick from a set without a discriminator |
-| **F: Vertical Slice** | Cross-boundary features tested only in isolation with empty mocks instead of vertical slice tests with realistic state (see `35-vertical-slice-testing.md`) |
+| Check                       | What it catches                                                                                                                                             |
+| --------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **A: Research → Checklist** | Research findings that never became checklist items                                                                                                         |
+| **B: Checklist → Diff**     | Checklist items checked off but not actually in the code changes                                                                                            |
+| **C: Criteria → Tests**     | Acceptance criteria with no test or manual verification                                                                                                     |
+| **D: UI → Backend**         | UI form fields that collect input but never send it to the API                                                                                              |
+| **E: Multi-Resource**       | Selection functions that pick from a set without a discriminator                                                                                            |
+| **F: Vertical Slice**       | Cross-boundary features tested only in isolation with empty mocks instead of vertical slice tests with realistic state (see `35-vertical-slice-testing.md`) |
 
 ### Validation Rules
 
-- **CRITICAL/HIGH findings block merge.** Fix them in the branch before merging. Filing a backlog task is NOT an acceptable alternative — the validator exists to catch gaps *before* they ship, not to generate follow-up work. The only exception is explicit human approval to defer a specific finding.
+- **CRITICAL/HIGH findings block merge.** Fix them in the branch before merging. Filing a backlog task is NOT an acceptable alternative — the validator exists to catch gaps _before_ they ship, not to generate follow-up work. The only exception is explicit human approval to defer a specific finding.
 - **A validator FAIL means the task is not complete.** Return to implementation. Do NOT proceed to PR creation or merge.
 - **Do NOT rationalize gaps.** "It works when I test it manually" is not an answer to "no test covers this acceptance criterion." Either add the test or document the manual verification with evidence.
 - **"Fix or defer" is not a real choice.** If you have time to write a backlog task file, you have time to write the test or fix the gap. The backlog escape hatch has been abused in every case where it was used (PR #568, PR #570) — the follow-up tasks add friction and delay but deliver the same work that should have been done in the original PR.
@@ -54,6 +54,7 @@ Before moving ANY task from `tasks/active/` to `tasks/archive/`, you MUST run th
 ## Acceptance Criteria Must Be Testable
 
 When writing acceptance criteria, each criterion must be verifiable by at least one of:
+
 - An automated test (unit, integration, or E2E)
 - A documented manual verification with evidence (screenshot, API response, log output)
 
@@ -63,6 +64,12 @@ Criteria like "User with both providers can select which provider to use" requir
 
 When dispatching a task to another agent (via `dispatch_task` or any other mechanism), the task description MUST instruct the receiving agent to execute the work using the `/do` skill. The `/do` skill is the standard end-to-end workflow for implementing tasks — it handles research, planning, implementation, review, staging verification, and PR creation.
 
+### Read-Only Requests Are Not Implementation Tasks
+
+PR status, PR history, task status, and diagnostic/investigation questions are read-only by default. Answer them in the current session using SAM MCP tools, GitHub/`gh`, logs, and local repo evidence.
+
+Do not create a task file, branch, commit, or PR for a read-only status/history request unless the user explicitly asks for code changes, config changes, a durable artifact, or a delegated task. Repeated recent failures came from treating simple status/history questions as full SAM task executions, which created branches and failed sessions without improving the answer.
+
 ### How to Write Dispatch Descriptions
 
 Include an explicit instruction to use `/do` in the task description. Example:
@@ -92,6 +99,18 @@ If the requested specialist/profile is not available or cannot be observed from
 
 When a dispatched task returns, treat its output as usable only after checking that it came from the intended task/profile and respected the original constraints. If the result was produced by the wrong profile, ignored `draft PR`/`do not merge`, dropped the requested branch, or skipped `/do` when required, document the mismatch and do not use it as validation evidence.
 
+### Before Retrying a Failed Dispatch
+
+Before retrying or redispatching the same work after a SAM task fails, diagnose the failed start:
+
+- Call `get_task_details` for the failed task and read any output summary, branch, PR URL, and status evidence.
+- If there is a session, read enough messages to distinguish no-workspace/startup failure, transient provider error, human-cancel recovery, wrong profile, or real task failure.
+- Call `list_tasks`/`list_project_agents` to check for active duplicates with the same prompt, title, branch, or PR.
+- If an active duplicate exists, inspect or coordinate with it instead of creating another copy.
+- If the failure was a transient provider or platform startup issue, confirm that current platform behavior has not already resolved it before adjusting and retrying.
+
+Do not blindly submit the same prompt repeatedly after no-workspace/startup failures, provider overloads, or immediately failed sessions. If the cheapest evidence does not reveal why the task failed, report the failure with the exact task IDs and observed state instead of multiplying duplicate tasks.
+
 ### Why This Matters
 
 Without the `/do` instruction, a dispatched agent may skip critical phases like staging verification, specialist review, or proper PR creation. The `/do` workflow enforces all quality gates defined in this project's rules.

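Review note: the "Before Retrying a Failed Dispatch" rule reads as a small decision procedure, sketched below. The failure categories come from the rule text. The function name, the action labels, and the choice to treat every other diagnosed failure as eligible for an adjusted retry are assumptions made for illustration; the rule leaves those cases to judgment.

```python
from enum import Enum


class FailureKind(Enum):
    """Failure categories named in the rule text."""
    STARTUP = "no-workspace/startup failure"
    TRANSIENT_PROVIDER = "transient provider error"
    HUMAN_CANCEL = "human-cancel recovery"
    WRONG_PROFILE = "wrong profile"
    TASK_FAILURE = "real task failure"
    UNDIAGNOSED = "evidence did not reveal the cause"


# Kinds for which the rule asks to confirm platform state before retrying.
PLATFORM_KINDS = {FailureKind.STARTUP, FailureKind.TRANSIENT_PROVIDER}


def next_action(kind: FailureKind, has_active_duplicate: bool,
                platform_checked: bool = False) -> str:
    """Pick the next step after a failed dispatch (action labels are hypothetical)."""
    if has_active_duplicate:
        # An active copy exists: inspect or coordinate instead of redispatching.
        return "coordinate-with-duplicate"
    if kind is FailureKind.UNDIAGNOSED:
        # Report exact task IDs and observed state rather than multiplying tasks.
        return "report-failure"
    if kind in PLATFORM_KINDS and not platform_checked:
        # Confirm the platform has not already resolved the issue first.
        return "confirm-platform-state"
    # Assumption: any other diagnosed failure may be retried with adjustments.
    return "retry-adjusted"
```

The ordering matters: the duplicate check comes first, so a running copy suppresses a retry no matter how the original failure is classified.
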
diff --git a/.codex/prompts/workflow.md b/.codex/prompts/workflow.md
@@ -48,22 +48,27 @@ Untested assumptions are not blockers.
    # Workflow State
 
    ## Goal
+
    <one-line summary>
 
    ## Subtasks
-   | # | Description | Task ID | Status | Branch | Notes |
-   |---|------------|---------|--------|--------|-------|
-   | 1 | ... | pending | ... | ... | ... |
-   | 2 | ... | pending | ... | ... | ... |
+
+   | #   | Description | Task ID | Status | Branch | Notes |
+   | --- | ----------- | ------- | ------ | ------ | ----- |
+   | 1   | ...         | pending | ...    | ...    | ...   |
+   | 2   | ...         | pending | ...    | ...    | ...   |
 
    ## Dependencies
+
    - Task 2 depends on Task 1
    - Tasks 3 and 4 can run in parallel
 
    ## Poll Count
+
    0
 
    ## Last Poll
+
    (not yet)
    ```
 
@@ -92,13 +97,16 @@ For each subtask that has no unmet dependencies:
 
    If any of these checks fail, do not wait on the subtask. Re-dispatch with corrected instructions or report the failure with exact status evidence.
 
+   Before retrying the same prompt, inspect the failed task/session and check `list_tasks`/`list_project_agents` for active duplicates with the same title, branch, prompt, or PR. If a duplicate is already running, coordinate with it instead of creating another copy. Do not blindly redispatch after no-workspace/startup failures or transient provider failures.
+
 4. **Call `update_task_status`** after each dispatch: "Dispatched subtask N: <description>"
 
 ---
 
 ## Phase 3: Foreground Polling Loop (CRITICAL)
 
 This is the most important phase. You MUST poll actively to:
+
 - Keep the session alive (prevent timeout kills)
 - Detect subtask completion and trigger dependent work
 - Report progress to the user
@@ -122,7 +130,8 @@ REPEAT until all subtasks are complete or failed:
        - Call get_peer_agent_output(taskId) to review the result
     6. If any subtask failed:
        - Review the failure via get_task_details
-       - Decide: retry_subtask with adjusted description, or mark as failed
+       - Check for duplicate active work with the same prompt, branch, title, or PR
+       - Decide: retry with adjusted description only after diagnosing the failure, or mark as failed
        - Update .workflow-state.md
     7. If all subtasks are complete: exit loop
     8. If all remaining subtasks are failed and no retries are possible: exit loop
@@ -141,6 +150,7 @@ REPEAT until all subtasks are complete or failed:
 ### What to Do If Context Feels Fuzzy
 
 If after context compaction you're unsure what's happening:
+
 1. Read `.workflow-state.md` — it has the complete state
 2. Call `list_tasks` to see all your subtasks
 3. Call `get_task_details` for each active subtask
@@ -168,22 +178,26 @@ When all subtasks are complete (or all remaining ones have permanently failed):
 ## Handling Common Scenarios
 
 ### Subtask produces a PR that needs to merge before the next step
+
 - After the subtask completes, check if it created a PR via `get_task_details`
 - If the PR is merged, proceed with dependent subtasks
 - If the PR is open, note this in your status update — the dependent subtask should be dispatched to the PR's branch
 
 ### Subtask fails
+
 - Read the failure details via `get_task_details` and `get_peer_agent_output`
 - If it's a transient failure (timeout, resource issue), retry with `retry_subtask`
 - If it's a permanent failure (wrong approach, missing prerequisite), adjust the description and retry, or skip and note in the summary
 - Maximum 2 retries per subtask
 
 ### You're running out of time
+
 - Push all branches, update all task files
 - Call `update_task_status` with current state: what's done, what's in progress, what's remaining
 - Do NOT rush to merge incomplete work
 
 ### A subtask needs input from you
+
 - If a subtask calls `request_human_input`, you'll see a notification
 - Respond via `send_message_to_subtask` with the needed information
 - Resume your polling loop
@@ -195,12 +209,14 @@ When all subtasks are complete (or all remaining ones have permanently failed):
 User: "Refactor the auth middleware and update all routes that use it"
 
 Decomposition:
+
 1. Research current auth middleware usage (subtask)
 2. Implement new auth middleware (subtask, depends on 1)
 3. Update API routes to use new middleware (subtask, depends on 2)
 4. Update tests (subtask, depends on 2 and 3)
 
 Dispatch sequence:
+
 - Dispatch subtask 1 immediately
 - Poll every 300s until subtask 1 completes
 - Dispatch subtask 2 with subtask 1's output as context
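
Review note: the dispatch sequence in the example follows from the dependency list, since a subtask is dispatched once everything it depends on is complete. A minimal sketch of that readiness check, using the four-subtask decomposition above, might look like this. The function name and the dictionary representation are assumptions for the example, not part of the workflow prompt.

```python
def ready_subtasks(dependencies: dict[int, set[int]],
                   completed: set[int],
                   dispatched: set[int]) -> list[int]:
    """Return subtasks that are not yet dispatched and have no unmet dependencies."""
    return sorted(
        number
        for number, deps in dependencies.items()
        if number not in dispatched and deps <= completed
    )


# Dependencies from the auth middleware example: 2 needs 1, 3 needs 2, 4 needs 2 and 3.
EXAMPLE_DEPENDENCIES = {1: set(), 2: {1}, 3: {2}, 4: {2, 3}}
```

Calling `ready_subtasks(EXAMPLE_DEPENDENCIES, set(), set())` yields only subtask 1. After subtask 1 completes, subtask 2 becomes ready, and so on down the chain.
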