Problem
When a multi-agent orchestrator session's ID is shared with a running Copilot CLI terminal session, PolyPilot cannot restore the orchestrator on restart. The SDK's ResumeSessionAsync reports the session as "corrupted" because the events.jsonl file is locked by the CLI process.
Root Cause
- PolyPilot creates an orchestrator session (e.g., Implement & Challenge-orchestrator) via the Copilot SDK
- A Copilot CLI terminal session takes over that same session ID (writes to the same events.jsonl)
- The CLI process holds a file lock (inuse.<pid>.lock) on the session directory
- On PolyPilot restart, RestorePreviousSessionsAsync calls ResumeSessionAsync, which fails because the events file is locked by the CLI process
- The SDK error message says "session file is corrupted" — misleading, since the data isn't actually corrupt
Current Behavior
- The orchestrator session silently disappears from the multi-agent group after restart
- The UI shows "Session data appears corrupted" if the user manually tries to resume the session
- The multi-agent group is left without an orchestrator, making it non-functional
Expected Behavior
- PolyPilot should detect the lock conflict and do one of the following:
  - a) Show a clear message: "Session is locked by Copilot CLI (PID XXXX). Close the CLI session to restore." with an option to force-create a new session
  - b) Automatically detect the lock file, check whether the owning PID is still alive, and create a fresh orchestrator session if the lock is stale
  - c) Avoid sharing session IDs between PolyPilot-managed sessions and external Copilot CLI sessions entirely
Steps to Reproduce
- Create a multi-agent team (e.g., "MultiAgentPRetty") with OrchestratorReflect mode
- Open a Copilot CLI terminal that uses the orchestrator's session ID
- Restart PolyPilot (e.g., via relaunch.ps1)
- Observe the orchestrator session is missing from the group
Technical Details
- Session ID: 74895dba-fd61-4a73-9a5d-4576a146aa0b
- Lock file: ~/.copilot/session-state/<id>/inuse.<pid>.lock
- Events file: 15 MB events.jsonl (actively written by CLI)
- The IsCorruptSessionError check in SessionSidebar.razor catches this case
- Current fallback in RestorePreviousSessionsAsync only handles "Session not found", not corruption/lock errors
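To make the gap in the fallback concrete, here is a minimal sketch of a classifier that separates the two error messages quoted in this issue. RestoreFailureKind and RestoreFailureClassifier are hypothetical names. Matching on message text is an assumption, since the exception types the SDK throws are not known here, and the real IsCorruptSessionError check may work differently.

```csharp
using System;

// Hypothetical types for illustration; not part of PolyPilot or the Copilot SDK.
public enum RestoreFailureKind { NotFound, CorruptOrLocked, Other }

public static class RestoreFailureClassifier
{
    // Classifies a resume failure by the message text quoted in this issue.
    public static RestoreFailureKind Classify(Exception ex)
    {
        var message = ex.Message ?? string.Empty;

        if (message.Contains("Session not found", StringComparison.OrdinalIgnoreCase))
            return RestoreFailureKind.NotFound;

        // The SDK reports "session file is corrupted" when events.jsonl is
        // locked, so this bucket covers both real corruption and lock conflicts.
        if (message.Contains("corrupted", StringComparison.OrdinalIgnoreCase))
            return RestoreFailureKind.CorruptOrLocked;

        return RestoreFailureKind.Other;
    }
}
```

With a split like this, the restore path could treat CorruptOrLocked as a signal to look for a lock file before deciding whether to recreate the session, instead of dropping the orchestrator silently.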
Suggested Approach
- In RestorePreviousSessionsAsync, before calling ResumeSessionAsync, check for inuse.*.lock files in the session directory. If a lock exists and the PID is alive, skip resume and log a warning.
- For multi-agent orchestrator sessions specifically, consider always creating a fresh session on restore (orchestrators are stateless planners — conversation history isn't critical).
- Improve the SDK error message or add PolyPilot-side lock detection to give users actionable information.
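The lock check in the first bullet could look roughly like the sketch below. SessionLockProbe and FindLiveLockOwner are hypothetical names, not existing PolyPilot or SDK APIs. The sketch assumes the lock file name has exactly the inuse.<pid>.lock shape listed under Technical Details.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.IO;

// Hypothetical helper for illustration; not part of PolyPilot or the Copilot SDK.
public static class SessionLockProbe
{
    // Returns the PID of a live process named by an inuse.<pid>.lock file in
    // sessionDir, or null if there is no lock file or every lock is stale.
    public static int? FindLiveLockOwner(string sessionDir)
    {
        if (!Directory.Exists(sessionDir))
            return null;

        foreach (var path in Directory.EnumerateFiles(sessionDir, "inuse.*.lock"))
        {
            // Expected file name shape: inuse.<pid>.lock
            var parts = Path.GetFileName(path).Split('.');
            if (parts.Length != 3 || !int.TryParse(parts[1], out var pid))
                continue;

            if (IsProcessAlive(pid))
                return pid;
        }
        return null;
    }

    private static bool IsProcessAlive(int pid)
    {
        try
        {
            using var process = Process.GetProcessById(pid);
            return !process.HasExited;
        }
        catch (ArgumentException)
        {
            // No running process has this ID, so the lock is stale.
            return false;
        }
        catch (InvalidOperationException)
        {
            // The process exited while it was being inspected.
            return false;
        }
        catch (Win32Exception)
        {
            // The process exists but cannot be queried (e.g., access denied);
            // treat it as alive to stay on the safe side.
            return true;
        }
    }
}
```

A caller in RestorePreviousSessionsAsync could pass the session directory under ~/.copilot/session-state/<id> and, when a PID comes back, skip ResumeSessionAsync and log a warning that names the PID. One caveat: operating systems reuse PIDs, so a live PID does not prove the process is the Copilot CLI. Checking the process name as well would tighten this.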
Related
b45f88c: improved UI error message as interim fix