-
Notifications
You must be signed in to change notification settings - Fork 39
fix: Fresh client per turn + custom restart tool for Claude SDK #502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
This comment has been minimized.
This comment has been minimized.
Claude Code ReviewSummaryThis PR reverts persistent Claude SDK client functionality due to blocking behavior on second calls to Verdict: ✅ Approved with minor recommendations The solution is pragmatic and well-reasoned. The PR properly handles SDK limitations and includes good defensive programming patterns. Issues by Severity🚫 Blocker IssuesNone - Code is safe to merge. 🔴 Critical IssuesNone - No security vulnerabilities or critical bugs detected. 🟡 Major Issues1. Token Redaction Pattern Incomplete (Security)Location: if github_token:
error_msg = error_msg.replace(github_token, "***REDACTED***")
if gitlab_token:
error_msg = error_msg.replace(gitlab_token, "***REDACTED***")Issue: Token redaction only happens in the error case. However, tokens are embedded in URLs passed to Security Risk: Medium - tokens could leak in subprocess output if git prints the URL. Recommendation: Also redact tokens from stdout: stdout_str = stdout.decode()
stderr_str = stderr.decode()
if github_token:
stdout_str = stdout_str.replace(github_token, "***REDACTED***")
stderr_str = stderr_str.replace(github_token, "***REDACTED***")
if gitlab_token:
stdout_str = stdout_str.replace(gitlab_token, "***REDACTED***")
stderr_str = stderr_str.replace(gitlab_token, "***REDACTED***")
logger.error(f"Failed to clone repo: {stderr_str}")2. Race Condition in Feature Branch CreationLocation: Issue: If two calls to Impact: Low probability but could cause transient failures. Recommendation: Use a file lock around the clone operation: from filelock import FileLock
lock_path = repos_dir / f"{name}.lock"
async with FileLock(lock_path, timeout=30):
if repo_final.exists():
return True, str(repo_final), False
# ... rest of clone logicAlternative: Keep current implementation but document the idempotency behavior in a comment. 🔵 Minor Issues1. Incomplete DocstringLocation: async def disconnect_client(self) -> None:
"""
No-op method for compatibility (client is no longer persistent).
Previously used when client was persistent across runs.
Now each run creates and destroys its own client.
"""
logger.debug("disconnect_client() called but client is not persistent (no-op)")Issue: Method exists solely for backward compatibility but doesn't explain why it's still needed. Recommendation: Add context about callers: """
No-op method for backward compatibility.
This method is called by main.py in change_workflow(), add_repo(), and remove_repo()
to ensure clean state transitions. With the fresh-client-per-turn pattern, no cleanup
is needed since clients self-destruct after each run.
Previously used when client was persistent across runs.
"""2. Missing Error Context in Feature Branch CreationLocation: if checkout_process.returncode \!= 0:
logger.warning(f"Failed to create feature branch '{feature_branch}': {checkout_stderr.decode()}")
# Continue anyway - repo is still usable on the original branchIssue: Warning logged but no indication to user why they might see different branch than expected. Recommendation: Emit a RawEvent notification (similar to adapter.py patterns) so the user sees it in the UI: if checkout_process.returncode \!= 0:
logger.warning(f"Failed to create feature branch '{feature_branch}': {checkout_stderr.decode()}")
# Consider emitting event via adapter notification system3. Inconsistent Logging VerbosityLocation: async for message in client.receive_response():
message_count += 1
logger.info(f"[ClaudeSDKClient Message #{message_count}]: {message}")Issue: This will log every single streaming message at INFO level, which could be very noisy (potentially hundreds of messages per turn). Recommendation: Reduce to DEBUG level: logger.debug(f"[ClaudeSDKClient Message #{message_count}]: {message}")Or add a throttle: if message_count % 10 == 0: # Log every 10th message
logger.info(f"Processed {message_count} messages...")4. Shell Script Error HandlingLocation: if (cd "" && git checkout -b "" 2>&1); then
echo " ✓ Created feature branch: "Issue: If the branch already exists (e.g., from a previous failed run), Recommendation: if (cd "" && git checkout -B "" 2>&1); then
echo " ✓ Created/switched to feature branch: "Positive Highlights1. ✅ Excellent Problem AnalysisThe PR description clearly explains the root cause (SDK limitation) and trade-offs. This kind of transparency is valuable for future maintainers. 2. ✅ Proper Revert StrategyRather than attempting complex workarounds, the team recognized when to revert. This shows good engineering judgment. 3. ✅ Reconnection LogicLocation: The query reconnection handler is defensive and well-structured: try:
await client.query(prompt)
except Exception as query_error:
if any(keyword in error_str for keyword in ["connection", "subprocess", "pipe", "broken"]):
# Reconnect with fresh client
client = create_sdk_client(options)
await client.connect()
await client.query(prompt)
else:
raiseThis follows the error handling patterns from 4. ✅ Idempotency in add_repoLocation: The if was_newly_cloned:
# Only update state and notify if actually cloned
os.environ["REPOS_JSON"] = json.dumps(repos)
asyncio.create_task(trigger_repo_added_notification(name, url))
else:
logger.info(f"Repo '{name}' already existed, skipping notification (idempotent call)")This is good defensive programming for concurrent operations. 5. ✅ Race Condition FixLocation: Moving # Per-run state - NOT instance variables to avoid race conditions with concurrent runs
current_message_id: Optional[str] = NoneThis is exactly the right fix per the commit message. Prevents cross-contamination between concurrent runs. 6. ✅ Feature Branch AutomationThe automatic creation of RecommendationsPriority 1: Fix Token Redaction (Security)Address the incomplete token redaction in stdout (see Major Issue #1). Priority 2: Reduce Logging VerbosityChange streaming message logs from INFO to DEBUG level (see Minor Issue #3). Priority 3: Documentation
Priority 4: TestingConsider adding integration tests for:
Code Quality Assessment
Final Verdict✅ LGTM with minor fixes recommended This PR correctly addresses a blocking SDK issue and includes thoughtful improvements to git workflow management. The code follows established project patterns and includes good defensive programming practices. Suggested merge strategy:
Estimated time to address recommendations: 15 minutes References
🔍 View AI decision process (logs available for 90 days) 📋 View memory system files loaded (click to expand)What Amber Loaded for Code ReviewAmber automatically loaded these repository standards from the memory system:
Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines. |
Claude Code ReviewSummaryThis PR reverts an attempted optimization to reuse Claude SDK clients across message turns, which caused the system to hang indefinitely on the second message. The root cause is a limitation in the Python Claude SDK where the receive_response() iterator cannot be reused. The solution is sound: revert to fresh-client-per-turn while using continue_conversation=True to maintain state via disk persistence. Verdict: Acceptable with minor improvements recommended. The revert addresses a critical blocking issue, and the trade-off (1-2s latency vs. working system) is appropriate. Issues by SeverityBlocker IssuesNone. The PR addresses a blocker (hanging on second message) and unblocks the system. Critical Issues1. Race Condition Risk with current_message_id (adapter.py:292-295) The PR converts _current_message_id and _current_tool_id from instance variables to local variables to avoid race conditions. However:
Recommendation: Search for all uses of _current_tool_id and convert them to local variables, or remove the instance variable declaration if unused. Major Issues1. Incomplete Error Handling in Reconnection Logic (adapter.py:527-550) The reconnection logic is good but has gaps:
Recommendation: Add retry limits (MAX_RETRIES = 2) and handle timeout errors 2. Logging Verbosity May Impact Performance (adapter.py:561-563) The PR adds per-message logging in the hot path. For sessions with many tool calls or long responses, this could generate massive logs and impact performance. Recommendation: Use logger.debug() instead or add conditional logging (every 10th message) 3. Feature Branch Creation Without Error Recovery (main.py:628-643) The feature branch creation logic logs a warning if it fails but continues anyway, and doesn't update the user or context about the actual branch being used. Recommendation: Return the actual branch name and update prompt context 4. Duplicate Logic Between hydrate.sh and clone_repo_at_runtime Both create feature branches with nearly identical logic. This violates DRY and creates maintenance burden. Recommendation: Extract branch creation logic to a shared shell function or Python helper. Minor Issues
Positive Highlights
RecommendationsPriority 1 (Should Fix Before Merge)
Priority 2 (Address in Follow-up)
Priority 3 (Nice-to-Have)
Testing RecommendationsAlready Tested: Multiple consecutive messages (no hanging) Recommended Additional Tests:
Final VerdictApprove with Recommendations This PR fixes a critical blocking issue and makes the system usable again. The trade-off between latency and correctness is appropriate. The suggested improvements are mostly polish and future-proofing, not blockers. Confidence Level: High (reviewed against established patterns, tested for hanging issue) Review conducted using memory system context: backend-development.md, error-handling.md, and Python configuration standards from CLAUDE.md 🔍 View AI decision process (logs available for 90 days) 📋 View memory system files loaded (click to expand)What Amber Loaded for Code ReviewAmber automatically loaded these repository standards from the memory system:
Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines. |
Claude Code ReviewSummaryThis PR successfully fixes the SDK hanging issue on second messages by implementing a fresh client per turn pattern with disk-based conversation state. The approach is pragmatic and well-reasoned given SDK behavior. Additional improvements include race condition fixes, feature branch automation, and reconnection handling. Issues by Severity🚫 Blocker IssuesNone - No blocking issues found. 🔴 Critical IssuesNone - No critical security or correctness issues. 🟡 Major Issues1. Python Code Standards - Missing Docstrings (adapter.py:292-293) Issue: While the comment is helpful, Python standards (PEP 257) require docstrings for functions. The function docstring exists but does not mention the race condition fix. Recommendation: Update the function docstring to explicitly mention the race condition fix. 2. Error Handling - Broad Exception Catching (adapter.py:529-551) Issue: Catching Exception is too broad and relies on string matching for error classification. This could mask unexpected errors. Recommendation: Document the specific SDK exceptions, consider catching more specific exception types, add logging for the else branch before re-raising. 3. Git Operations Without Depth Limit (main.py:608, hydrate.sh:175) Issue: Removed --depth 1 to support branch operations, but this means cloning full history which could be slow/large for repos with extensive history. Recommendation: Use --depth 1 initially, then use git fetch --unshallow only if branch creation fails. 🔵 Minor Issues1. Logging Verbosity Change (adapter.py:560-561) Changed from logger.info to logger.debug for message streaming. This reduces visibility into message processing. Recommendation: Consider keeping INFO level but throttling or using structured logging levels. 2. Feature Branch Name Pattern (main.py:631, hydrate.sh:180) Uses AGENTIC_SESSION_NAME which is typically a UUID. Document this branch naming convention. 3. Race Condition Fix Documentation (adapter.py:73-75) The comment mentions both _current_message_id and _current_tool_id but only current_message_id is visible. Clean up the comment. 4. Idempotency Return Value (main.py:789) Good: The newly_cloned field helps callers understand if this was a new clone. Consider documenting this in the endpoint docstring. Positive Highlights✅ Excellent Problem Analysis: Clear explanation of root cause and trade-offs RecommendationsImmediate Actions (Before Merge)
Follow-up Tasks (After Merge)
Testing Checklist
Architecture CompliancePython Standards:
Error Handling:
Code Quality:
ConclusionRecommendation: APPROVE with minor improvements This PR effectively solves the SDK hanging issue with a well-reasoned approach. The race condition fix, feature branch automation, and idempotency handling are excellent. The suggested improvements are minor and could be addressed in follow-up work. The trade-off of 1-2 second overhead per message for reliability is acceptable given the alternative is a non-functional system. Key Strengths:
Minor Improvements:
Great work on identifying and fixing a critical reliability issue! 🎉 🔍 View AI decision process (logs available for 90 days) 📋 View memory system files loaded (click to expand)What Amber Loaded for Code ReviewAmber automatically loaded these repository standards from the memory system:
Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines. |
…tory handling - Updated ClaudeCodeAdapter to use local variables for message IDs to prevent race conditions during concurrent runs. - Introduced a new `restart_session_tool` for handling session restarts via MCP tools. - Enhanced repository cloning logic to create and checkout feature branches named 'ambient/<session-id>' when cloning at runtime. - Modified the add_repo function to only trigger notifications for newly cloned repositories, preventing duplicate notifications. - Improved logging for better observability of session and repository operations. This refactor enhances the reliability and clarity of session management and repository interactions within the ClaudeCodeAdapter.
07a8d79 to
9e4c4c0
Compare
Claude Code ReviewSummaryFresh client per turn pattern fixes SDK hanging. Well-documented with race condition fixes, feature branch automation, and idempotency improvements. Issues by Severity🚫 Blocker IssuesNone 🔴 Critical IssuesNone 🟡 Major Issues1. Git Clone Performance (main.py:608) 2. Branch Failure Handling (main.py:641-645) 🔵 Minor Issues
Positive Highlights✅ Excellent problem analysis RecommendationsBefore Merge: Restore --depth 1 with fallback, update workspace prompt After Merge: Add test coverage, monitor overhead ConclusionAPPROVE with Minor Improvements - Fixes critical SDK issue. Priority: git clone performance. 🔍 View AI decision process (logs available for 90 days) 📋 View memory system files loaded (click to expand)What Amber Loaded for Code ReviewAmber automatically loaded these repository standards from the memory system:
Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines. |
sallyom
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should only auto-generate a new branch if a user does not provide a branch in the context modal. Other than that this LGTM
| if repos_cfg: | ||
| session_id = os.getenv('AGENTIC_SESSION_NAME', '').strip() | ||
| feature_branch = f"ambient/{session_id}" if session_id else None | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update #498 to use this branch name - but I think we don't need to always auto-create a branch - we should only auto-create a branch if a user does not provide a branch name in the context modal.
Summary
Fixes hanging behaviour on second message by ensuring fresh Claude SDK client per message turn, plus adds a custom MCP tool for Claude to restart itself.
What Changed
Fresh Client Per Turn Pattern
ClaudeSDKClient(connect → query → receive → disconnect)continue_conversation=Truemaintains state from disk (.claude/directory)receive_response()hangs indefinitely on subsequent callsRace Condition Fix
current_message_idfrom instance variable to local variable in_run_claude_agent_sdkCustom Restart MCP Tool (NEW)
mcp__session__restart_sessiontool that Claude can call to request a session restart@tooldecorator andcreate_sdk_mcp_server()from the Claude Agent SDKsession_restart_requestedevent after run completesFeature Branch on Runtime Clone
clone_repo_at_runtimecreatesambient/<session-id>branches when repos are added dynamicallyhydrate.shstartup (repos stay on original branch at init)Idempotent Repo Addition
/add-repoendpoint now returnswas_newly_clonedflagTrade-offs
Performance:
Correctness:
Testing
Tested with multiple consecutive messages - no hanging, proper conversation flow maintained.