@csfet9 commented Dec 16, 2025

Summary

Adds support for Anthropic Claude and LM Studio as LLM providers for Hindsight.

Changes

  • Anthropic Provider: Full async support for Claude models (Sonnet 4, Haiku 4.5, etc.)
  • LM Studio Provider: Support for local model inference via LM Studio's OpenAI-compatible API
  • JSON Compatibility Fix: Handle markdown-wrapped JSON responses from local models (see the sketch below)
  • Documentation: Updated .env.example with configuration examples for all providers
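
For reference, a minimal sketch of the unwrapping logic, assuming a hypothetical extract_json helper (the real code lives in llm_wrapper.py):

import json

def extract_json(raw: str) -> dict:
    """Strip the markdown code fences that local models often wrap around JSON."""
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence line (``` or ```json).
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Keep only what comes before the closing fence.
        text = text.split("```")[0]
    return json.loads(text.strip())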

Tested With

  • ✅ Claude Sonnet 4 (claude-sonnet-4-20250514)
  • ✅ Claude Haiku 4.5 (claude-haiku-4-5-20251001)
  • ✅ Qwen 30B via LM Studio

Example Configuration

# Anthropic
HINDSIGHT_API_LLM_PROVIDER=anthropic
HINDSIGHT_API_LLM_API_KEY=your-anthropic-key
HINDSIGHT_API_LLM_MODEL=claude-haiku-4-5-20251001

# LM Studio
HINDSIGHT_API_LLM_PROVIDER=lmstudio
HINDSIGHT_API_LLM_API_KEY=lmstudio
HINDSIGHT_API_LLM_BASE_URL=http://localhost:1234/v1
HINDSIGHT_API_LLM_MODEL=your-model-name
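
As a rough illustration, these settings might map onto client construction along these lines (a sketch with illustrative names, not the actual wrapper code):

import os

from anthropic import AsyncAnthropic
from openai import AsyncOpenAI

provider = os.environ["HINDSIGHT_API_LLM_PROVIDER"]
api_key = os.environ["HINDSIGHT_API_LLM_API_KEY"]

if provider == "anthropic":
    client = AsyncAnthropic(api_key=api_key)
elif provider == "lmstudio":
    # LM Studio exposes an OpenAI-compatible API, so the OpenAI client
    # works unchanged once pointed at the local server.
    client = AsyncOpenAI(
        api_key=api_key,  # LM Studio accepts any non-empty key
        base_url=os.environ.get(
            "HINDSIGHT_API_LLM_BASE_URL", "http://localhost:1234/v1"
        ),
    )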

Files Changed

  • hindsight-api/pyproject.toml - Added anthropic dependency
  • hindsight-api/hindsight_api/engine/llm_wrapper.py - Anthropic + LM Studio implementation
  • hindsight-api/hindsight_api/config.py - LM Studio default URL
  • hindsight/hindsight/server.py - Updated docstrings
  • .env.example - Configuration examples

@nicoloboschi (Collaborator) left a comment


Thanks for your contribution! I've left two small comments.

csfet9 and others added 6 commits December 17, 2025 16:14
Add configurable timeout support for LLM API calls:
- Environment variable override via HINDSIGHT_API_LLM_TIMEOUT
- Dynamic heuristic for lmstudio/ollama: 20 mins for large models
  (30b, 33b, 34b, 65b, 70b, 72b, 8x7b, 8x22b), 5 mins for others
- Pass timeout to Anthropic, OpenAI, and local model clients
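
Roughly, the heuristic looked like this (an illustrative sketch; as noted further down, it was later dropped in favor of a plain configurable timeout):

import os

LARGE_MODEL_MARKERS = ("30b", "33b", "34b", "65b", "70b", "72b", "8x7b", "8x22b")

def resolve_timeout(provider: str, model: str) -> float:
    """Pick a request timeout in seconds; explicit config wins over the heuristic."""
    override = os.environ.get("HINDSIGHT_API_LLM_TIMEOUT")
    if override:
        return float(override)
    if provider in ("lmstudio", "ollama"):
        # Big models on local hardware can be very slow to respond.
        if any(marker in model.lower() for marker in LARGE_MODEL_MARKERS):
            return 20 * 60.0  # 20 minutes
        return 5 * 60.0  # 5 minutes
    return 120.0  # remote APIs get a tighter default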

- Remove CLAUDE.md from .gitignore (should stay in repository)
- Pass max_completion_tokens to _call_anthropic instead of hardcoding 4096

Provides project context and development commands for AI-assisted coding.

- Add docker-compose.yml for local development
- Add test_internal.py for local testing
- Sync uv.lock and llm_wrapper.py changes

csfet9 and others added 2 commits December 19, 2025 21:20
- Move LLM config to config.py with HINDSIGHT_API_ prefix
  - Add HINDSIGHT_API_LLM_MAX_CONCURRENT (default: 32)
  - Add HINDSIGHT_API_LLM_TIMEOUT (default: 120s)
- Remove fragile model-size timeout heuristic
- Apply markdown JSON extraction to all providers, not just local
- Fix Anthropic markdown extraction bug (missing split)
- Change LLM request/response logs from info to debug level
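
A minimal sketch of how the two new settings might be wired together, assuming an asyncio-based wrapper (helper name and structure are illustrative):

import asyncio
import os

# Defaults mirror the new config values.
MAX_CONCURRENT = int(os.environ.get("HINDSIGHT_API_LLM_MAX_CONCURRENT", "32"))
TIMEOUT = float(os.environ.get("HINDSIGHT_API_LLM_TIMEOUT", "120"))

# A single shared semaphore caps in-flight LLM requests across the process.
_llm_semaphore = asyncio.Semaphore(MAX_CONCURRENT)

async def call_llm(client, **kwargs):
    """Rate-limited, time-bounded LLM call (illustrative helper)."""
    async with _llm_semaphore:
        return await asyncio.wait_for(
            client.chat.completions.create(**kwargs),
            timeout=TIMEOUT,
        )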

@csfet9 (Author) commented Dec 19, 2025

Thanks for the review! I've addressed all the feedback:

Changes made:
- Moved LLM_MAX_CONCURRENT to config.py with the proper HINDSIGHT_API_ prefix
- Added HINDSIGHT_API_LLM_TIMEOUT config (removed the fragile model-size heuristic)
- Applied markdown JSON extraction to all providers (not just local)
- Fixed a bug in Anthropic's markdown extraction (was missing .split("```")[0])
- Changed logger.info to logger.debug for LLM request/response logs
- Removed the local dev docker-compose.yml

Testing note: Hindsight is working well with qwen/qwen3-vl-8b via LM Studio.

csfet9 and others added 3 commits December 19, 2025 21:35

@nicoloboschi (Collaborator) left a comment


Thanks! We're almost there!

- Remove test_internal.py (debug file)
- Remove docker-compose.yml (to be moved to hindsight-cookbook repo)
