44 changes: 44 additions & 0 deletions .github/workflows/claude-code-review.yml
@@ -0,0 +1,44 @@
name: Claude Code Review

on:
  pull_request:
    types: [opened, synchronize, ready_for_review, reopened]
    # Optional: Only run on specific file changes
    # paths:
    #   - "src/**/*.ts"
    #   - "src/**/*.tsx"
    #   - "src/**/*.js"
    #   - "src/**/*.jsx"

jobs:
  claude-review:
    # Optional: Filter by PR author
    # if: |
    #   github.event.pull_request.user.login == 'external-contributor' ||
    #   github.event.pull_request.user.login == 'new-developer' ||
    #   github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR'

    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
      issues: read
      id-token: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      - name: Run Claude Code Review
        id: claude-review
        uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          plugin_marketplaces: 'https://github.com/anthropics/claude-code.git'
          plugins: 'code-review@claude-code-plugins'
          prompt: '/code-review:code-review ${{ github.repository }}/pull/${{ github.event.pull_request.number }}'
          # See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
          # or https://code.claude.com/docs/en/cli-reference for available options

50 changes: 50 additions & 0 deletions .github/workflows/claude.yml
@@ -0,0 +1,50 @@
name: Claude Code

on:
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
  issues:
    types: [opened, assigned]
  pull_request_review:
    types: [submitted]

jobs:
  claude:
    if: |
      (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
      (github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
      issues: read
      id-token: write
      actions: read # Required for Claude to read CI results on PRs
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      - name: Run Claude Code
        id: claude
        uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}

          # This is an optional setting that allows Claude to read CI results on PRs
          additional_permissions: |
            actions: read

          # Optional: Give a custom prompt to Claude. If this is not specified, Claude will perform the instructions specified in the comment that tagged it.
          # prompt: 'Update the pull request description to include a summary of changes.'

          # Optional: Add claude_args to customize behavior and configuration
          # See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
          # or https://code.claude.com/docs/en/cli-reference for available options
          # claude_args: '--allowed-tools Bash(gh pr:*)'

49 changes: 49 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,49 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

```bash
# Install dependencies
uv sync

# Start dev server (from repo root)
cd backend && uv run uvicorn app:app --reload --port 8000

# Or use the startup script
./run.sh
```

Web UI: <http://localhost:8000> | API docs: <http://localhost:8000/docs>

Always use `uv run` to execute Python commands (never `pip` or bare `python`). For example: `uv run python some_script.py`, `uv run pytest`, `uv add <package>`.

No tests or linting are configured.

## Architecture

RAG chatbot that answers questions about course materials using ChromaDB vector search and Claude AI with tool-calling.

**Query flow:** Frontend `POST /api/query` → `app.py` → `RAGSystem.query()` → `AIGenerator` calls Claude API with `search_course_content` tool → Claude may invoke tool → `CourseSearchTool` → `VectorStore.search()` (ChromaDB) → results fed back to Claude for final answer → response with sources returned to frontend.
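
The query flow above can be sketched with stand-ins for the model and the search tool. This is a minimal illustration, not the repository's code: `fake_model`, `answer_query`, the dictionary-shaped responses, and the tiny in-memory corpus are all hypothetical, and no real Claude API or ChromaDB call is made.

```python
from typing import Callable, Dict, Optional


def search_course_content(query: str) -> str:
    """Hypothetical stand-in for CourseSearchTool / VectorStore.search()."""
    corpus = {"rag": "Lesson 1: RAG combines retrieval with generation."}
    hits = [text for key, text in corpus.items() if key in query.lower()]
    return "\n".join(hits) or "No relevant content found."


def fake_model(prompt: str, tools_enabled: bool) -> dict:
    """Hypothetical stand-in for a Claude API call (response shape is assumed)."""
    if tools_enabled:
        # First call: the model decides to invoke the search tool.
        return {
            "type": "tool_use",
            "name": "search_course_content",
            "input": {"query": prompt},
        }
    # Second call: no tools offered, so the model answers from the context.
    return {"type": "text", "text": f"Answer based on: {prompt}"}


def answer_query(
    question: str,
    model: Callable[..., dict] = fake_model,
    tools: Optional[Dict[str, Callable[..., str]]] = None,
) -> str:
    """Run at most one tool round-trip, then return the final text."""
    tools = tools or {"search_course_content": search_course_content}
    first = model(question, tools_enabled=True)
    if first["type"] != "tool_use":
        return first["text"]  # the model answered directly
    tool_result = tools[first["name"]](**first["input"])
    # Feed the tool output back for the final answer, with tools disabled.
    second = model(f"{question}\n\nTool result:\n{tool_result}", tools_enabled=False)
    return second["text"]


print(answer_query("What is RAG?"))
```

Keeping the tool registry as a plain dictionary mirrors the role the text gives to `ToolManager`, under the assumption that tools are looked up by name.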

**Backend modules (all in `backend/`):**

- `app.py` — FastAPI entry point. Serves frontend static files from `../frontend`. On startup, auto-loads course docs from `../docs`. Two API endpoints: `POST /api/query`, `GET /api/courses`.
- `rag_system.py` — Orchestrator that wires together all components. The `query()` method is the main pipeline entry.
- `ai_generator.py` — Claude API wrapper. Sends user query with tool definitions, handles tool_use loop (executes tool, sends results back to Claude for final answer). System prompt is a static class variable.
- `search_tools.py` — `Tool` ABC + `CourseSearchTool` implementation + `ToolManager` registry. Tool tracks `last_sources` for the response; sources are reset after each query.
- `vector_store.py` — ChromaDB wrapper with two collections: `course_catalog` (course metadata, used for fuzzy course name resolution) and `course_content` (text chunks, used for semantic search). Uses `SentenceTransformerEmbeddingFunction`.
- `document_processor.py` — Reads `.txt/.pdf/.docx` files, extracts course/lesson metadata via regex, chunks text with sentence-boundary splitting.
- `session_manager.py` — In-memory conversation history per session. History is injected into the system prompt as plain text, not as a message array.
- `models.py` — Pydantic models: `Course`, `Lesson`, `CourseChunk`.
- `config.py` — Reads `.env`, exposes constants (model names, chunk size/overlap, max results).
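
The sentence-boundary chunking mentioned for `document_processor.py` might look roughly like the sketch below. It is an illustration only: the function name `chunk_text`, the regex used to split sentences, and measuring overlap in whole sentences rather than characters are assumptions, not details taken from the repository.

```python
import re
from typing import List


def chunk_text(text: str, chunk_size: int = 800, overlap_sentences: int = 1) -> List[str]:
    """Group whole sentences into chunks of at most chunk_size characters.

    Hypothetical sketch: consecutive chunks share `overlap_sentences`
    trailing sentences so context is not lost at chunk boundaries.
    """
    # Split after ., ! or ? followed by whitespace (a simplifying assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    chunks: List[str] = []
    start = 0
    while start < len(sentences):
        end = start
        length = 0
        while end < len(sentences):
            # +1 accounts for the joining space between sentences.
            added = len(sentences[end]) + (1 if end > start else 0)
            # Always take at least one sentence, even if it is oversized.
            if end > start and length + added > chunk_size:
                break
            length += added
            end += 1
        chunks.append(" ".join(sentences[start:end]))
        if end >= len(sentences):
            break
        # Step back to create the overlap, but always move forward.
        start = max(end - overlap_sentences, start + 1)
    return chunks


print(chunk_text("One. Two. Three. Four.", chunk_size=12))
```

Splitting on sentence boundaries keeps each chunk readable on its own, which tends to help semantic search more than cutting at a fixed character offset.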

**Frontend (`frontend/`):** Vanilla HTML/CSS/JS single-page app. Uses `marked.js` for markdown rendering. Manages `currentSessionId` for multi-turn conversations.

**Key design details:**

- Course title is used as the unique identifier (ChromaDB document ID).
- `_resolve_course_name()` does a vector similarity search on `course_catalog` to fuzzy-match user-provided course names to actual titles.
- The AI generator makes two Claude API calls when tools are used: first with tools enabled, second without tools (using tool results as context).
- Conversation history is capped at `MAX_HISTORY * 2` messages (default: 4) and formatted as `"User: ...\nAssistant: ..."` strings appended to the system prompt.
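
The history handling described in the last bullet can be sketched as follows. The class name `SessionHistory`, the helper `build_system_prompt`, the "Previous conversation:" label, and the value `MAX_HISTORY = 2` (one reading of "default: 4" messages) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

MAX_HISTORY = 2  # assumed value; yields a cap of MAX_HISTORY * 2 = 4 messages


class SessionHistory:
    """Hypothetical in-memory store of recent messages per session."""

    def __init__(self, max_history: int = MAX_HISTORY) -> None:
        self.max_messages = max_history * 2
        self._sessions: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_exchange(self, session_id: str, user_text: str, assistant_text: str) -> None:
        messages = self._sessions[session_id]
        messages.append(("User", user_text))
        messages.append(("Assistant", assistant_text))
        # Drop everything except the most recent max_messages entries.
        del messages[:-self.max_messages]

    def format(self, session_id: str) -> str:
        """Render history as 'User: ...' / 'Assistant: ...' lines."""
        return "\n".join(f"{role}: {text}" for role, text in self._sessions.get(session_id, []))


def build_system_prompt(base_prompt: str, history: str) -> str:
    """Append formatted history to the system prompt as plain text."""
    if not history:
        return base_prompt
    return f"{base_prompt}\n\nPrevious conversation:\n{history}"


store = SessionHistory(max_history=1)
store.add_exchange("s1", "What is RAG?", "Retrieval-augmented generation.")
store.add_exchange("s1", "Which lesson covers it?", "Lesson 1.")
print(build_system_prompt("You are a course assistant.", store.format("s1")))
```

With `max_history=1` only the latest exchange survives, so the printed prompt contains the second question and answer but not the first.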
31 changes: 30 additions & 1 deletion backend/app.py
@@ -8,6 +8,10 @@
 from pydantic import BaseModel
 from typing import List, Optional
 import os
+import logging
+import anthropic
+
+logger = logging.getLogger(__name__)

 from config import config
 from rag_system import RAGSystem
@@ -70,8 +74,33 @@ async def query_documents(request: QueryRequest):
             sources=sources,
             session_id=session_id
         )
+    except anthropic.AuthenticationError as e:
+        logger.error(f"Anthropic API authentication failed: {e}")
+        raise HTTPException(
+            status_code=503,
+            detail="AI service authentication failed. Please check that ANTHROPIC_API_KEY is set correctly in the .env file."
+        )
+    except anthropic.RateLimitError as e:
+        logger.error(f"Anthropic API rate limit: {e}")
+        raise HTTPException(
+            status_code=429,
+            detail="AI service rate limit reached. Please try again in a moment."
+        )
+    except anthropic.APIConnectionError as e:
+        logger.error(f"Anthropic API connection error: {e}")
+        raise HTTPException(
+            status_code=503,
+            detail="Could not connect to AI service. Please check your internet connection."
+        )
+    except anthropic.APIError as e:
+        logger.error(f"Anthropic API error: {e}")
+        raise HTTPException(
+            status_code=502,
+            detail=f"AI service error: {str(e)}"
+        )
     except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
+        logger.error(f"Unexpected error processing query: {e}", exc_info=True)
+        raise HTTPException(status_code=500, detail=f"Internal error: {str(e)}")

 @app.get("/api/courses", response_model=CourseStats)
 async def get_course_stats():
20 changes: 18 additions & 2 deletions backend/config.py
@@ -1,9 +1,19 @@
 import os
 from dataclasses import dataclass
+from pathlib import Path
 from dotenv import load_dotenv

 # Load environment variables from .env file
-load_dotenv()
+# Search backend/ directory first, then project root
+_backend_dir = Path(__file__).resolve().parent
+_project_root = _backend_dir.parent
+
+if (_backend_dir / ".env").exists():
+    load_dotenv(_backend_dir / ".env")
+elif (_project_root / ".env").exists():
+    load_dotenv(_project_root / ".env")
+else:
+    load_dotenv()

 @dataclass
 class Config:
@@ -26,4 +36,10 @@ class Config:

 config = Config()

-
+if not config.ANTHROPIC_API_KEY:
+    print("\n" + "=" * 60)
+    print("WARNING: ANTHROPIC_API_KEY is not set!")
+    print("The /api/query endpoint will fail.")
+    print("Please create a .env file with your API key.")
+    print("See .env.example for the expected format.")
+    print("=" * 60 + "\n")
6 changes: 5 additions & 1 deletion frontend/script.js
@@ -71,7 +71,11 @@ async function sendMessage() {
             })
         });

-        if (!response.ok) throw new Error('Query failed');
+        if (!response.ok) {
+            const errorData = await response.json().catch(() => null);
+            const detail = errorData?.detail || `Server error (${response.status})`;
+            throw new Error(detail);
+        }

         const data = await response.json();