sutheesh/OpenChat

🤖 AI Chat Assistant with Weather + Confluence RAG

A local AI chat assistant built on Qwen 2.5-1.5B, featuring real-time weather via Apify MCP and semantic search over any website or Confluence instance using RAG (ChromaDB).


✨ Features

  • 💬 AI Chat — Qwen 2.5-1.5B running fully on CPU (Mac Intel)
  • 🌤️ Weather Tool — Live weather via Apify MCP (Model Context Protocol)
  • 📚 Confluence RAG — Semantic search over any website or Confluence instance
  • 📡 Streaming — Token-by-token streaming responses
  • 🔧 Pattern 2 Tool Calling — Model decides when to use tools (no hardcoded rules)
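
Concretely, "Pattern 2" means each model turn either answers directly or emits a tool-call JSON (the `{"tool_call": ...}` shape shown in the sequence diagram further down), which the backend parses and dispatches. The sketch below is illustrative, not the actual app.py code; the `parse_model_output` helper is an assumed name:

```python
import json

def parse_model_output(raw: str):
    """Classify raw model output: a tool-call JSON or a direct answer.

    Returns ("tool", name, args) when the model emitted a tool call,
    otherwise ("answer", raw, None).
    """
    try:
        obj = json.loads(raw.strip())
        call = obj.get("tool_call")
        if call and "name" in call:
            return ("tool", call["name"], call.get("arguments", {}))
    except (json.JSONDecodeError, AttributeError):
        # Not JSON (or not a dict) — treat it as a plain-text answer.
        pass
    return ("answer", raw, None)
```

Because the decision is made at `tool_temperature=0.0`, the same query reliably produces the same tool-or-answer choice.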

πŸ—‚οΈ Project Structure

β”œβ”€β”€ app.py                      # Main Flask app (entry point)
β”œβ”€β”€ apify_weather_mcp.py        # Apify Weather MCP client
β”œβ”€β”€ confluence_crawler.py       # Recursive web crawler
β”œβ”€β”€ confluence_rag.py           # Chunking, embedding, ChromaDB search
β”œβ”€β”€ confluence_tool.py          # search_confluence() tool definition
β”œβ”€β”€ ingest.py                   # One-time crawl + index script
β”œβ”€β”€ templates/
β”‚   └── index.html              # Chat UI
β”œβ”€β”€ static/                     # CSS, JS assets
β”œβ”€β”€ requirements.txt            # Python dependencies
β”œβ”€β”€ .env.example                # Environment variable template
β”œβ”€β”€ .gitignore
└── README.md

🚀 Quick Start

1. Clone the repo

git clone https://github.com/YOUR_USERNAME/ai-chat-assistant.git
cd ai-chat-assistant

2. Create virtual environment

python -m venv venv
source venv/bin/activate        # Mac/Linux
# venv\Scripts\activate         # Windows

3. Install dependencies

pip install -r requirements.txt

4. Set environment variables

cp .env.example .env
# Edit .env and add your APIFY_TOKEN

5. Build the knowledge base (one time)

# Crawl and index the default URL
python ingest.py

# Or use your own URL
python ingest.py --url https://your-confluence.atlassian.net/wiki/spaces/SPACE

# Limit pages for a quick test
python ingest.py --max-pages 20

6. Run the app

python app.py

Open http://localhost:8000 in your browser.


βš™οΈ Environment Variables

Create a .env file (copy from .env.example):

APIFY_TOKEN=<get your token>

Get your Apify token at: https://console.apify.com/account/integrations


🔧 Configuration

All settings are in app.py under CONFIG:

| Key | Default | Description |
|-----|---------|-------------|
| `base_model` | `Qwen/Qwen2.5-1.5B-Instruct` | HuggingFace model |
| `max_length` | 200 | Max tokens in response |
| `temperature` | 0.7 | Response creativity |
| `tool_temperature` | 0.0 | Tool decision (deterministic) |
| `max_tool_rounds` | 3 | Max tool call iterations |
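
As a sketch, CONFIG might look like this (key names and defaults come from the table; the exact dict layout in app.py is an assumption):

```python
# Sketch of the CONFIG block described above — names/defaults from the
# table, structure assumed.
CONFIG = {
    "base_model": "Qwen/Qwen2.5-1.5B-Instruct",  # HuggingFace model id
    "max_length": 200,         # max tokens in a response
    "temperature": 0.7,        # sampling temperature for answers
    "tool_temperature": 0.0,   # deterministic tool decisions
    "max_tool_rounds": 3,      # cap on tool-call iterations
}
```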

RAG settings are in confluence_rag.py:

| Key | Default | Description |
|-----|---------|-------------|
| `CHUNK_SIZE` | 500 | Characters per chunk |
| `CHUNK_OVERLAP` | 100 | Overlap between chunks |
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Sentence transformer model |
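
The chunking these settings control can be sketched as a simple character windower; the real implementation in confluence_rag.py may split more carefully (e.g. on sentence boundaries):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100):
    """Split text into fixed-size character chunks with overlap.

    Each chunk starts (chunk_size - overlap) characters after the
    previous one, so neighbouring chunks share `overlap` characters —
    a minimal sketch of the CHUNK_SIZE/CHUNK_OVERLAP behaviour above.
    """
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```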

📚 Knowledge Base Management

# Initial crawl + index
python ingest.py --url https://your-site.com/docs

# Re-index without re-crawling (uses cached crawl_results.json)
python ingest.py --from-file crawl_results.json

# Wipe and re-index from scratch
python ingest.py --reset

# Check how many chunks are indexed
python -c "from confluence_rag import ConfluenceRAG; r = ConfluenceRAG(); print(r.stats())"
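
The commands above imply a CLI roughly like the following; the flag names match the examples, while defaults and help text are assumptions about ingest.py:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of the CLI surface the ingest.py examples above imply."""
    p = argparse.ArgumentParser(
        description="Crawl a site and index it into ChromaDB")
    p.add_argument("--url", help="Base URL to crawl")
    p.add_argument("--max-pages", type=int, default=None,
                   help="Limit the number of pages crawled")
    p.add_argument("--from-file",
                   help="Re-index from a cached crawl_results.json")
    p.add_argument("--reset", action="store_true",
                   help="Wipe the existing index before re-indexing")
    return p
```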

Refresh via API endpoint

curl -X POST http://localhost:8000/api/refresh-kb

πŸ” Using a Private Confluence Instance

If you have a real Atlassian Confluence Cloud instance, update confluence_crawler.py:

import base64

EMAIL = "your@email.com"
API_TOKEN = "your_confluence_api_token"
credentials = base64.b64encode(f"{EMAIL}:{API_TOKEN}".encode()).decode()

headers = {
    "Authorization": f"Basic {credentials}",
    "User-Agent": "RAG-Crawler/1.0"
}

Get your Confluence API token: https://id.atlassian.com/manage-profile/security/api-tokens


πŸ—οΈ Architecture

User Query
    ↓
Flask Backend (app.py)
    ↓
Qwen 2.5-1.5B decides:
    ├── Weather query?   → Apify MCP → Open-Meteo API
    ├── Docs/knowledge?  → ChromaDB semantic search → Top 5 chunks
    └── Direct answer?   → Stream response
    ↓
Final answer streamed token by token

1. System Architecture (Full)

graph TB
    subgraph FRONTEND["① FRONTEND — HTML/CSS/JS"]
        UI["💬 Chat UI<br/>index.html"]
        FETCH["📡 Fetch API<br/>POST /api/chat"]
        SSE["🔄 SSE Stream Reader<br/>token-by-token"]
        STATUS["📊 Status Check<br/>/api/status"]
    end

    subgraph FLASK["② FLASK BACKEND — app.py"]
        ROUTE["🔀 Route Handler<br/>/api/chat"]
        STREAM["📤 Streaming Response<br/>text/event-stream"]
        DISPATCH["⚙️ Tool Dispatcher<br/>call_tool()"]
        REFRESH["🔁 Refresh KB<br/>/api/refresh-kb"]
    end

    subgraph MODEL["③ AI MODEL — Qwen 2.5-1.5B CPU"]
        QWEN["🧠 Qwen 2.5-1.5B Instruct<br/>~3GB RAM · 2-5 tok/sec"]
        DECIDE["🤔 decide_tool_or_answer()<br/>tool_temperature=0.0"]
        STREAMER["⚡ TextIteratorStreamer<br/>token streaming"]
        TEMPLATE["📝 Chat Template<br/>apply_chat_template()"]
    end

    subgraph TOOLS["④ TOOLS LAYER"]
        subgraph WEATHER_T["🌤️ Weather Tool"]
            WT["get_weather(city)<br/>→ ApifyWeatherMCP<br/>→ new event loop"]
        end
        subgraph RAG_T["📚 Confluence RAG Tool"]
            RT["search_confluence(query)<br/>→ ChromaDB semantic search<br/>→ Top 5 chunks + URLs"]
        end
    end

    subgraph EXTERNAL["⑤ EXTERNAL — APIs & Storage"]
        APIFY["☁️ Apify MCP Server<br/>jiri-spilka/weather-mcp-server<br/>Streamable HTTP · JSON-RPC 2.0"]
        METEO["🌍 Open-Meteo API<br/>open-meteo.com<br/>Free · No key needed"]
        CHROMA["🗄️ ChromaDB Local<br/>./chroma_db/<br/>1433 chunks · cosine similarity"]
        EMBED["🔢 Sentence Transformers<br/>all-MiniLM-L6-v2<br/>80MB · 384 dimensions"]
    end

    UI -->|user message| FETCH
    FETCH -->|HTTP POST| ROUTE
    ROUTE --> STREAM
    STREAM -->|SSE tokens| SSE
    STATUS -.->|health check| ROUTE

    ROUTE --> DISPATCH
    DISPATCH --> DECIDE
    DECIDE --> TEMPLATE
    TEMPLATE --> QWEN
    QWEN --> STREAMER
    STREAMER -->|stream tokens| STREAM

    DECIDE -->|tool call JSON| DISPATCH
    DISPATCH -->|get_weather| WT
    DISPATCH -->|search_confluence| RT
    REFRESH -.->|re-index| RT

    WT -->|async MCP call| APIFY
    APIFY -->|forwards| METEO
    METEO -->|weather data| APIFY
    APIFY -->|SSE response| WT

    RT -->|embed query| EMBED
    EMBED -->|query vector| CHROMA
    CHROMA -->|top 5 chunks + URLs| RT



2. Tool Calling Flow

sequenceDiagram
    actor User
    participant UI as Chat UI
    participant Flask as Flask Backend
    participant Model as Qwen 2.5-1.5B
    participant Tool as Tool Layer
    participant Ext as External API/DB

    User->>UI: "How do I connect Confluence with Slack?"
    UI->>Flask: POST /api/chat {message, stream:true}

    Flask->>Model: build_chat_prompt(messages + tool defs)
    Model-->>Flask: {"tool_call": {"name": "search_confluence", "arguments": {"query": "..."}}}

    Flask-->>UI: SSE: "Let me search the knowledge base..."
    Note over Flask,UI: Stream acknowledgment token by token

    Flask->>Tool: call_tool("search_confluence", {query})
    Tool->>Ext: embed(query) β†’ ChromaDB.query(top_k=5)
    Ext-->>Tool: [{text, url, title, score}, ...]
    Tool-->>Flask: formatted context + source URLs

    Flask->>Model: inject tool result into context
    Model-->>Flask: final answer tokens (streaming)
    Flask-->>UI: SSE: answer with source links
    UI-->>User: Rendered response + links

3. RAG Ingestion Pipeline

flowchart LR
    A["🔗 Base URL<br/>confluence/resources"]
    --> B["🕷️ confluence_crawler.py<br/>Recursive BFS crawl<br/>same-domain · path-scoped"]
    --> C["📄 Raw HTML Pages<br/>title + text + url"]
    --> D["✂️ Chunker<br/>500 chars · 100 overlap"]
    --> E["🔢 SentenceTransformer<br/>all-MiniLM-L6-v2<br/>→ 384-dim vectors"]
    --> F["🗄️ ChromaDB<br/>cosine similarity index<br/>1433 chunks stored"]

    G["💾 crawl_results.json<br/>cache"] -.->|skip re-crawl| D
    H["ingest.py --reset"] -.->|wipe + reindex| F


4. Confluence RAG Query Flow

flowchart TD
    Q["User Query"] --> E1["Embed query<br/>all-MiniLM-L6-v2"]
    E1 --> VS["ChromaDB<br/>cosine similarity search"]
    VS --> R["Top 5 chunks<br/>with scores + URLs"]
    R --> CTX["format_context()<br/>chunk text + source links"]
    CTX --> INJ["Inject into<br/>model context"]
    INJ --> ANS["Qwen generates<br/>answer with citations"]

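Under the hood, the ranking step is cosine similarity between the 384-dim query vector and each stored chunk vector. A toy illustration with tiny stand-in vectors (ChromaDB does this internally; this is not the project's code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, chunks, k=5):
    """chunks: list of (text, vector); return k texts ranked by similarity."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]
```

In the real pipeline the vectors come from all-MiniLM-L6-v2 and the top 5 chunks (with their source URLs) are injected into the model context.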

5. Flow Initialization

flowchart LR
    REQ["Incoming Request"]
    --> CK1{"model_loaded?"}

    CK1 -->|No| LM["load_model_once()<br/>Qwen 2.5-1.5B"]
    CK1 -->|Yes| USE["Use cached model"]
    LM --> USE

    USE --> CK2{"tool needed?"}
    CK2 -->|weather| CK3{"weather_mcp?"}
    CK2 -->|confluence| CK4{"confluence_rag?"}

    CK3 -->|No| LW["get_weather_mcp()<br/>ApifyWeatherMCP()"]
    CK3 -->|Yes| UW["Use cached MCP"]
    LW --> UW

    CK4 -->|No| LR["get_confluence_rag()<br/>ConfluenceRAG()"]
    CK4 -->|Yes| UR["Use cached RAG"]
    LR --> UR

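The pattern in this diagram is plain lazy initialization: create each heavy resource (model, MCP client, RAG index) once on first use, then reuse the cached instance. A minimal sketch — the `get_once` helper is illustrative, not a function from app.py:

```python
# Module-level cache of heavy resources, created on first use.
_cache = {}

def get_once(name, factory):
    """Create the resource with factory() on the first call, then reuse it."""
    if name not in _cache:
        _cache[name] = factory()
    return _cache[name]
```

Usage would look like `model = get_once("model", load_model)`, so repeated requests never reload the ~3GB model.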

📊 Performance

| Component | Spec |
|-----------|------|
| Model | Qwen 2.5-1.5B (~3GB RAM) |
| Device | CPU (Mac Intel) |
| Inference speed | 2–5 tokens/sec |
| Weather tool latency | ~2–3 seconds |
| RAG search latency | <100ms |
| Total response time | 10–30 seconds |

🧪 API Reference

POST /api/chat

{
  "message": "How do I integrate Confluence with Slack?",
  "stream": true
}

Streaming response (stream: true): text/event-stream

data: {"token": "Let"}
data: {"token": " me"}
data: {"token": " search..."}
data: {"done": true}
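
A client consuming this stream just parses each `data:` line as JSON and concatenates tokens until `done` arrives. A minimal sketch of such a reader (illustrative, not part of the repo):

```python
import json

def read_sse_tokens(lines):
    """Assemble the response text from SSE lines like 'data: {...}'."""
    out = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # ignore comments/keep-alives
        event = json.loads(line[len("data: "):])
        if event.get("done"):
            break
        if "token" in event:
            out.append(event["token"])
    return "".join(out)
```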

Non-streaming response (stream: false):

{
  "response": "To integrate Confluence with Slack...",
  "model": "Qwen 2.5-1.5B"
}

GET /api/status

{
  "model_loaded": true,
  "model_name": "Qwen/Qwen2.5-1.5B-Instruct",
  "device": "CPU (Mac Intel)",
  "streaming_supported": true,
  "confluence_chunks": 1433
}

POST /api/refresh-kb

Triggers a fresh crawl and re-index of the knowledge base.


🔜 Future Enhancements

  • Multi-turn conversation memory
  • More MCP servers (news, calendar, stocks)
  • Cloud deployment (Render, Fly.io)
  • Better UI with chat history
  • Support for PDF / file uploads
  • Scheduled KB refresh (cron)

📦 Dependencies

  • flask — Web framework
  • torch + transformers — Qwen model inference
  • sentence-transformers — Text embeddings
  • chromadb — Local vector database
  • httpx + beautifulsoup4 — Web crawling
  • peft — Model adapter support

🎬 See It in Action


📄 License

MIT License — feel free to use and modify.
