A modern, production-ready AI assistant platform for integrating advanced language models with your website's knowledge base.
Easily build, manage, and query a multilingual knowledge base with robust evaluation and a sleek, responsive frontend.
Knowledge Base Initialization
- Extracts and chunks content from your website using a base URL or sitemap file upload.
- Semantic chunking and FAISS vector search for efficient, accurate retrieval.
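The snippet below is a minimal sketch of this index-and-retrieve path. The embedding model name, chunk size, and simple word-window chunker are illustrative assumptions; the actual semantic chunking and vector store live in `backend/`.

```python
# Sketch: embed text chunks and index them with FAISS for similarity search.
# Model name and chunking parameters are illustrative, not the project's defaults.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, max_words: int = 150) -> list[str]:
    """Naive fixed-size chunking; the app itself uses semantic chunking."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

page_text = "Replace this with text extracted from your website..."  # placeholder
chunks = chunk_text(page_text)

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # assumed model
embeddings = embedder.encode(chunks, normalize_embeddings=True)

# Normalized embeddings + inner-product index = cosine-similarity search.
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))

query = embedder.encode(["What services do you offer?"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), min(3, len(chunks)))
top_chunks = [chunks[i] for i in ids[0]]
```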
Multilingual Language Detection
- Detects user query language (Arabic, English, French, etc.) using FastText.
- Automatically adapts prompts and responses for Arabic/Persian or English.
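A short sketch of the detection step. The filename `lid.176.bin` is the standard FastText language-identification model and is an assumption about what ships in `models/`.

```python
import fasttext

# Assumed filename for the pre-trained language-ID model in models/.
model = fasttext.load_model("models/lid.176.bin")

def detect_language(query: str) -> str:
    # FastText expects a single line of text, so strip newlines first.
    labels, _probs = model.predict(query.replace("\n", " "), k=1)
    return labels[0].replace("__label__", "")  # e.g. "ar", "en", "fr"

print(detect_language("ما هي خدماتكم؟"))             # expected: "ar"
print(detect_language("Quels sont vos services ?"))  # expected: "fr"
```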
LLM Integration
- Uses Groq's Llama-3.3-70b-versatile model for chat responses.
- Context-aware answers based on retrieved knowledge base content.
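A minimal sketch of the chat call using the official `groq` client. The system prompt and temperature are illustrative; the real prompt construction happens in `backend/`.

```python
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

def answer(question: str, context: str) -> str:
    """Ask the Groq-hosted model to answer strictly from the retrieved context."""
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context. "
                        "If the context is insufficient, say so."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        temperature=0.2,  # illustrative value
    )
    return response.choices[0].message.content
```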
Validation & Evaluation Metrics
- Computes retrieval metrics (Precision, Recall, F1), LLM metrics (ROUGE, BERTScore), and cross-encoder relevance.
- Metrics are displayed in the frontend and logged to Comet ML for experiment tracking.
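The sketch below shows how each family of metrics can be computed. The cross-encoder model name and the reference/candidate strings are placeholders, not the project's exact configuration.

```python
from bert_score import score as bert_score
from rouge_score import rouge_scorer
from sentence_transformers import CrossEncoder

def retrieval_prf(retrieved: set[str], relevant: set[str]) -> tuple[float, float, float]:
    """Precision/Recall/F1 over retrieved vs. ground-truth chunk ids."""
    tp = len(retrieved & relevant)
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# ROUGE-L between the generated answer and a reference answer.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = scorer.score("reference answer", "generated answer")["rougeL"].fmeasure

# BERTScore: semantic similarity between generated and reference answers.
P, R, F1 = bert_score(["generated answer"], ["reference answer"], lang="en")

# Cross-encoder relevance of a retrieved chunk to the user query (assumed model).
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
relevance = cross_encoder.predict([("user question", "retrieved chunk text")])
```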
Modern Frontend
- Responsive Bootstrap UI with sidebar for knowledge base setup, chat, and collapsible metrics section.
- Real-time chat and feedback.
- `backend/`: FastAPI app, API routes, vector store, evaluation, and utilities.
- `models/`: Pre-trained FastText language detection model.
- `static/`: CSS, JS, and images for the frontend.
- `templates/`: Jinja2 HTML templates.
- Install uv: `pip install uv`
- Create a Virtual Environment: `uv venv`
- Sync Dependencies: `uv sync --all-extras`
- Configure Environment Variables: copy `.env.sample` to `.env` and fill in your API keys and base URL.
- Run the Application: `uvicorn backend.main:app --reload`
- Open in Browser: visit http://127.0.0.1:8000
- Use the sidebar form to enter a base URL or upload a sitemap file.
- The backend extracts, chunks, and indexes your website content for retrieval.
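If you are curious what the extraction step roughly looks like, the sketch below uses `requests` and `BeautifulSoup` to pull page URLs from a sitemap and strip pages down to visible text. It is an approximation for illustration, not the code in `backend/`.

```python
import requests
from bs4 import BeautifulSoup
from xml.etree import ElementTree

def urls_from_sitemap(sitemap_url: str) -> list[str]:
    """Collect the page URLs listed in a sitemap.xml file."""
    xml = requests.get(sitemap_url, timeout=30).text
    tree = ElementTree.fromstring(xml)
    # Sitemap entries live in <loc> elements (namespace-agnostic match).
    return [el.text for el in tree.iter() if el.tag.endswith("loc") and el.text]

def visible_text(url: str) -> str:
    """Download a page and reduce it to whitespace-normalized visible text."""
    soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())

pages = [visible_text(u) for u in urls_from_sitemap("https://example.com/sitemap.xml")]
```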
- Enter your question in any supported language.
- The system detects the language and adapts the prompt for the LLM.
- Arabic and Persian queries get native-language prompts; others default to English.
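A sketch of that language switch; the Arabic and Persian instructions below are illustrative wording, not the project's actual prompt templates.

```python
# Illustrative prompt templates keyed by the detected language code.
PROMPTS = {
    "ar": "أجب عن السؤال التالي بالاعتماد على السياق المقدَّم فقط.",
    "fa": "به پرسش زیر فقط بر اساس متن ارائه‌شده پاسخ بده.",
}
DEFAULT_PROMPT = "Answer the following question using only the provided context."

def build_prompt(lang: str, context: str, question: str) -> str:
    instruction = PROMPTS.get(lang, DEFAULT_PROMPT)  # anything else falls back to English
    return f"{instruction}\n\nContext:\n{context}\n\nQuestion:\n{question}"
```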
- Metrics: Precision, Recall, F1, ROUGE, BERTScore, Cross-Encoder Relevance, Latency.
- Comet ML Integration:
- Logs all evaluation metrics and parameters for experiment tracking.
- Configure your Comet ML credentials in `.env`.
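Logging itself follows the standard `comet_ml` pattern; the metric names and zero values below are placeholders for whatever the evaluation step actually computes.

```python
import os
from comet_ml import Experiment

experiment = Experiment(
    api_key=os.environ["COMET_ML_API_KEY"],
    project_name=os.environ["COMET_ML_PROJECT_NAME"],
    workspace=os.environ["COMET_ML_WORKSPACE"],
)

experiment.log_parameters({"llm_model": "llama-3.3-70b-versatile", "top_k": 3})
experiment.log_metrics({  # placeholder values
    "retrieval_precision": 0.0,
    "retrieval_recall": 0.0,
    "rouge_l": 0.0,
    "latency_seconds": 0.0,
})
experiment.end()
```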
- `GROQ_API_KEY`: Groq LLM API key.
- `BASE_URL`: Default website for sitemap extraction.
- `COMET_ML_API_KEY`, `COMET_ML_PROJECT_NAME`, `COMET_ML_WORKSPACE`: Comet ML tracking.
- (Optional) `HUGGINGFACE_API_KEY`, `TAVILY_API_KEY`, `FIRECRAWL_API_KEY` for extra integrations.
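To sanity-check your configuration outside the app, here is a quick sketch assuming `python-dotenv` is installed (the backend may load its settings differently):

```python
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory
required = ["GROQ_API_KEY", "BASE_URL", "COMET_ML_API_KEY",
            "COMET_ML_PROJECT_NAME", "COMET_ML_WORKSPACE"]
missing = [name for name in required if not os.getenv(name)]
print("Missing variables:", ", ".join(missing) if missing else "none")
```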
MIT License.
See LICENSE for details.
For questions, suggestions, or contributions, please open an issue or pull request.