Site AI Assistant Integration

A modern, production-ready AI assistant platform for integrating advanced language models with your website's knowledge base.
Easily build, manage, and query a multilingual knowledge base with robust evaluation and a sleek, responsive frontend.


Features

  • Knowledge Base Initialization

    • Extracts and chunks content from your website using a base URL or sitemap file upload.
    • Semantic chunking and FAISS vector search for efficient, accurate retrieval (a minimal sketch follows this list).
  • Multilingual Language Detection

    • Detects user query language (Arabic, English, French, etc.) using FastText.
    • Automatically adapts prompts and responses for Arabic/Persian or English.
  • LLM Integration

    • Uses Groq's Llama-3.3-70b-versatile model for chat responses.
    • Context-aware answers based on retrieved knowledge base content.
  • Validation & Evaluation Metrics

    • Computes retrieval metrics (Precision, Recall, F1), LLM metrics (ROUGE, BERTScore), and cross-encoder relevance.
    • Metrics are displayed in the frontend and logged to Comet ML for experiment tracking.
  • Modern Frontend

    • Responsive Bootstrap UI with a sidebar for knowledge base setup, chat, and a collapsible metrics section.
    • Real-time chat and feedback.
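
The knowledge base bullets above map to a small retrieval pipeline. The sketch below shows one way to chunk, embed, and index content with FAISS; the embedding model, chunking rule, and file names are illustrative assumptions, not the project's actual implementation (see backend/ for that).

    # Minimal sketch of chunk -> embed -> FAISS index -> search.
    # Assumes sentence-transformers and faiss-cpu; the real backend's
    # "semantic chunking" is more involved than this fixed-size split.
    import faiss
    import numpy as np
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

    def chunk(text: str, size: int = 500) -> list[str]:
        # Naive fixed-size chunking for illustration only.
        return [text[i:i + size] for i in range(0, len(text), size)]

    chunks = chunk(open("page.txt", encoding="utf-8").read())
    vectors = embedder.encode(chunks, normalize_embeddings=True)

    index = faiss.IndexFlatIP(vectors.shape[1])  # inner product == cosine on normalized vectors
    index.add(np.asarray(vectors, dtype="float32"))

    query = embedder.encode(["How do I contact support?"], normalize_embeddings=True)
    scores, ids = index.search(np.asarray(query, dtype="float32"), k=3)
    top_chunks = [chunks[i] for i in ids[0]]  # context passed to the LLM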

Project Structure

  • backend/: FastAPI app, API routes, vector store, evaluation, and utilities.
  • models/: Pre-trained FastText language detection model.
  • static/: CSS, JS, and images for the frontend.
  • templates/: Jinja2 HTML templates.

Quickstart

  1. Install uv:

    pip install uv
  2. Create a Virtual Environment:

    uv venv
  3. Sync Dependencies:

    uv sync --all-extras
  4. Configure Environment Variables:
    Copy .env.sample to .env and fill in your API keys and base URL.

  5. Run the Application:

    uvicorn backend.main:app --reload
  6. Open in Browser:
    Visit http://127.0.0.1:8000


Usage

Knowledge Base Initialization

  • Use the sidebar form to enter a base URL or upload a sitemap file.
  • The backend extracts, chunks, and indexes your website content for retrieval.
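
If you prefer to script the initialization instead of using the sidebar form, a call against the backend might look like the following. The route path and payload field are assumptions made for illustration; check the API routes under backend/ for the actual names.

    # Hypothetical client call to the knowledge base initialization endpoint.
    # "/api/knowledge-base/init" and the "base_url" field are assumed names.
    import requests

    resp = requests.post(
        "http://127.0.0.1:8000/api/knowledge-base/init",
        json={"base_url": "https://example.com"},
    )
    print(resp.status_code, resp.json())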

Chat & Language Detection

  • Enter your question in any supported language.
  • The system detects the language and adapts the prompt for the LLM.
  • Arabic and Persian queries get native-language prompts; others default to English.
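
Roughly, the detection-and-prompting step works like the sketch below. It assumes the bundled FastText model is the standard lid.176.bin identifier file and uses the official groq Python client; the model path and prompt wording are illustrative, not the project's exact code.

    # Sketch: detect the query language with FastText, pick a prompt, call Groq.
    import fasttext
    from groq import Groq

    lang_model = fasttext.load_model("models/lid.176.bin")  # assumed filename
    client = Groq()  # reads GROQ_API_KEY from the environment

    def detect_lang(text: str) -> str:
        labels, _ = lang_model.predict(text.replace("\n", " "))
        return labels[0].replace("__label__", "")  # e.g. "ar", "en", "fr"

    def answer(question: str, context: str) -> str:
        lang = detect_lang(question)
        if lang in ("ar", "fa"):
            system = "أجب عن السؤال اعتمادًا على السياق التالي."  # native-language prompt
        else:
            system = "Answer the question using the following context."
        chat = client.chat.completions.create(
            model="llama-3.3-70b-versatile",
            messages=[
                {"role": "system", "content": f"{system}\n\n{context}"},
                {"role": "user", "content": question},
            ],
        )
        return chat.choices[0].message.content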

Validation & Experiment Tracking

  • Metrics: Precision, Recall, F1, ROUGE, BERTScore, Cross-Encoder Relevance, Latency.
  • Comet ML Integration:
    • Logs all evaluation metrics and parameters for experiment tracking.
    • Configure your Comet ML credentials in .env.
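
A condensed view of how a single chat turn could be scored and logged is shown below. The cross-encoder model name and metric keys are assumptions; the backend's evaluation module computes the fuller set listed above (Precision, Recall, F1, BERTScore, latency).

    # Sketch: score one answer with a cross-encoder and ROUGE-L, log to Comet ML.
    import os
    from comet_ml import Experiment
    from rouge_score import rouge_scorer
    from sentence_transformers import CrossEncoder

    question = "How do I contact support?"
    answer = "Use the contact form on the Support page."
    reference = "Support can be reached through the contact form on the Support page."

    relevance = float(
        CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2").predict([(question, answer)])[0]
    )
    rouge_l = rouge_scorer.RougeScorer(["rougeL"]).score(reference, answer)["rougeL"].fmeasure

    experiment = Experiment(
        api_key=os.getenv("COMET_ML_API_KEY"),
        project_name=os.getenv("COMET_ML_PROJECT_NAME"),
        workspace=os.getenv("COMET_ML_WORKSPACE"),
    )
    experiment.log_metrics({"cross_encoder_relevance": relevance, "rougeL_f1": rouge_l})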

Environment Variables

  • GROQ_API_KEY: Groq LLM API key.
  • BASE_URL: Default website for sitemap extraction.
  • COMET_ML_API_KEY, COMET_ML_PROJECT_NAME, COMET_ML_WORKSPACE: Comet ML tracking.
  • (Optional) HUGGINGFACE_API_KEY, TAVILY_API_KEY, FIRECRAWL_API_KEY for extra integrations.
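
A minimal way to read these variables in Python, assuming python-dotenv (the backend may use a settings class instead):

    # Load .env and read the keys listed above.
    import os
    from dotenv import load_dotenv

    load_dotenv()  # picks up .env in the project root

    GROQ_API_KEY = os.getenv("GROQ_API_KEY")
    BASE_URL = os.getenv("BASE_URL")
    COMET_ML_API_KEY = os.getenv("COMET_ML_API_KEY")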

License

MIT License.
See LICENSE for details.


Contact

For questions, suggestions, or contributions, please open an issue or pull request.

