A RAG-powered chatbot for answering questions about West End and Broadway theatre. Praxa combines a curated PDF knowledge base with a retrieval-augmented generation pipeline to provide accurate, source-backed answers about shows, productions, venues, and the people behind them.
- Conversational chat interface built with Streamlit
- Retrieval-Augmented Generation (RAG) — answers are grounded in your own documents, not just LLM training data
- Source citations with every answer (filename + page number)
- Local embeddings via `all-MiniLM-L6-v2` — no API calls needed for indexing
- Powered by `gemma-3-27b` via OpenRouter
| Layer | Tool |
|---|---|
| UI | Streamlit |
| LLM Orchestration | LangChain |
| LLM | gemma-3-27b via OpenRouter |
| Embedding Model | all-MiniLM-L6-v2 (sentence-transformers) |
| Vector Database | Chroma |
| Document Loader | LangChain PyPDFLoader |
| Text Splitting | RecursiveCharacterTextSplitter |
```
Praxa/
├── praxa_client.py    # Streamlit UI — chat interface
├── praxa_rag.py       # RAG chain — retrieval, prompt, LLM, sources
├── context.py         # Chroma vector store setup and PDF indexing
├── model.py           # OpenRouter LLM initialisation
├── data/              # PDF documents (your knowledge base)
├── chroma_db/         # Persisted Chroma vector index (auto-generated)
└── requirements.txt
```
Clone the repository:

```bash
git clone https://github.com/your-username/praxa.git
cd praxa
```

Create and activate a virtual environment:

```bash
python -m venv venv

# Windows
venv\Scripts\activate

# macOS / Linux
source venv/bin/activate
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

Create a `.env` file in the project root:

```
OPENROUTER_API_KEY=your_api_key_here
```
You can get a free API key at openrouter.ai.
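In practice the key is loaded by `python-dotenv` (listed in the requirements). As an illustration of what that loading amounts to, here is a minimal `.env` parser sketch — the real `load_dotenv()` additionally handles quoting, `export` prefixes, and interpolation:

```python
def parse_dotenv(text):
    """Minimal .env parser sketch.

    In the real app, python-dotenv's load_dotenv() does this robustly;
    this just shows the KEY=value convention the file uses.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env
```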
Run the following command to set up Praxa's knowledge base. This will automatically download the required PDFs and build the Chroma vector index:

```bash
python context.py
```

This only needs to be done once. The PDFs are saved to `data/` and the vector index to `chroma_db/`.
```bash
streamlit run praxa_client.py
```

Then open http://localhost:8501 in your browser.
```
User question
      │
      ▼
Embed question          ← all-MiniLM-L6-v2
      │
      ▼
Similarity search       ← Chroma finds top-k most relevant chunks
      │
      ▼
Build prompt            ← LangChain formats question + retrieved context
      │
      ▼
LLM generates answer    ← gemma-3-27b via OpenRouter
      │
      ▼
Return answer + sources ← filename and page number for each chunk used
```
At indexing time, each PDF is loaded, split into overlapping chunks using RecursiveCharacterTextSplitter, and embedded using all-MiniLM-L6-v2. The resulting vectors are stored in a local Chroma database alongside the original text and metadata.
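The overlapping-chunk idea can be sketched in a few lines of plain Python. The chunk size and overlap below are illustrative (the defaults used in `context.py` aren't stated here), and `RecursiveCharacterTextSplitter` additionally prefers to split at paragraph and sentence boundaries rather than at a fixed character offset:

```python
def split_with_overlap(text, chunk_size=500, overlap=50):
    """Split text into fixed-size chunks that overlap by `overlap` chars,
    so a fact spanning a boundary is fully contained in at least one chunk."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last chunk reached the end of the text
        start += chunk_size - overlap  # step back by `overlap` each time
    return chunks
```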
At query time, the user's question is embedded using the same model, and Chroma performs a cosine similarity search to retrieve the most relevant chunks. These are passed to gemma-3-27b as context alongside the question, and the model generates a grounded answer. The source documents are returned alongside the answer so the user can verify the information.
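The similarity search itself reduces to ranking stored vectors by cosine similarity against the query vector. Chroma does this efficiently over its index; a pure-Python sketch of the same computation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors most similar to query_vec."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

The returned indices map back to the stored chunks, whose metadata (filename, page number) supplies the citations shown with each answer.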
```
streamlit
langchain
langchain-community
langchain-chroma
langchain-openai
sentence-transformers
chromadb
pypdf
python-dotenv
```