A simple educational project showing how RAG (Retrieval Augmented Generation) and Function Calling work with LLMs.
RAG:
- How to convert documents into vectors (embeddings)
- How to find relevant documents using similarity search
- How to use retrieved documents as context for better answers
Function Calling:
- How LLMs can call external tools/functions
- Simple examples: time lookup and calculations
- How to route queries to appropriate functions
Install Ollama:
```bash
# Visit https://ollama.ai and install, then:
ollama serve
ollama pull llama3.2:3b
```
Install dependencies:
```bash
pip install -r requirements.txt
```
Run the demo:
```bash
python chatbot.py
```
The demo shows both concepts in action:
```
User: What time is it?
Bot: The current time is 2024-01-15 14:30:25

User: Calculate 15 * 7 + 23
Bot: 15 * 7 + 23 = 128

Adding documents: "Python is a programming language...", "Machine learning..."

User: Who created Python?
Bot: Based on the documents, Python was created by Guido van Rossum in 1991.
```
Project structure:
- `src/main.py` - Simple demo runner
- `src/function_agent.py` - Function calling logic (50 lines)
- `src/rag_agent.py` - RAG implementation (100 lines)
- `test_demo.py` - Basic tests
How RAG works:
1. Convert documents to embeddings (vectors)
2. Convert the user query to an embedding
3. Find the most similar documents (cosine similarity)
4. Use the relevant documents as context in the LLM prompt (see the sketch below)
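A minimal sketch of these steps, assuming `sentence-transformers`, `numpy`, and `ollama` are installed. The model name, `retrieve` helper, and prompt format are illustrative, not the project's actual API:

```python
from sentence_transformers import SentenceTransformer
import numpy as np
import ollama

# Illustrative embedding model; any sentence-transformers model works
model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Python is a programming language created by Guido van Rossum in 1991.",
    "Machine learning is a field of AI that learns patterns from data.",
]

# Step 1: convert documents to embeddings once, up front
doc_vectors = model.encode(documents)

def retrieve(query, top_k=1):
    # Step 2: convert the user query to an embedding
    query_vector = model.encode([query])[0]
    # Step 3: cosine similarity between the query and every document
    sims = doc_vectors @ query_vector / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
    )
    return [documents[i] for i in np.argsort(sims)[::-1][:top_k]]

# Step 4: put the retrieved text into the LLM prompt
question = "Who created Python?"
context = "\n".join(retrieve(question))
response = ollama.chat(
    model="llama3.2:3b",
    messages=[{"role": "user",
               "content": f"Answer using this context:\n{context}\n\nQuestion: {question}"}],
)
print(response["message"]["content"])
```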
How function calling works:
1. Analyze the user query to decide whether a function is needed
2. Extract parameters (e.g., the math expression)
3. Call the appropriate function
4. Format the result into a natural response (see the sketch below)
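A minimal sketch of this routing logic using keyword matching. This is a hedged illustration; the project's actual parsing may differ:

```python
from datetime import datetime
import re

def get_time():
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def calculate(expression):
    # Restrict input to digits and arithmetic operators before eval'ing,
    # so arbitrary code can't be executed
    if not re.fullmatch(r"[\d\s+\-*/().]+", expression):
        return "Sorry, I can only evaluate arithmetic expressions."
    return str(eval(expression))

def route(query):
    # Step 1: decide whether a function is needed (naive keyword matching)
    if "time" in query.lower():
        # Steps 3-4: call the function and format a natural reply
        return f"The current time is {get_time()}"
    if "calculate" in query.lower():
        # Step 2: extract the math expression as the parameter
        expression = re.sub(r"(?i)calculate", "", query).strip()
        return f"{expression} = {calculate(expression)}"
    return None  # no function needed; fall through to the plain LLM

print(route("Calculate 15 * 7 + 23"))  # -> "15 * 7 + 23 = 128"
```

Keyword routing is deliberately naive; it keeps the control flow easy to trace while experimenting.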
This project strips away complexity to show the core concepts:
- No complex frameworks (minimal dependencies)
- Clear, readable code with comments
- Step-by-step process demonstration
- Easy to modify and experiment with
- Python 3.7+
- Ollama running locally
- Dependencies: `ollama`, `sentence-transformers`, `numpy`
Perfect for learning how modern AI applications work under the hood!