📚 Research Paper Assistant using Agentic RAG (Retrieval Augmented Generation) with LangChain and Supabase
This project is a research paper assistant that lets users upload PDFs and chat with their contents using Retrieval Augmented Generation (RAG). It combines LangChain, OpenAI embeddings, and a Supabase vector database: for each question, the system retrieves the most relevant passages from the uploaded document and passes them to the LLM, so that answers in the chat interface are grounded in the document's text. It demonstrates a practical use of AI-powered document understanding and agentic workflows.
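The retrieve-then-answer flow described above can be sketched in plain Python. This is a toy illustration, not the project's code: `toy_embed` is a hypothetical bag-of-words stand-in for OpenAI embeddings, the in-memory `chunks` list stands in for the Supabase table, and the final prompt is only printed rather than sent to an LLM.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    query_vec = toy_embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, toy_embed(chunk)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up document chunks, for illustration only.
chunks = [
    "The model was trained on 10,000 labelled abstracts.",
    "Related work covers earlier retrieval systems.",
    "Accuracy on the held-out test set was 91 percent.",
]
question = "What accuracy did the model reach on the test set?"
top_chunks = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In the real application the embedding and ranking steps are delegated to OpenAI and pgvector; the sketch only shows the order of operations.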
- Python 3.11+ – programming language
- Streamlit – web app interface
- LangChain – for RAG-based retrieval and agent orchestration
- OpenAI (gpt-4o & embeddings) – LLM for question answering and vector embeddings
- Supabase – PostgreSQL + pgvector for vector storage and similarity search
- dotenv – for environment variable management
- Python 3.11+
git clone
python -m venv venv
venv\Scripts\Activate
(or on macOS/Linux): source venv/bin/activate
pip install -r requirements.txt
- Create a free account on Supabase: https://supabase.com/
- Create an API key for OpenAI: https://platform.openai.com/api-keys
Execute the following SQL statements in the Supabase SQL Editor:
-- Enable the pgvector extension to work with embedding vectors
create extension if not exists vector;

-- Create a table to store your documents
create table
  documents (
    id uuid primary key,
    content text, -- corresponds to Document.page_content
    metadata jsonb, -- corresponds to Document.metadata
    embedding vector (1536) -- 1536 works for OpenAI embeddings, change if needed
  );

-- Create a function to search for documents
create function match_documents (
  query_embedding vector (1536),
  filter jsonb default '{}'
) returns table (
  id uuid,
  content text,
  metadata jsonb,
  similarity float
) language plpgsql as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where metadata @> filter
  order by documents.embedding <=> query_embedding;
end;
$$;
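To make the logic of `match_documents` easier to follow, here is a rough pure-Python emulation of what the SQL function does. It is a sketch under assumptions: the rows and 3-dimensional vectors are made up, the metadata filter only checks top-level keys (a simplification of the `@>` jsonb containment operator), and vectors are assumed to be non-zero. pgvector's `<=>` operator is cosine distance, so `1 - distance` gives cosine similarity.

```python
import math


def cosine_distance(a, b):
    """Cosine distance, the quantity pgvector's <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_documents(rows, query_embedding, metadata_filter=None):
    """Emulate the SQL function: filter by metadata, rank by cosine distance."""
    metadata_filter = metadata_filter or {}
    scored = []
    for row in rows:
        # Simplified stand-in for `metadata @> filter` (top-level keys only).
        if all(row["metadata"].get(k) == v for k, v in metadata_filter.items()):
            distance = cosine_distance(row["embedding"], query_embedding)
            scored.append((distance, row))
    # `order by embedding <=> query_embedding`: smallest distance first.
    scored.sort(key=lambda pair: pair[0])
    return [
        {
            "id": row["id"],
            "content": row["content"],
            "metadata": row["metadata"],
            "similarity": 1.0 - distance,
        }
        for distance, row in scored
    ]


# Made-up rows with toy 3-dimensional embeddings (the real table uses 1536).
rows = [
    {"id": "a", "content": "Methods", "metadata": {"source": "paper.pdf"}, "embedding": [0.6, 0.8, 0.0]},
    {"id": "b", "content": "Results", "metadata": {"source": "paper.pdf"}, "embedding": [1.0, 0.0, 0.0]},
    {"id": "c", "content": "Unrelated", "metadata": {"source": "other.pdf"}, "embedding": [1.0, 0.0, 0.0]},
]
matches = match_documents(rows, [1.0, 0.0, 0.0], {"source": "paper.pdf"})
for match in matches:
    print(match["id"], round(match["similarity"], 3))
```

The default empty filter matches every row, which mirrors `filter jsonb default '{}'` in the SQL definition.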
- Rename .env.example to .env
- Add the API keys for Supabase and OpenAI to the .env file
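A small startup check can catch a missing key before the app tries to call Supabase or OpenAI. The sketch below is an assumption about how this might look, not the project's code: `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` are hypothetical variable names (use whatever names appear in `.env.example`), while `OPENAI_API_KEY` is the name the OpenAI SDK reads by default.

```python
import os

# Hypothetical names; align them with the keys listed in .env.example.
REQUIRED_SETTINGS = ("SUPABASE_URL", "SUPABASE_SERVICE_KEY", "OPENAI_API_KEY")


def missing_settings(env=None):
    """Return the required settings that are absent or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


try:
    # python-dotenv copies values from a .env file into os.environ.
    from dotenv import load_dotenv

    load_dotenv()
except ImportError:
    # Without python-dotenv, only variables already set in the shell are seen.
    pass

missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```

Passing a plain dictionary to `missing_settings` makes the check easy to exercise without touching the real environment.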
- Open a terminal in VS Code
- Execute the following command:
streamlit run app.py