
Agentic RAG Chatbot lets you upload PDFs and ask questions. It uses RAG with LangChain and GPT-4o to retrieve relevant content from documents stored in Supabase. Streamlit provides a chat UI. RAG combines retrieval and generation for accurate, context-aware answers.


aashritha2001/AgenticRAG_ResearchPaperChatBot


📚 Research Paper Assistant using Agentic RAG (Retrieval Augmented Generation) with LangChain and Supabase

Project Description

This project is a research paper assistant that lets users upload PDFs and chat with their contents using Retrieval Augmented Generation (RAG). It combines LangChain, OpenAI embeddings, and a Supabase vector database: the system embeds the question, retrieves the most relevant passages from the uploaded document, and passes them to the model so that answers are grounded in the document's text. It demonstrates a practical use of AI-powered document understanding and agentic workflows.
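The retrieve-then-generate flow described above can be sketched in a few lines. This is an illustrative toy and not the project's code: `embed` is a hypothetical stand-in that counts words instead of calling OpenAI embeddings, the sample chunks are made up, and `build_prompt` only assembles the text that would be sent to the model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Made-up sample chunks standing in for text extracted from a PDF.
chunks = [
    "Transformers use self-attention to model long-range dependencies.",
    "The test set accuracy reached 92 percent.",
    "Future work includes larger datasets.",
]
question = "What accuracy was reached on the test set?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

In the real application the word counts would be replaced by embedding vectors and the in-memory list by the Supabase table, but the shape of the flow stays the same: score, pick the top matches, and hand them to the model alongside the question.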

Tools Used

  • Python 3.11+ – programming language
  • Streamlit – web app interface
  • LangChain – for RAG-based retrieval and agent orchestration
  • OpenAI (gpt-4o & embeddings) – LLM for question answering and vector embeddings
  • Supabase – PostgreSQL + pgvector for vector storage and similarity search
  • dotenv – for environment variable management

Prerequisites

  • Python 3.11+

Installation

1. Clone the repository:

git clone

2. Create a virtual environment

python -m venv venv

3. Activate the virtual environment

On Windows: venv\Scripts\activate
On macOS/Linux: source venv/bin/activate

4. Install libraries

pip install -r requirements.txt

5. Create Supabase and OpenAI accounts (their API keys are needed in step 7)

6. Execute SQL queries in Supabase

Run the following SQL in the Supabase SQL editor:

-- Enable the pgvector extension to work with embedding vectors
create extension if not exists vector;

-- Create a table to store your documents
create table
  documents (
    id uuid primary key,
    content text, -- corresponds to Document.page_content
    metadata jsonb, -- corresponds to Document.metadata
    embedding vector (1536) -- 1536 works for OpenAI embeddings, change if needed
  );

-- Create a function to search for documents
create function match_documents (
  query_embedding vector (1536),
  filter jsonb default '{}'
) returns table (
  id uuid,
  content text,
  metadata jsonb,
  similarity float
) language plpgsql as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where metadata @> filter
  order by documents.embedding <=> query_embedding;
end;
$$;
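To make the ranking logic of `match_documents` concrete, here is a small in-memory emulation. It is a sketch under stated assumptions, not the code the app runs: the rows and 3-dimensional vectors are made up, and the metadata check is a simplified stand-in for `@>` that only compares top-level keys. pgvector's `<=>` operator returns cosine distance, so `1 - distance` is the cosine similarity.

```python
import math


def cosine_distance(a, b):
    """Mirror pgvector's <=> operator: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def match_documents(rows, query_embedding, filter=None):
    """Emulate the SQL function over a list of row dicts."""
    filter = filter or {}
    scored = []
    for row in rows:
        meta = row["metadata"]
        # Simplified `metadata @> filter`: top-level key/value equality only.
        if all(k in meta and meta[k] == v for k, v in filter.items()):
            dist = cosine_distance(row["embedding"], query_embedding)
            scored.append((dist, row))
    # `order by embedding <=> query_embedding`: smallest distance first.
    scored.sort(key=lambda pair: pair[0])
    return [
        {"id": r["id"], "content": r["content"],
         "metadata": r["metadata"], "similarity": 1.0 - d}
        for d, r in scored
    ]


# Made-up rows with tiny vectors (the real table uses 1536 dimensions).
rows = [
    {"id": "a", "content": "intro", "metadata": {"source": "paper.pdf"},
     "embedding": [1.0, 0.0, 0.0]},
    {"id": "b", "content": "methods", "metadata": {"source": "paper.pdf"},
     "embedding": [0.0, 1.0, 0.0]},
    {"id": "c", "content": "notes", "metadata": {"source": "notes.pdf"},
     "embedding": [1.0, 1.0, 0.0]},
]
everything = match_documents(rows, [1.0, 0.0, 0.0])
paper_only = match_documents(rows, [1.0, 0.0, 0.0], {"source": "paper.pdf"})
```

Note that the SQL function itself returns every matching row in ranked order; any limit on the number of results is applied by the caller.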

7. Add API keys to .env file

  • Rename .env.example to .env
  • Add the API keys for Supabase and OpenAI to the .env file
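A quick check like the following can confirm that the keys are visible to Python before launching the app. It is only a sketch: the variable names in `REQUIRED_KEYS` are hypothetical, so substitute the names that appear in `.env.example`. The `load_dotenv` call comes from the python-dotenv package and is skipped if that package is not installed.

```python
import os

# Hypothetical variable names; use the ones listed in .env.example.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SUPABASE_URL", "SUPABASE_SERVICE_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are absent or blank in env."""
    return [k for k in required if not env.get(k, "").strip()]


try:
    # python-dotenv copies entries from a local .env file into os.environ.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass  # Fall back to whatever is already set in the environment.

missing = missing_keys(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
```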

Executing the scripts

  • Open a terminal in VS Code

  • Execute the following command:

streamlit run app.py
