74 changes: 25 additions & 49 deletions README.md
@@ -1,72 +1,48 @@
# PDF Intelligence System with Retrieval Augmented Generation (RAG)
# ChatPDF

## Overview
A **Streamlit** web application that allows users to upload a PDF file, extract the content, and interact with **Google Gemini AI** to ask questions based on the extracted PDF content and receive AI-generated responses.

The goal of this project is to create a user-centric and intelligent system that enhances information retrieval from PDF documents through natural language queries. The project focuses on streamlining the user experience by developing an intuitive interface, allowing users to interact with PDF content using language they are comfortable with. To achieve this, we leverage the Retrieval Augmented Generation (RAG) methodology introduced by Meta AI researchers.
## Features

- **PDF File Upload**: Upload a PDF file, and the app extracts its content.
- **PDF Content Preview**: Preview the extracted text from the uploaded PDF.
- **Interactive AI Chatbot**: Ask questions based on the PDF content, and Google Gemini AI provides answers.
- **Response Generation**: The app uses the Google Gemini AI API to generate answers from the PDF.

https://github.com/ArmaanSeth/ChatPDF/assets/99117431/2500f636-c66d-46ad-bb68-1d55f04ce753
## How to Use

### 1. Navigate to the project directory:

## Retrieval Augmented Generation (RAG)

### Introduction
### 2. Install required dependencies:

RAG is a method designed to address knowledge-intensive tasks, particularly in information retrieval. It combines an information retrieval component with a text generator model to achieve adaptive and efficient knowledge processing. Unlike traditional methods that require retraining the entire model for knowledge updates, RAG allows for fine-tuning and modification of internal knowledge without extensive retraining.

### Workflow
### 3. Set up Google Gemini API:

1. **Input**: RAG takes multiple PDFs as input.
2. **Vector store**: The PDFs are converted to a vector store using FAISS and the all-MiniLM-L6-v2 embeddings model from Hugging Face.
3. **Memory**: Conversation buffer memory keeps track of the previous conversation, which is fed to the LLM along with the user query.
4. **Text Generation with GPT-3.5 Turbo**: The retrieved chunks and the user query are passed to the GPT-3.5 Turbo model from the OpenAI API, which produces the final output.
5. **User Interface**: Streamlit is used to create the interface for the application.
- Sign up for the **Google Gemini API** and obtain your API key.
- Open the `app.py` file and configure the API key:

### Benefits

- **Adaptability**: RAG adapts to situations where facts may evolve over time, making it suitable for dynamic knowledge domains.
- **Efficiency**: By combining retrieval and generation, RAG provides access to the latest information without the need for extensive model retraining.
- **Reliability**: The methodology ensures reliable outputs by leveraging both retrieval-based and generative approaches.

## Project Features
Replace `"YourAPIKEY"` with your actual API key.
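A key hardcoded in `app.py` is easy to commit by accident. As one possible alternative, the sketch below reads the key from an environment variable. The variable name `GEMINI_API_KEY` and the helper `load_api_key` are assumptions made for illustration; neither appears in the project.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing."""
    # Hypothetical helper: the variable name is an assumption, not a project convention.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running the app.")
    return key
```

In `app.py`, the result could then be passed to the existing call as `genai.configure(api_key=load_api_key())`.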

1. **User-friendly Interface**: An intuitive interface designed to accommodate natural language queries, simplifying the interaction with PDF documents.
### 4. Run the app:

2. **Seamless Navigation**: The system streamlines information retrieval, reducing complexity and enhancing the overall user experience.

## Getting Started
### 5. Open a browser and visit:

To use the PDF Intelligence System:

1. Clone the repository to your local machine.
```bash
git clone https://github.com/ArmaanSeth/ChatPDF.git
```
## How It Works

2. Install dependencies.
```bash
pip install -r requirements.txt
```
- **Upload PDF**: The user uploads a PDF file, and the app extracts the text content.
- **PDF Content Preview**: The first 1500 characters of the extracted text are displayed.
- **User Question**: The user asks a question related to the PDF content.
- **AI Response**: The app uses the Google Gemini AI API to generate a response, which is then displayed.
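
The preview and prompt steps in the list above can be sketched in plain Python. The sketch mirrors the string handling in `app.py` (a 1500-character preview and a prompt that places the PDF text before the question). The function names `preview_text` and `build_prompt` are hypothetical, and the Gemini call itself is left out.

```python
PREVIEW_CHARS = 1500  # preview length stated in the list above


def preview_text(pdf_text: str, limit: int = PREVIEW_CHARS) -> str:
    """Return the leading slice of the extracted text shown to the user."""
    return pdf_text[:limit]


def build_prompt(pdf_text: str, question: str) -> str:
    """Combine the extracted PDF text and the user's question into one prompt."""
    return (
        "Here is the content extracted from the PDF:\n\n"
        f"{pdf_text}\n\n"
        f"User's Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    text = "Streamlit is a Python framework for data apps."
    print(preview_text(text, limit=9))
    print(build_prompt(text, "What is Streamlit?"))
```

Because the whole extracted text goes into the prompt, a very long PDF produces a very long request; the app as written does not bound it.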

3. Run the application.
```bash
streamlit run app.py
```
## Dependencies

4. Open your browser and navigate to `http://localhost:8501` (Streamlit's default port) to access the user interface.
- **Streamlit**: For building the web app interface.
- **PyMuPDF (fitz)**: For extracting text from the uploaded PDF.
- **Google Gemini AI**: For generating responses to user queries.

## Contributing

We welcome contributions to enhance the PDF Intelligence System. If you're interested in contributing, please follow our [Contribution Guidelines](CONTRIBUTING.md).

## License

This project is licensed under the [Apache License](LICENSE).

## Acknowledgments

We would like to express our gratitude to the Hugging Face community for the all-MiniLM-L6-v2 Embeddings model, and OpenAI for providing the GPT-3.5 Turbo model through their API.

---

Feel free to explore and enhance the capabilities of the PDF Intelligence System. Happy querying!
156 changes: 61 additions & 95 deletions app.py
@@ -1,106 +1,72 @@
# importing dependencies
from dotenv import load_dotenv
import streamlit as st
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import faiss
from langchain.prompts import PromptTemplate
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from htmlTemplates import css, bot_template, user_template
import fitz # PyMuPDF for PDF extraction
import google.generativeai as genai

# creating custom template to guide llm model
custom_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.
Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
# Manually pass the API key for Google Gemini API
genai.configure(api_key="YourAPIKEY")

CUSTOM_QUESTION_PROMPT = PromptTemplate.from_template(custom_template)

# extracting text from pdf
def get_pdf_text(docs):
    text=""
    for pdf in docs:
        pdf_reader=PdfReader(pdf)
        for page in pdf_reader.pages:
            text+=page.extract_text()
    return text

# converting text to chunks
def get_chunks(raw_text):
    text_splitter=CharacterTextSplitter(separator="\n",
                                        chunk_size=1000,
                                        chunk_overlap=200,
                                        length_function=len)
    chunks=text_splitter.split_text(raw_text)
    return chunks
# Streamlit interface
def main():
    st.set_page_config(page_title="Gemini AI Chatbot with PDF Support", page_icon=":robot_face:", layout="wide")

# using all-MiniLm embeddings model and faiss to get vectorstore
def get_vectorstore(chunks):
    embeddings=HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2",
                                     model_kwargs={'device':'cpu'})
    vectorstore=faiss.FAISS.from_texts(texts=chunks,embedding=embeddings)
    return vectorstore
    st.title("Gemini AI Chatbot with PDF Support")
    st.markdown("""
This app allows you to upload a PDF file, ask questions based on the PDF content, and get answers using the **Google Gemini AI API**.
""")

    # File upload section
    st.subheader("Step 1: Upload Your PDF File")
    uploaded_file = st.file_uploader("Upload a PDF file", type="pdf", label_visibility="collapsed")

    # If PDF is uploaded
    if uploaded_file:
        # PDF Extraction
        with st.spinner("Extracting text from PDF..."):
            pdf_text = extract_pdf_text(uploaded_file)

        # PDF Preview
        st.subheader("PDF Content Preview")
        st.text_area("Preview of PDF Content", pdf_text[:1500], height=300)

# generating conversation chain
def get_conversationchain(vectorstore):
    llm=ChatOpenAI(temperature=0.2)
    memory = ConversationBufferMemory(memory_key='chat_history',
                                      return_messages=True,
                                      output_key='answer') # using conversation buffer memory to hold past information
    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectorstore.as_retriever(),
        condense_question_prompt=CUSTOM_QUESTION_PROMPT,
        memory=memory)
    return conversation_chain
        # User question input
        st.subheader("Step 2: Enter Your Question")
        user_input = st.text_input("Ask a question based on the PDF content:")

# generating response from user queries and displaying them accordingly
def handle_question(question):
    response=st.session_state.conversation({'question': question})
    st.session_state.chat_history=response["chat_history"]
    for i,msg in enumerate(st.session_state.chat_history):
        if i%2==0:
            st.write(user_template.replace("{{MSG}}",msg.content,),unsafe_allow_html=True)
        else:
            st.write(bot_template.replace("{{MSG}}",msg.content),unsafe_allow_html=True)
        if st.button("Generate Response"):
            if user_input:
                with st.spinner("Generating response..."):
                    response = generate_response(user_input, pdf_text)
                st.subheader("AI Response:")
                st.write(response)
            else:
                st.warning("Please enter a question to get a response!")

# Extract PDF text function
def extract_pdf_text(uploaded_file):
    """Extract text from PDF using PyMuPDF (fitz)."""
    doc = fitz.open(stream=uploaded_file.read(), filetype="pdf")
    pdf_text = ""
    for page_num in range(len(doc)):
        page = doc.load_page(page_num)
        pdf_text += page.get_text("text")
    return pdf_text

def main():
    load_dotenv()
    st.set_page_config(page_title="Chat with multiple PDFs",page_icon=":books:")
    st.write(css,unsafe_allow_html=True)
    if "conversation" not in st.session_state:
        st.session_state.conversation=None
# Generate response function using Google Gemini API
def generate_response(prompt, pdf_text):
    """Generate response based on PDF content using Google Gemini API."""
    try:
        # Generate a response by combining the PDF content with the user prompt
        context = f"Here is the content extracted from the PDF:\n\n{pdf_text}\n\nUser's Question: {prompt}\nAnswer:"

    if "chat_history" not in st.session_state:
        st.session_state.chat_history=None

    st.header("Chat with multiple PDFs :books:")
    question=st.text_input("Ask question from your document:")
    if question:
        handle_question(question)
    with st.sidebar:
        st.subheader("Your documents")
        docs=st.file_uploader("Upload your PDF here and click on 'Process'",accept_multiple_files=True)
        if st.button("Process"):
            with st.spinner("Processing"):

                #get the pdf
                raw_text=get_pdf_text(docs)

                #get the text chunks
                text_chunks=get_chunks(raw_text)

                #create vectorstore
                vectorstore=get_vectorstore(text_chunks)

                #create conversation chain
                st.session_state.conversation=get_conversationchain(vectorstore)
        # Use the correct generate_content method for text-only input
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(context)

        # Return the response's text
        return response.text
    except Exception as e:
        return f"Error occurred: {str(e)}"

if __name__ == '__main__':
    main()
# Run the app
if __name__ == "__main__":
    main()