ArmaanSeth · mash786 · Oct 19, 2024 · Oct 19, 2024
diff --git a/README.md b/README.md
@@ -1,72 +1,48 @@
-# PDF Intelligence System with Retrieval Augmented Generation (RAG)
+# ChatPDF
 
-## Overview
+A **Streamlit** web application that lets users upload a PDF file, extracts its text, and uses **Google Gemini AI** to answer questions about that content.
 
-The goal of this project is to create a user-centric and intelligent system that enhances information retrieval from PDF documents through natural language queries. The project focuses on streamlining the user experience by developing an intuitive interface, allowing users to interact with PDF content using language they are comfortable with. To achieve this, we leverage the Retrieval Augmented Generation (RAG) methodology introduced by Meta AI researchers.
+## Features
 
+- **PDF File Upload**: Upload a PDF file, and the app extracts its content.
+- **PDF Content Preview**: Preview the extracted text from the uploaded PDF.
+- **Interactive AI Chatbot**: Ask questions based on the PDF content, and Google Gemini AI provides answers.
+- **Response Generation**: The app sends the extracted text together with the question to the Google Gemini AI API (`gemini-1.5-flash`) and displays the answer.
 
-https://github.com/ArmaanSeth/ChatPDF/assets/99117431/2500f636-c66d-46ad-bb68-1d55f04ce753
+## How to Use
 
+### 1. Navigate to the project directory:
 
-## Retrieval Augmented Generation (RAG)
 
-### Introduction
+### 2. Install required dependencies: `pip install streamlit pymupdf google-generativeai`
 
-RAG is a method designed to address knowledge-intensive tasks, particularly in information retrieval. It combines an information retrieval component with a text generator model to achieve adaptive and efficient knowledge processing. Unlike traditional methods that require retraining the entire model for knowledge updates, RAG allows for fine-tuning and modification of internal knowledge without extensive retraining.
 
-### Workflow
+### 3. Set up Google Gemini API:
 
-1. **Input**: RAG takes multiple pdf as input.
-2. **VectoreStore**: The pdf's are then converted to vectorstore using FAISS and all-MiniLM-L6-v2 Embeddings model from Hugging Face.
-3. **Memory**: Conversation buffer memory is used to maintain a track of previous conversation which are fed to the llm model along with the user query.
-4. **Text Generation with GPT-3.5 Turbo**: The embedded input is fed to the GPT-3.5 Turbo model from the OpenAI API, which produces the final output.
-5. **User Interface**: Streamlit is used to create the interface for the application.
+- Sign up for the **Google Gemini API** and obtain your API key.
+- Open the `app.py` file and find the line `genai.configure(api_key="YourAPIKEY")`.
 
-### Benefits
 
-- **Adaptability**: RAG adapts to situations where facts may evolve over time, making it suitable for dynamic knowledge domains.
-- **Efficiency**: By combining retrieval and generation, RAG provides access to the latest information without the need for extensive model retraining.
-- **Reliability**: The methodology ensures reliable outputs by leveraging both retrieval-based and generative approaches.
 
-## Project Features
+Replace `"YourAPIKEY"` with your actual API key.
 
-1. **User-friendly Interface**: An intuitive interface designed to accommodate natural language queries, simplifying the interaction with PDF documents.
+### 4. Run the app: `streamlit run app.py`
 
-2. **Seamless Navigation**: The system streamlines information retrieval, reducing complexity and enhancing the overall user experience.
 
-## Getting Started
+### 5. Open a browser and visit the address Streamlit prints in the terminal (by default `http://localhost:8501`)
 
-To use the PDF Intelligence System:
 
-1. Clone the repository to your local machine.
-   ```bash
-   git clone https://github.com/ArmaanSeth/ChatPDF.git
-   ```
+## How It Works
 
-2. Install dependencies.
-   ```bash
-   pip install -r requirements.txt
-   ```
+- **Upload PDF**: The user uploads a PDF file, and the app extracts the text content.
+- **PDF Content Preview**: The first 1500 characters of the extracted text are displayed.
+- **User Question**: The user asks a question related to the PDF content.
+- **AI Response**: The app uses the Google Gemini AI API to generate a response, which is then displayed.
 
-3. Run the application.
-   ```bash
-   streamlit run app.py
-   ```
+## Dependencies
 
-4. Open your browser and navigate to `http://localhost:8000` to access the user interface.
+- **Streamlit**: For building the web app interface.
+- **PyMuPDF (fitz)**: For extracting text from the uploaded PDF.
+- **Google Gemini AI**: For generating responses to user queries.
 
-## Contributing
 
-We welcome contributions to enhance the PDF Intelligence System. If you're interested in contributing, please follow our [Contribution Guidelines](CONTRIBUTING.md).
-
-## License
-
-This project is licensed under the [Apache License](LICENSE).
-
-## Acknowledgments
-
-We would like to express our gratitude to the Hugging Face community for the all-MiniLM-L6-v2 Embeddings model, and OpenAI for providing the GPT-3.5 Turbo model through their API.
-
----
-
-Feel free to explore and enhance the capabilities of the PDF Intelligence System. Happy querying!
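
Review note on README step 3: the new instructions have users paste their key directly into `app.py`, which makes it easy to commit a real key by accident. One possible alternative is to read the key from an environment variable. The sketch below assumes a variable named `GEMINI_API_KEY` and a helper called `load_api_key`; both names are hypothetical and do not appear in this PR.

```python
import os


def load_api_key(env=None, var_name="GEMINI_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is unset or blank.

    `GEMINI_API_KEY` and this helper are illustrative names, not part of the PR.
    """
    # Allow a plain dict to be passed in so the helper is easy to test.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before starting the app"
        )
    return key


# In app.py this could replace the hard-coded placeholder, for example:
#     genai.configure(api_key=load_api_key())
```

With this in place, the README step would become "export the variable, then run the app" rather than "edit the source file".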
diff --git a/app.py b/app.py
@@ -1,106 +1,72 @@
-# importing dependencies
-from dotenv import load_dotenv
 import streamlit as st
-from PyPDF2 import PdfReader
-from langchain.text_splitter import CharacterTextSplitter
-from langchain.embeddings import HuggingFaceEmbeddings
-from langchain.vectorstores import faiss
-from langchain.prompts import PromptTemplate
-from langchain.memory import ConversationBufferMemory
-from langchain.chains import ConversationalRetrievalChain
-from langchain.chat_models import ChatOpenAI
-from htmlTemplates import css, bot_template, user_template
+import fitz  # PyMuPDF for PDF extraction
+import google.generativeai as genai
 
-# creating custom template to guide llm model
-custom_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.
-Chat History:
-{chat_history}
-Follow Up Input: {question}
-Standalone question:"""
+# Configure the Google Gemini client; replace the placeholder with your own API key
+genai.configure(api_key="YourAPIKEY")
 
-CUSTOM_QUESTION_PROMPT = PromptTemplate.from_template(custom_template)
-
-# extracting text from pdf
-def get_pdf_text(docs):
-    text=""
-    for pdf in docs:
-        pdf_reader=PdfReader(pdf)
-        for page in pdf_reader.pages:
-            text+=page.extract_text()
-    return text
-
-# converting text to chunks
-def get_chunks(raw_text):
-    text_splitter=CharacterTextSplitter(separator="\n",
-                                        chunk_size=1000,
-                                        chunk_overlap=200,
-                                        length_function=len)   
-    chunks=text_splitter.split_text(raw_text)
-    return chunks
+# Streamlit interface
+def main():
+    st.set_page_config(page_title="Gemini AI Chatbot with PDF Support", page_icon=":robot_face:", layout="wide")
 
-# using all-MiniLm embeddings model and faiss to get vectorstore
-def get_vectorstore(chunks):
-    embeddings=HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2",
-                                     model_kwargs={'device':'cpu'})
-    vectorstore=faiss.FAISS.from_texts(texts=chunks,embedding=embeddings)
-    return vectorstore
+    st.title("Gemini AI Chatbot with PDF Support")
+    st.markdown("""
+    This app allows you to upload a PDF file, ask questions based on the PDF content, and get answers using the **Google Gemini AI API**.
+    """)
+
+    # File upload section
+    st.subheader("Step 1: Upload Your PDF File")
+    uploaded_file = st.file_uploader("Upload a PDF file", type="pdf", label_visibility="collapsed")
+
+    # If PDF is uploaded
+    if uploaded_file:
+        # PDF Extraction
+        with st.spinner("Extracting text from PDF..."):
+            pdf_text = extract_pdf_text(uploaded_file)
+
+        # PDF Preview
+        st.subheader("PDF Content Preview")
+        st.text_area("Preview of PDF Content", pdf_text[:1500], height=300)
 
-# generating conversation chain  
-def get_conversationchain(vectorstore):
-    llm=ChatOpenAI(temperature=0.2)
-    memory = ConversationBufferMemory(memory_key='chat_history', 
-                                      return_messages=True,
-                                      output_key='answer') # using conversation buffer memory to hold past information
-    conversation_chain = ConversationalRetrievalChain.from_llm(
-                                llm=llm,
-                                retriever=vectorstore.as_retriever(),
-                                condense_question_prompt=CUSTOM_QUESTION_PROMPT,
-                                memory=memory)
-    return conversation_chain
+        # User question input
+        st.subheader("Step 2: Enter Your Question")
+        user_input = st.text_input("Ask a question based on the PDF content:")
 
-# generating response from user queries and displaying them accordingly
-def handle_question(question):
-    response=st.session_state.conversation({'question': question})
-    st.session_state.chat_history=response["chat_history"]
-    for i,msg in enumerate(st.session_state.chat_history):
-        if i%2==0:
-            st.write(user_template.replace("{{MSG}}",msg.content,),unsafe_allow_html=True)
-        else:
-            st.write(bot_template.replace("{{MSG}}",msg.content),unsafe_allow_html=True)
+        if st.button("Generate Response"):
+            if user_input:
+                with st.spinner("Generating response..."):
+                    response = generate_response(user_input, pdf_text)
+                    st.subheader("AI Response:")
+                    st.write(response)
+            else:
+                st.warning("Please enter a question to get a response!")
 
+# Extract PDF text function
+def extract_pdf_text(uploaded_file):
+    """Extract text from PDF using PyMuPDF (fitz)."""
+    doc = fitz.open(stream=uploaded_file.read(), filetype="pdf")
+    pdf_text = ""
+    for page_num in range(len(doc)):
+        page = doc.load_page(page_num)
+        pdf_text += page.get_text("text")
+    return pdf_text
 
-def main():
-    load_dotenv()
-    st.set_page_config(page_title="Chat with multiple PDFs",page_icon=":books:")
-    st.write(css,unsafe_allow_html=True)
-    if "conversation" not in st.session_state:
-        st.session_state.conversation=None
+# Generate response function using Google Gemini API
+def generate_response(prompt, pdf_text):
+    """Generate response based on PDF content using Google Gemini API."""
+    try:
+        # Generate a response by combining the PDF content with the user prompt
+        context = f"Here is the content extracted from the PDF:\n\n{pdf_text}\n\nUser's Question: {prompt}\nAnswer:"
 
-    if "chat_history" not in st.session_state:
-        st.session_state.chat_history=None
-
-    st.header("Chat with multiple PDFs :books:")
-    question=st.text_input("Ask question from your document:")
-    if question:
-        handle_question(question)
-    with st.sidebar:
-        st.subheader("Your documents")
-        docs=st.file_uploader("Upload your PDF here and click on 'Process'",accept_multiple_files=True)
-        if st.button("Process"):
-            with st.spinner("Processing"):
-
-                #get the pdf
-                raw_text=get_pdf_text(docs)
-
-                #get the text chunks
-                text_chunks=get_chunks(raw_text)
-
-                #create vectorstore
-                vectorstore=get_vectorstore(text_chunks)
-
-                #create conversation chain
-                st.session_state.conversation=get_conversationchain(vectorstore)
+        # Send the combined prompt as text-only input via generate_content
+        model = genai.GenerativeModel("gemini-1.5-flash")
+        response = model.generate_content(context)
 
+        # Return the response's text
+        return response.text
+    except Exception as e:
+        return f"Error occurred: {str(e)}"
 
-if __name__ == '__main__':
-    main()
+# Run the app
+if __name__ == "__main__":
+    main()
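
Review note on `generate_response`: the function places the entire extracted text in the prompt, so a very large PDF could produce a prompt longer than the model accepts. A minimal sketch of one way to bound the prompt size follows. It keeps the prompt wording from this PR, while the helper name `build_prompt` and the `max_chars` default are assumptions chosen for illustration and do not reflect a documented Gemini limit.

```python
def build_prompt(pdf_text, question, max_chars=30_000):
    """Combine PDF text and a question into one prompt, truncating long PDFs.

    `max_chars` is an arbitrary illustrative budget, not a documented Gemini limit.
    """
    excerpt = pdf_text[:max_chars]
    # Tell the model when it is only seeing part of the document.
    note = "\n\n[PDF text truncated]" if len(pdf_text) > max_chars else ""
    return (
        "Here is the content extracted from the PDF:\n\n"
        f"{excerpt}{note}\n\n"
        f"User's Question: {question}\nAnswer:"
    )


# Inside generate_response this could replace the inline f-string, for example:
#     context = build_prompt(pdf_text, prompt)
```

Simple truncation drops everything past the cutoff, so questions about later pages would go unanswered. It is shown here only because it is the smallest change to the existing flow.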