A tool for summarizing PDF documents using various AI models including OpenAI, Azure OpenAI, and Llama.
- Extract text from PDF documents
- Generate summaries using different AI models:
  - OpenAI (GPT-4)
  - Azure OpenAI
  - Llama (local model)
- Create formatted Word documents with the summaries
- Customizable section templates
- Clone the repository:
  git clone https://github.com/hermesdev0131/Generative_AI_Document_Summarizer.git
  cd Generative_AI_Document_Summarizer
- Install the required dependencies:
  pip install -r requirements.txt
- Configure your environment variables by creating a .env file:
# Model type: "openai", "azure", or "llama"
MODEL_TYPE=openai
# OpenAI configuration (used when MODEL_TYPE=openai)
OPENAI_API_KEY=your_openai_api_key_here
# Azure OpenAI configuration (used when MODEL_TYPE=azure)
AZURE_OPENAI_API_KEY=your_azure_api_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=your-deployment-name
# Llama configuration (used when MODEL_TYPE=llama)
LLAMA_MODEL_PATH=/path/to/your/models/model.gguf
- Place your PDF documents in the input_doc directory.
- Run the main script:
  python main.py
- The summarized documents will be saved as Word files in the output_doc directory.
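The input and output convention above can be sketched as a small helper that pairs each PDF in input_doc with the Word file it would produce in output_doc. This is a minimal sketch, not the project's actual code: the function name plan_outputs and the rule of reusing the PDF's base name for the .docx file are assumptions made for illustration.

```python
from pathlib import Path


def plan_outputs(input_dir="input_doc", output_dir="output_doc"):
    """Pair each PDF in input_dir with a .docx path in output_dir.

    Assumption: the output file reuses the PDF's base name. The real
    naming scheme used by main.py may differ.
    """
    in_path = Path(input_dir)
    out_path = Path(output_dir)
    if not in_path.is_dir():
        # Nothing to process if the input directory does not exist.
        return []
    pdfs = sorted(
        p for p in in_path.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    return [(pdf, out_path / (pdf.stem + ".docx")) for pdf in pdfs]


if __name__ == "__main__":
    for source, target in plan_outputs():
        print(f"{source} -> {target}")
```

Running the file directly lists the planned conversions without calling any model, which can be a cheap way to confirm that the PDFs are where the script expects them.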
You can choose between three different AI models by setting the MODEL_TYPE environment variable in your .env file:
- openai: Uses the OpenAI GPT-4 model
- azure: Uses Azure OpenAI services
- llama: Uses a local Llama model
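Because each MODEL_TYPE value depends on a different set of variables, it can help to validate the configuration before any document is processed. The sketch below is one possible check, not the project's own logic: the variable names come from the .env example, while REQUIRED_VARS and check_model_config are hypothetical names. It reads from a mapping such as os.environ and does not load the .env file itself.

```python
import os

# Variables each MODEL_TYPE needs, taken from the .env example.
REQUIRED_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "azure": [
        "AZURE_OPENAI_API_KEY",
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_DEPLOYMENT_NAME",
    ],
    "llama": ["LLAMA_MODEL_PATH"],
}


def check_model_config(env=None):
    """Return the selected model type, or raise ValueError if misconfigured."""
    env = os.environ if env is None else env
    model_type = env.get("MODEL_TYPE", "").strip().lower()
    if model_type not in REQUIRED_VARS:
        allowed = ", ".join(sorted(REQUIRED_VARS))
        raise ValueError(f"MODEL_TYPE must be one of: {allowed}")
    # Treat empty values the same as missing ones.
    missing = [name for name in REQUIRED_VARS[model_type] if not env.get(name)]
    if missing:
        raise ValueError(
            f"MODEL_TYPE={model_type} requires: {', '.join(missing)}"
        )
    return model_type
```

Passing a plain dictionary instead of os.environ makes the check easy to try out, for example check_model_config({"MODEL_TYPE": "llama", "LLAMA_MODEL_PATH": "/path/to/your/models/model.gguf"}).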
You can customize the template sections by modifying the template_sections.py file or by providing a template document in the template_doc directory.
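As an illustration of what a customized template might look like, the sketch below defines sections as a list of title and instruction pairs and turns them into a single prompt. The layout of template_sections.py is not documented here, so TEMPLATE_SECTIONS, its keys, the example section names, and build_prompt are all hypothetical and should be adapted to the file's actual structure.

```python
# Hypothetical section layout; the real template_sections.py may differ.
TEMPLATE_SECTIONS = [
    {
        "title": "Overview",
        "instruction": "Summarize the document's purpose in 2-3 sentences.",
    },
    {
        "title": "Key Findings",
        "instruction": "List the main findings as short bullet points.",
    },
]


def build_prompt(document_text, sections):
    """Combine section instructions and document text into one prompt."""
    lines = ["Summarize the document below using these sections:"]
    for number, section in enumerate(sections, start=1):
        lines.append(f"{number}. {section['title']}: {section['instruction']}")
    lines.append("")
    lines.append("Document:")
    lines.append(document_text)
    return "\n".join(lines)
```

Under this assumed layout, adding or reordering a section would only mean editing the list; the prompt-building code stays unchanged.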
- Python 3.8+
- Required packages (see requirements.txt)
- For Llama model: A compatible GGUF model file
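Two of these requirements can be checked up front with the standard library alone. The sketch below is a rough pre-flight check rather than part of the project: python_ok and looks_like_gguf are hypothetical helper names, and the GGUF test only confirms that the file begins with the 4-byte "GGUF" magic, not that the model is compatible with the Llama runtime in use.

```python
import sys
from pathlib import Path


def python_ok(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)


def looks_like_gguf(path):
    """Return True if the file exists and starts with the GGUF magic bytes."""
    candidate = Path(path)
    if not candidate.is_file():
        return False
    with candidate.open("rb") as handle:
        return handle.read(4) == b"GGUF"


if __name__ == "__main__":
    print("Python 3.8+:", python_ok())
    if len(sys.argv) > 1:
        print("GGUF header found:", looks_like_gguf(sys.argv[1]))
```

Passing the value of LLAMA_MODEL_PATH as the first command-line argument checks the model file as well.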