An AI-powered system that automates compliance checking for structural design drawings (PDFs). The system parses RCC (Reinforced Cement Concrete) foundation drawings, extracts key parameters using computer vision, and generates comprehensive compliance reports aligned with Indian Standards (IS 456:2000 and SP 34:1987).
- PDF to Image Conversion: Converts structural design PDFs to high-quality images at 200 DPI
- AI-Powered Analysis: Uses Google Gemini Vision AI (via OpenRouter) to extract structural information
- Comprehensive Data Extraction: Extracts dimensions, reinforcement details, material specifications, and compliance parameters
- Automated Report Generation: Creates detailed markdown reports with extracted data and compliance status
- RAG-Enhanced Analysis: Retrieval-Augmented Generation (RAG) system for accurate IS code clause citations
- Vector Database Integration: ChromaDB for semantic search of IS code provisions
- Vision-Assisted Analysis: Google Gemini 2.5 Flash vision model for drawing interpretation
- PDF Processing: PyMuPDF for high-quality PDF to image conversion
- Vector Database: ChromaDB with sentence-transformers embeddings for code provision retrieval
- CLI Workflow: Simple command-line interface for batch processing
- Comprehensive Extraction: 22+ compliance criteria, including:
  - Concrete and reinforcement grades
  - Cover requirements and development lengths
  - Foundation specifications
  - Seismic zone considerations
  - Detailing requirements
- Professional Reports: Markdown reports with timestamped versions
- Python 3.10+ (Python 3.11 recommended)
- Git (for cloning the repository)
- Internet access (for AI API calls)
- API Key: OpenRouter API key for Gemini Vision access
git clone <repository-url>
cd WO-380
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# On Windows
.venv\Scripts\activate
# On macOS/Linux
source .venv/bin/activate
pip install -r requirements.txt
Create a .env file in the project root directory:
# Required: OpenRouter API Key for Gemini Vision
OPENROUTER_API_KEY=your_openrouter_api_key_here
# Optional: For direct Gemini API access (if needed)
GEMINI_API_KEY=your_gemini_api_key_here
- OpenRouter API Key: Visit OpenRouter to create an account and get your API key
- Gemini API Key (optional): Visit Google AI Studio for direct Gemini access
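As a quick sanity check before running the tool, the sketch below reads a `.env` file with the standard library and reports whether `OPENROUTER_API_KEY` is set. It is only an illustration: the helper names `parse_env_file` and `has_openrouter_key` are hypothetical, and the project itself may load the file differently (for example with a dotenv library).

```python
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_openrouter_key(env_values):
    """Return True when a non-placeholder OPENROUTER_API_KEY is present."""
    key = env_values.get("OPENROUTER_API_KEY", "")
    return bool(key) and key != "your_openrouter_api_key_here"


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found in the current directory.")
    elif has_openrouter_key(parse_env_file(env_path)):
        print("OPENROUTER_API_KEY is set.")
    else:
        print("OPENROUTER_API_KEY is missing or still the placeholder.")
```

This simple parser does not strip trailing inline comments, so keep comments on their own lines when using it.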
- Place your PDF file in the uploads/ directory:
  # Ensure the uploads directory exists
  mkdir -p uploads
  # Copy your PDF file
  cp your_drawing.pdf uploads/
- Run the analysis:
  python app.py your_drawing.pdf
- View the report: The system will:
  - Convert PDF to images
  - Analyze using Gemini Vision AI
  - Generate a comprehensive compliance report
  - Save results to reports/initial_report_<timestamp>.md
--- RCC Design Compliance Check AI ---
Note: Please make sure you have created a .env file with your OPENROUTER_API_KEY.
Found PDF file: uploads/7.pdf
Starting initial analysis...
Initial analysis complete.
--- INITIAL REPORT ---
[Comprehensive markdown report with compliance checklist]
Report saved to reports/initial_report_1234567890.md
Compliance check process finished.
The main entry point is app.py, which provides a simple CLI:
python app.py <filename>
Arguments:
filename: The name of the PDF file in the uploads/ directory (e.g., 7.pdf)
What it does:
- Validates that the PDF file exists in uploads/
- Converts PDF pages to high-resolution images (200 DPI)
- Sends images to Gemini Vision AI for analysis
- Extracts compliance-relevant information
- Generates a structured markdown report
- Saves the report with a timestamp
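The first and last steps above (validating the input and naming the report) can be sketched with the standard library alone. This is a minimal illustration, not the code in app.py: the function names `resolve_upload`, `report_path`, and `save_report` are hypothetical, and the timestamp is assumed to be Unix epoch seconds based on the sample filename `initial_report_1234567890.md`.

```python
import time
from pathlib import Path


def resolve_upload(filename, uploads_dir="uploads"):
    """Return the path of a PDF inside uploads_dir, or raise if it is unusable."""
    pdf_path = Path(uploads_dir) / filename
    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {filename}")
    if not pdf_path.is_file():
        raise FileNotFoundError(f"File not found: {pdf_path}")
    return pdf_path


def report_path(reports_dir="reports", timestamp=None):
    """Build reports/initial_report_<timestamp>.md (epoch seconds assumed)."""
    if timestamp is None:
        timestamp = int(time.time())
    return Path(reports_dir) / f"initial_report_{timestamp}.md"


def save_report(markdown_text, reports_dir="reports", timestamp=None):
    """Write the report text to a timestamped file and return its path."""
    target = report_path(reports_dir, timestamp)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(markdown_text, encoding="utf-8")
    return target
```

Failing early on a missing or non-PDF input avoids spending an API call on a file that cannot be converted.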
You can also use the modules directly in your Python code:
from llm_handler import analyze_rcc_drawing
from prompt import INITIAL_EXTRACTION_PROMPT
# Analyze a PDF
pdf_path = "uploads/your_drawing.pdf"
report = analyze_rcc_drawing(pdf_path, INITIAL_EXTRACTION_PROMPT)
print(report)
from pdf_to_image import pdf_to_image
# Convert a specific page to image
image_path = pdf_to_image(
pdf_path="uploads/drawing.pdf",
output_image="output.jpg",
page_number=1, # Page to convert (1-indexed)
zoom=2 # Zoom factor for quality
)
from vector_db import VectorDB
from embedding_service import generate_embedding
# Initialize vector database
db = VectorDB()
# Generate embedding for query
query = "What are the cover requirements for foundations?"
embedding = generate_embedding(query)
# Search for relevant code provisions
results = db.query(embedding, top_k=5)
- Markdown Report: Comprehensive extraction report with timestamp
  - Location: reports/initial_report_<timestamp>.md
  - Format: GitHub-flavored markdown with tables and structured sections
The generated report includes:
- Document Type Verification: Confirms if the drawing is a foundation drawing
- Site Location: Extracted or flagged as missing
- Code Standards: References to IS 456:2000 and SP 34:1987
- NOTES Section: Verification of presence and completeness
- Compliance Checklist: 22+ criteria with status indicators:
  - ✅ Compliant
  - ❌ Non-Compliant
  - ⚠️ Missing Information
  - ❓ Cannot Verify
  - ➖ Not Applicable
- Extracted Values: All dimensions, specifications, and annotations
- Summary Statistics: Compliance percentage and breakdown
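One way the summary statistics could be computed is sketched below. The definition used here is an assumption, since this README does not spell out the formula: the compliance percentage is taken as compliant items divided by all items except those marked Not Applicable. The function name `summarize_statuses` is hypothetical.

```python
from collections import Counter

# Status labels as listed in the report structure above.
STATUSES = (
    "Compliant",
    "Non-Compliant",
    "Missing Information",
    "Cannot Verify",
    "Not Applicable",
)


def summarize_statuses(statuses):
    """Count each status and compute a compliance percentage.

    Assumption: 'Not Applicable' items are excluded from the denominator.
    """
    counts = Counter(statuses)
    unknown = set(counts) - set(STATUSES)
    if unknown:
        raise ValueError(f"Unknown status values: {sorted(unknown)}")
    applicable = sum(counts.values()) - counts["Not Applicable"]
    percent = 100.0 * counts["Compliant"] / applicable if applicable else 0.0
    breakdown = {status: counts[status] for status in STATUSES}
    return {"breakdown": breakdown, "compliance_percent": round(percent, 1)}


example = [
    "Compliant",
    "Compliant",
    "Non-Compliant",
    "Missing Information",
    "Not Applicable",
]
print(summarize_statuses(example)["compliance_percent"])  # prints 50.0
```

Whether missing or unverifiable items should count against the percentage is a policy choice; a stricter or looser denominator only requires changing the `applicable` line.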
## Compliance Checklist
| Criterion | Extracted Value | Code Requirement | Status | Notes |
|-----------|----------------|------------------|--------|-------|
| Concrete Grade | M25 | M20 minimum | ✅ Compliant | - |
| Reinforcement Grade | Fe415 | Fe415/Fe500 | ✅ Compliant | - |
| Clear Cover | 50mm | 40mm minimum | ✅ Compliant | - |

- app.py: CLI entrypoint; orchestrates extraction and saves reports
- llm_handler.py: High-level LLM workflow for RCC drawing analysis
  - analyze_rcc_drawing(): Main function for PDF analysis
  - pdf_to_images(): PDF to PIL Image conversion
  - analyze_rcc_drawing_from_images(): Vision API integration
- llm_service.py: LLM service helpers and report refinement
- gemini_vision.py: Gemini Vision integration utilities
- pdf_to_image.py: PDF to image conversion utilities
- vector_db.py: ChromaDB vector store integration
- embedding_service.py: Embedding generation using sentence-transformers
- prompt.py: Prompt templates for extraction and analysis
- data_loader.py: Data loading utilities for vector database
- requirements.txt: Python package dependencies
- METHODOLOGY.md: Detailed technical methodology and architecture
- DOCKER_SETUP.md: Docker development environment setup
WO-380/
├── app.py                 # CLI entrypoint
├── llm_handler.py         # Main LLM workflow
├── llm_service.py         # LLM service helpers
├── gemini_vision.py       # Gemini Vision integration
├── embedding_service.py   # Embeddings generation
├── vector_db.py           # ChromaDB integration
├── pdf_to_image.py        # PDF/image utilities
├── prompt.py              # Prompt templates
├── data_loader.py         # Data loading utilities
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── METHODOLOGY.md         # Technical documentation
├── DOCKER_SETUP.md        # Docker setup guide
├── .env                   # API keys (create this)
│
├── uploads/               # Place PDFs here
├── reports/               # Generated markdown reports
├── chroma_db/             # Vector DB data (auto-created)
├── convertedimages/       # Generated images (auto-created)
├── SP34_md/               # SP 34 code documents
└── sample_pdfs/           # Sample test PDFs
Note: Large/binary/data folders (uploads/, reports/, chroma_db/, convertedimages/) are ignored by git via .gitignore.
The project includes Docker support for a consistent development environment. See DOCKER_SETUP.md for detailed instructions.
# Build the Docker image
docker compose build
# Start the development environment
docker compose up -d
# Access the container shell
docker exec -it python-dev-env bash
# Inside container: install dependencies and run
pip install -r requirements.txt
python app.py 7.pdf
- "File not found" Error
  - Ensure the PDF is in the uploads/ directory
  - Check that the filename matches exactly (case-sensitive)
  - Verify the file extension is .pdf
- "OPENROUTER_API_KEY not found"
  - Create a .env file in the project root
  - Add OPENROUTER_API_KEY=your_key_here
  - Ensure the key is valid and has sufficient credits
- "Error converting PDF to image"
  - Check if the PDF is password-protected
  - Verify the PDF is not corrupted
  - Ensure PyMuPDF is installed correctly
- "ChromaDB errors"
  - Delete and re-create the chroma_db/ directory if the index is corrupted
  - Ensure write permissions in the project directory
- Poor extraction results
  - Use higher quality PDFs with clear text/drawings
  - Ensure drawings follow standard engineering notation
  - Check that the NOTES section is clearly visible
- Import errors
  - Ensure the virtual environment is activated
  - Run pip install -r requirements.txt again
  - Check Python version (3.10+ required)

- Check the generated image files in convertedimages/ to verify PDF conversion
- Review console output for specific error messages
- Verify all dependencies are installed: pip list
- Ensure API keys are valid and have sufficient credits
- Check METHODOLOGY.md for detailed technical information
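Several of the checks above can be automated. The sketch below verifies the Python version and whether key packages are importable. The import names `fitz` (PyMuPDF), `chromadb`, and `sentence_transformers` are the usual ones for these libraries, but the exact contents of requirements.txt are an assumption here, and the helper names are hypothetical.

```python
import importlib.util
import sys

# Assumed import names; adjust to match requirements.txt.
REQUIRED_MODULES = ("fitz", "chromadb", "sentence_transformers")


def python_version_ok(minimum=(3, 10)):
    """Return True when the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def missing_modules(module_names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {version} meets 3.10+: {python_version_ok()}")
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated virtual environment so the result reflects the packages the tool will actually see.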
# Required
OPENROUTER_API_KEY=your_openrouter_api_key
# Optional
GEMINI_API_KEY=your_gemini_api_key # For direct Gemini access
In pdf_to_image.py, you can adjust:
- Page number: Which page to convert (default: 1)
- Zoom factor: Image quality (higher = better quality, larger file)
- Output format: JPG, PNG, etc.
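The zoom factor and the 200 DPI figure mentioned earlier are related: PDF page sizes are expressed in points (72 per inch), so a zoom of 1 corresponds to 72 DPI. The sketch below shows only the arithmetic; it does not call PyMuPDF, and the function names are hypothetical.

```python
POINTS_PER_INCH = 72  # PDF user-space units per inch


def zoom_for_dpi(dpi):
    """Zoom factor that renders a PDF page at the requested DPI."""
    return dpi / POINTS_PER_INCH


def dpi_for_zoom(zoom):
    """Effective DPI produced by a given zoom factor."""
    return zoom * POINTS_PER_INCH


def pixel_size(width_pt, height_pt, dpi):
    """Approximate rendered size in pixels for a page measured in points."""
    zoom = zoom_for_dpi(dpi)
    return round(width_pt * zoom), round(height_pt * zoom)


print(dpi_for_zoom(2))              # 144
print(round(zoom_for_dpi(200), 2))  # 2.78
print(pixel_size(595, 842, 200))    # A4 page in points -> (1653, 2339)
```

Note that the `zoom=2` used in the earlier usage example corresponds to 144 DPI; a zoom of roughly 2.78 would be needed for 200 DPI.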
Edit prompt.py to customize extraction criteria:
- INITIAL_EXTRACTION_PROMPT: Main extraction prompt
- prompt1: Alternative detailed extraction prompt
- REFINEMENT_PROMPT_TEMPLATE: Report refinement prompt
The vector database uses:
- Embedding Model: all-MiniLM-L6-v2 (384 dimensions)
- Database: ChromaDB with persistent storage
- Retrieval: Top 5 most relevant provisions by cosine similarity
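To make the retrieval step concrete, the following sketch ranks a few toy vectors by cosine similarity and keeps the top results. It uses plain Python lists rather than ChromaDB or real 384-dimensional embeddings, and the provision labels are made-up placeholders, not actual IS code clauses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as having no similarity.
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=5):
    """Return the k (label, score) pairs most similar to the query."""
    scored = [
        (label, cosine_similarity(query, vector))
        for label, vector in documents.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Placeholder 3-dimensional "embeddings" for illustration only.
toy_index = {
    "provision-a": [1.0, 0.0, 0.0],
    "provision-b": [0.7, 0.7, 0.0],
    "provision-c": [0.0, 0.0, 1.0],
}

for label, score in top_k([1.0, 0.1, 0.0], toy_index, k=2):
    print(label, round(score, 3))
```

A vector database performs the same ranking, but with an index that avoids comparing the query against every stored vector.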
# Test environment setup
python test_env.py
# Test PDF download/processing
python test_pdf_download.py
The project includes Jupyter notebooks for exploration:
- main.ipynb: Main exploration notebook
- FINAL.ipynb: Final analysis notebook
To use Jupyter:
# Install Jupyter (if not already installed)
pip install jupyter jupyterlab
# Start Jupyter Lab
jupyter lab
The codebase follows a modular architecture:
- Input Processing: PDF conversion and image preprocessing
- Vision Analysis: AI-powered drawing interpretation
- Data Extraction: Structured information extraction
- RAG Enhancement: Code provision retrieval and citation
- Report Generation: Markdown report creation
- README.md: This file; quick start and usage guide
- METHODOLOGY.md: Comprehensive technical documentation including:
  - System architecture
  - RAG implementation details
  - AI model specifications
  - Workflow diagrams
  - Compliance criteria details
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes with clear commits
- Add/update documentation where relevant
- Test thoroughly
- Submit a pull request
[Add your license information here]
- Google Gemini Vision AI for image analysis capabilities
- OpenRouter for unified AI API access
- IS 456:2000 and SP 34:1987 code provisions for compliance checking
- ChromaDB and sentence-transformers for vector search capabilities
- The civil engineering community for domain expertise
- Support for additional Indian Standards (IS 1893, IS 13920)
- Web interface using Streamlit
- Batch processing for multiple drawings
- Interactive report refinement
- 3D drawing support
- Handwritten text recognition
- Multi-language support
Ready to automate your structural design compliance checks?
Run python app.py <your-file.pdf> and get your first report in minutes!