mady20/devops-project

Sales Insight Automator

A secure, containerized application that transforms raw sales data into AI-powered executive summaries, delivered straight to your inbox.

Built for Rabbitt AI: AI Cloud DevOps Engineer Case Study


Architecture

┌───────────┐     ┌───────────────────────────────────────────────────┐
│           │     │                  FastAPI Backend                  │
│  React    │────▶│  ┌───────────┐  ┌───────────┐  ┌───────────────┐  │
│  SPA      │ API │  │ Security  │  │  Parser   │  │  AI Engine    │  │
│ (Vite+TS) │◀────│  │ Middleware│─▶│ (pandas)  │─▶│ (Gemini API)  │  │
│           │     │  └───────────┘  └───────────┘  └───────────────┘  │
└───────────┘     │       │                              │            │
                  │       │         ┌───────────────┐    │            │
                  │       └────────▶│ Email Service │◀───┘            │
                  │                 │  (Resend)     │                 │
                  │                 └───────────────┘                 │
                  └───────────────────────────────────────────────────┘

Flow

  1. The user uploads a .csv or .xlsx file and enters a recipient email
  2. The backend validates the inputs (file type, size, email format)
  3. The parser extracts structured business metrics via pandas
  4. Only the structured metrics (not the raw data) are sent to the Gemini LLM
  5. Gemini generates a professional executive summary
  6. The summary is emailed via Resend and previewed in the UI
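
Step 2 of the flow can be sketched as below. This is a minimal illustration using only the standard library, not the project's actual validator: the function name `validate_upload`, the error strings, and the email pattern are assumptions, while the extension list and 5 MB limit come from this README.

```python
import re
from pathlib import PurePath

ALLOWED_EXTENSIONS = {".csv", ".xlsx"}     # file types named in the flow
MAX_FILE_SIZE_BYTES = 5 * 1024 * 1024      # 5 MB, the documented MAX_FILE_SIZE_MB default
# Deliberately simple pattern; an assumption, not the project's actual regex.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_upload(filename: str, size_bytes: int, email: str) -> list:
    """Return a list of validation errors; an empty list means the input passed."""
    errors = []
    if PurePath(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        errors.append("unsupported file type")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        errors.append("file too large")
    if not EMAIL_RE.match(email):
        errors.append("invalid email address")
    return errors
```

Collecting every error instead of stopping at the first one lets the UI report all problems in a single round trip.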

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | React 18 + Vite + TypeScript |
| Backend | FastAPI (Python 3.12) |
| AI Engine | Google Gemini 2.0 Flash |
| Email | Resend |
| Containerization | Docker + Docker Compose |
| CI/CD | GitHub Actions |
| Frontend Host | Vercel |
| Backend Host | Render |

Quick Start: Docker Compose (Recommended)

# 1. Clone the repository
git clone https://github.com/YOUR_USERNAME/devops-project.git
cd devops-project

# 2. Configure environment
cp .env.example .env
# Edit .env and add your API keys:
#   GEMINI_API_KEY=your-gemini-key
#   RESEND_API_KEY=your-resend-key
#   API_KEY=any-strong-random-string

# 3. Build and run
docker-compose up --build

# 4. Open the app
#   Frontend: http://localhost:3000
#   API Docs: http://localhost:8000/docs
#   Health:   http://localhost:8000/health

Local Development (Without Docker)

Backend

cd backend
python -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt

# Create .env in the project root with your keys
uvicorn app.main:app --reload --port 8000

Frontend

cd frontend
npm install

# Create .env with:
#   VITE_API_URL=http://localhost:8000
#   VITE_API_KEY=your-api-key
npm run dev

API Documentation

Interactive Swagger docs are available at /docs when the backend is running.

Endpoints

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | /health | None | Health check with service configuration status |
| POST | /api/v1/sales/insights | API Key | Upload file + email, get AI summary |
| GET | /docs | None | Swagger UI |
| GET | /redoc | None | ReDoc UI |
| GET | /openapi.json | None | OpenAPI spec |

Sample curl

curl -X POST http://localhost:8000/api/v1/sales/insights \
  -H "X-API-Key: your-api-key" \
  -F "file=@sample_data/sales_q1_2026.csv" \
  -F "email=recipient@example.com"
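
For callers that prefer Python over curl, the same request can be assembled with the standard library. This is a sketch under stated assumptions: `build_insights_request` is a hypothetical helper, the multipart body is built by hand to avoid third-party dependencies, and the inline CSV bytes are made-up placeholder data. The endpoint path, header name, and form field names are taken from the curl command above.

```python
import urllib.request
import uuid


def build_insights_request(base_url, api_key, filename, file_bytes, email):
    """Build (but do not send) a multipart POST equivalent to the curl command above."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="email"\r\n\r\n'
        f"{email}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/sales/insights",
        data=head + file_bytes + tail,
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


request = build_insights_request(
    "http://localhost:8000", "your-api-key",
    "sales_q1_2026.csv", b"region,revenue\nNorth,100\n", "recipient@example.com",
)
# To send it against a running backend: urllib.request.urlopen(request, timeout=30)
```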

Response Shape

{
  "status": "success",
  "message": "Summary generated and emailed successfully.",
  "recipient": "recipient@example.com",
  "summary_preview": "## Executive Summary\n...",
  "metrics": {
    "total_revenue": 684000,
    "total_units_sold": 640,
    "total_records": 6,
    "date_range": "2026-01-05 to 2026-03-10",
    "top_region": "North",
    "top_product_category": "Electronics",
    "revenue_by_region": { "North": 466500, "South": 20250, "East": 88000, "West": 109250 },
    "revenue_by_category": { "Electronics": 639750, "Home Appliances": 44250 },
    "status_breakdown": { "Shipped": 3, "Delivered": 2, "Cancelled": 1 },
    "cancellation_count": 1,
    "cancellation_revenue": 24000,
    "best_transaction": { "category": "Electronics", "region": "North", "revenue": 262500, "units": 210 }
  }
}
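
To show how fields such as `total_revenue`, `revenue_by_region`, and `status_breakdown` could be derived, here is a dependency-free sketch. The project's parser uses pandas; this version uses the standard `csv` module instead, and the column names (`region`, `units`, `revenue`, `status`) and the three sample rows are assumptions made for illustration, not the contents of sample_data/sales_q1_2026.csv.

```python
import csv
import io
from collections import Counter, defaultdict


def compute_metrics(csv_text: str) -> dict:
    """Aggregate rows into a few of the metrics shown in the response above."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    revenue_by_region = defaultdict(float)
    for row in rows:
        revenue_by_region[row["region"]] += float(row["revenue"])
    status_breakdown = Counter(row["status"] for row in rows)
    return {
        "total_revenue": sum(revenue_by_region.values()),
        "total_units_sold": sum(int(row["units"]) for row in rows),
        "total_records": len(rows),
        "top_region": max(revenue_by_region, key=revenue_by_region.get) if rows else None,
        "revenue_by_region": dict(revenue_by_region),
        "status_breakdown": dict(status_breakdown),
        "cancellation_count": status_breakdown.get("Cancelled", 0),
    }


# Hypothetical input rows, for illustration only.
SAMPLE = """region,units,revenue,status
North,10,1000,Shipped
South,5,250,Cancelled
North,2,300,Delivered
"""
metrics = compute_metrics(SAMPLE)
```

Only this aggregated dictionary, never the row list, would then be passed on to the LLM step.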

Security Measures

| # | Protection | Implementation |
| --- | --- | --- |
| 1 | API Key Authentication | X-API-Key header required on protected endpoints |
| 2 | Rate Limiting | 10 req/min per IP via slowapi |
| 3 | CORS Restriction | Origins configurable via ALLOWED_ORIGINS env var |
| 4 | Security Headers | X-Content-Type-Options, X-Frame-Options, X-XSS-Protection, HSTS, Referrer-Policy |
| 5 | File Validation | Extension whitelist (.csv/.xlsx), size limit (5MB), row limit (10,000) |
| 6 | Input Sanitization | Server-side email regex validation, filename sanitization |
| 7 | Non-root Containers | Backend Docker image runs as appuser, not root |
| 8 | Data Privacy | Only computed metrics are sent to the LLM, never raw uploaded data |
| 9 | Request Tracing | UUID request ID on every request for observability |
| 10 | Safe Error Messages | Internal errors are sanitized before reaching the client |
| 11 | Timeout Protection | Configurable timeouts on LLM and email service calls |
| 12 | Resource Limits | Docker memory limits per container |
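
Items 1 and 6 can be illustrated with the standard library. This is a sketch, not the project's middleware: the function names `api_key_is_valid` and `sanitize_filename` and the exact sanitization rules are assumptions. The one detail taken from this README is the use of `hmac.compare_digest` for a timing-safe key comparison.

```python
import hmac
import re
from typing import Optional


def api_key_is_valid(provided: Optional[str], expected: str) -> bool:
    """Compare the X-API-Key header value against the configured key in constant time."""
    if not provided:
        return False
    # Encode first: compare_digest only accepts ASCII str or bytes-like objects.
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def sanitize_filename(name: str) -> str:
    """Drop directory components, null bytes, and unusual characters from an upload name."""
    name = name.replace("\x00", "")
    name = name.replace("\\", "/").rsplit("/", 1)[-1]   # keep the last path segment only
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)        # replace anything unexpected
    name = name.lstrip(".")                             # no hidden files, no leading ".."
    return name or "upload"
```

A plain `==` comparison can return early on the first mismatching character, which leaks timing information about the key; `compare_digest` avoids that.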

CI/CD Pipeline

The GitHub Actions workflow (.github/workflows/ci.yml) runs on every PR and push to main:

| Job | Description |
| --- | --- |
| Backend Lint | flake8 + black --check |
| Backend Tests | pytest with full test suite |
| Frontend Lint & Build | eslint + tsc + vite build |
| Docker Validation | Builds both Docker images + validates docker-compose.yml |
| Security Scan | pip-audit on backend dependencies |

Deployment

Backend → Render

  1. Connect your GitHub repo to Render
  2. Use the included render.yaml for one-click setup, or:
    • Create a new Web Service
    • Set Docker build context to backend/
    • Add environment variables: API_KEY, GEMINI_API_KEY, RESEND_API_KEY, ALLOWED_ORIGINS, EMAIL_FROM
    • Health check path: /health

Frontend → Vercel

  1. Import the repo on Vercel
  2. Set Root Directory to frontend
  3. Framework Preset: Vite
  4. Add environment variables:
    • VITE_API_URL = your Render backend URL
    • VITE_API_KEY = your API key
  5. Deploy

Architecture Decisions & Tradeoffs

| Decision | Why |
| --- | --- |
| Modular monorepo over microservices | Within a 3-hour sprint, a clean modular backend is more reliable and easier to debug than inter-service HTTP calls. It demonstrates the same separation of concerns without the operational overhead. |
| Structured metrics → LLM vs raw file | Sending computed metrics instead of raw CSV to Gemini is more secure (no data leakage), produces more consistent outputs, uses fewer tokens (cost-efficient), and allows deterministic prompt engineering. |
| Resend for email | Free tier, simplest Python SDK, excellent DX. Note: the free tier only delivers to the verified sender email; upgrade for production use. |
| Gemini 2.0 Flash | Free tier available, fast response times, good summary quality. |
| slowapi for rate limiting | Lightweight, integrates directly with FastAPI, no external dependencies like Redis needed. |
| API Key (not full auth) | Proportionate to the challenge scope. Full JWT/OAuth is overkill for a prototype with no user management. |
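
The "structured metrics → LLM" decision can be sketched as a prompt builder that accepts only an allow-list of metric keys. This is an illustration, not the project's prompt: `build_prompt`, `PROMPT_METRIC_KEYS`, and the instruction wording are assumptions; the key names mirror the response shape documented above.

```python
import json

# Only these computed fields may ever reach the prompt (names from the response shape).
PROMPT_METRIC_KEYS = (
    "total_revenue", "total_units_sold", "total_records", "date_range",
    "top_region", "top_product_category", "revenue_by_region",
    "revenue_by_category", "status_breakdown", "cancellation_count",
    "cancellation_revenue", "best_transaction",
)


def build_prompt(metrics: dict) -> str:
    """Serialize allow-listed metrics into a prompt; any other key is silently dropped."""
    allowed = {key: metrics[key] for key in PROMPT_METRIC_KEYS if key in metrics}
    return (
        "Write a concise executive summary of the following sales metrics "
        "for a business audience. Use only the numbers provided.\n\n"
        + json.dumps(allowed, indent=2, sort_keys=True)
    )
```

Sorting the keys keeps the prompt byte-for-byte stable for identical metrics, which is one way to get the deterministic prompting the table mentions.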

Future Improvements

  • Message queue (RabbitMQ/Redis) for async processing of large files
  • JWT authentication for multi-user support
  • S3 file storage for audit trail and compliance
  • Prometheus + Grafana for monitoring and alerting
  • Kubernetes manifests for production orchestration
  • Caching layer for repeated analyses of the same file
  • Webhook notifications in addition to email
  • Multiple LLM providers with fallback support

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| API_KEY | Yes | change-me... | API key for endpoint authentication |
| GEMINI_API_KEY | Yes | - | Google Gemini API key |
| RESEND_API_KEY | No | - | Resend email API key (email skipped if absent) |
| EMAIL_FROM | No | onboarding@resend.dev | Sender email address |
| ALLOWED_ORIGINS | No | http://localhost:3000,... | Comma-separated CORS origins |
| MAX_FILE_SIZE_MB | No | 5 | Maximum upload file size |
| MAX_ROW_COUNT | No | 10000 | Maximum rows in uploaded file |
| REQUEST_TIMEOUT_SECONDS | No | 30 | Timeout for LLM/email calls |
| RATE_LIMIT | No | 10/minute | Rate limit for insights endpoint |
| VITE_API_URL | Yes (FE) | http://localhost:8000 | Backend URL for frontend |
| VITE_API_KEY | Yes (FE) | - | API key used by frontend |
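
One way the backend variables could be loaded is sketched below. It is not the project's settings module (which lives in backend/app/core): the `Settings` dataclass and `load_settings` are hypothetical names, and two behaviors are assumptions. First, this sketch fails fast when API_KEY is missing rather than falling back to the placeholder default. Second, the ALLOWED_ORIGINS default is truncated in the table, so only the first listed origin is used here.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str
    gemini_api_key: str
    resend_api_key: Optional[str]
    email_from: str
    allowed_origins: List[str]
    max_file_size_mb: int
    max_row_count: int
    request_timeout_seconds: int
    rate_limit: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing required keys."""
    missing = [name for name in ("API_KEY", "GEMINI_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("missing required environment variables: " + ", ".join(missing))
    origins = env.get("ALLOWED_ORIGINS", "http://localhost:3000")
    return Settings(
        api_key=env["API_KEY"],
        gemini_api_key=env["GEMINI_API_KEY"],
        resend_api_key=env.get("RESEND_API_KEY") or None,   # email is skipped if absent
        email_from=env.get("EMAIL_FROM", "onboarding@resend.dev"),
        allowed_origins=[o.strip() for o in origins.split(",") if o.strip()],
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "5")),
        max_row_count=int(env.get("MAX_ROW_COUNT", "10000")),
        request_timeout_seconds=int(env.get("REQUEST_TIMEOUT_SECONDS", "30")),
        rate_limit=env.get("RATE_LIMIT", "10/minute"),
    )
```

Passing the environment as a parameter keeps the loader easy to test without mutating `os.environ`.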

Project Structure

devops-project/
├── backend/
│   ├── app/
│   │   ├── api/
│   │   │   ├── __init__.py
│   │   │   └── routes.py          # API endpoint definitions
│   │   ├── core/
│   │   │   └── __init__.py        # Settings & configuration
│   │   ├── middleware/
│   │   │   └── __init__.py        # Security, rate limit, request ID
│   │   ├── schemas/
│   │   │   └── __init__.py        # Pydantic models
│   │   ├── services/
│   │   │   ├── ai_engine.py       # Gemini LLM integration
│   │   │   ├── email_service.py   # Resend email delivery
│   │   │   └── parser.py          # CSV/XLSX parsing & metrics
│   │   ├── __init__.py
│   │   └── main.py                # FastAPI app entry point
│   ├── tests/
│   │   ├── conftest.py            # Shared fixtures
│   │   ├── test_health.py
│   │   ├── test_parser.py
│   │   ├── test_security.py
│   │   └── test_upload.py
│   ├── Dockerfile
│   ├── .dockerignore
│   ├── requirements.txt
│   └── requirements-dev.txt
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── StatusMessage.tsx
│   │   │   ├── SummaryPreview.tsx
│   │   │   └── UploadForm.tsx
│   │   ├── services/
│   │   │   └── api.ts
│   │   ├── App.css
│   │   ├── App.tsx
│   │   ├── main.tsx
│   │   ├── types.ts
│   │   └── vite-env.d.ts
│   ├── Dockerfile
│   ├── .dockerignore
│   ├── nginx.conf
│   ├── index.html
│   ├── package.json
│   ├── tsconfig.json
│   ├── vite.config.ts
│   └── eslint.config.js
├── sample_data/
│   └── sales_q1_2026.csv
├── scripts/
│   └── healthcheck.sh
├── .github/workflows/
│   └── ci.yml
├── docker-compose.yml
├── render.yaml
├── .env.example
├── .flake8
├── .gitignore
└── README.md

Troubleshooting

| Issue | Solution |
| --- | --- |
| CORS errors in browser | Ensure ALLOWED_ORIGINS includes your frontend URL |
| 401 on upload | Check that VITE_API_KEY matches the backend API_KEY |
| Email not received | Resend free tier only sends to verified emails. Check Resend dashboard. |
| Gemini 503 error | Verify GEMINI_API_KEY is valid and has quota remaining |
| Docker build fails | Ensure Docker Desktop is running; try docker-compose build --no-cache |
| Rate limited | Default is 10 req/min. Wait or adjust RATE_LIMIT env var |

OWASP Top 10:2025 Security Posture

This project has been hardened against the OWASP Top 10:2025 categories. The table lists the mitigation for each category; A04 is only partially addressed.

| # | Category | Status | Implementation |
| --- | --- | --- | --- |
| A01 | Broken Access Control | ✅ Mitigated | API key auth (timing-safe hmac.compare_digest), filename sanitization (path traversal, null bytes, .. removal) |
| A02 | Cryptographic Failures / Security Misconfiguration | ✅ Mitigated | HSTS with preload, secrets via env vars, default-key CRITICAL log, DOCS_ENABLED=false for prod, allow_headers restricted |
| A03 | Injection | ✅ Mitigated | Formula injection detection in CSV (=,+,-,@,TAB,CR prefixes), Pydantic input validation, parameterized queries N/A (no DB) |
| A04 | Insecure Design | ⚠️ Partial | Rate limiting, file size/row caps, timeout enforcement. Recommend: add request-level cost budgets |
| A05 | Security Misconfiguration | ✅ Mitigated | CSP, Permissions-Policy, COOP, CORP on both backend + nginx, X-XSS-Protection: 0, read-only containers, cap_drop: ALL |
| A06 | Vulnerable & Outdated Components | ✅ Mitigated | Magic-byte file validation, pip-audit + npm audit in CI, pinned GitHub Actions by SHA |
| A07 | Identification & Authentication Failures | ✅ Mitigated | Timing-safe API key comparison, SECURITY_EVENT logging on auth failures |
| A08 | Software & Data Integrity Failures | ✅ Mitigated | GitHub Actions pinned by commit SHA, npm ci --ignore-scripts, no-new-privileges in Docker |
| A09 | Security Logging & Monitoring Failures | ✅ Mitigated | SECURITY_EVENT prefix on all security events (auth failures, rejected files, formula injection, path traversal, unhandled exceptions) |
| A10 | Server-Side Request Forgery (SSRF) / Mishandling Exceptional Conditions | ✅ Mitigated | Global exception handler hides stack traces, generic 500 response, logger.exception() server-side only |
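
The formula injection check listed under A03 might look roughly like this. It is a sketch under assumptions: `is_formula_like` and `find_formula_cells` are hypothetical names, and treating cells that parse as plain numbers (such as "-42.5") as safe is a design choice of this example, not something this README states. The prefix list is the one given in the table.

```python
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")   # prefixes named in the A03 row


def is_formula_like(cell: str) -> bool:
    """True if a spreadsheet application could interpret the cell as a formula."""
    if not cell.startswith(FORMULA_PREFIXES):
        return False
    try:
        float(cell)          # "-42.5" or "+3" is a plain number, not a formula
        return False
    except ValueError:
        return True


def find_formula_cells(rows):
    """Return (row_index, column, value) for every suspicious string cell."""
    return [
        (index, column, value)
        for index, row in enumerate(rows)
        for column, value in row.items()
        if isinstance(value, str) and is_formula_like(value)
    ]
```

Without the numeric exception, every negative revenue figure in a sales file would be rejected as an injection attempt.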

Remaining Risks & Recommendations

  • Secret rotation: Implement automated API key rotation; consider OAuth2/JWT for multi-user scenarios
  • WAF: Deploy behind a Web Application Firewall (Cloudflare, AWS WAF) for DDoS + advanced injection filtering
  • Dependency pinning: Generate requirements.txt with pip-compile --generate-hashes for reproducible builds
  • SBOM: Add Software Bill of Materials generation to CI pipeline
  • Penetration testing: Schedule periodic external pen tests

License

MIT
