Senior AI/ML Engineer | LLM & Agentic AI Specialist | MLOps | Deep Learning | Computer Vision | Fine-Tuning | CI/CD
Senior AI Researcher & Developer with 5+ years architecting production-grade AI systems across healthcare, EdTech, legal, finance, and sales intelligence domains. Currently leading 8 concurrent enterprise AI/ML engagements at Chetu Inc.
- Senior AI/ML Engineer with deep expertise in Agentic AI, LLM/SLM Fine-Tuning, RAG Pipelines, and MLOps
- Promoted twice in 18 months at Chetu Inc.: Team Member → Senior TM → Technical Team Lead
- Skilled in LangChain, LangGraph, Google ADK, A2A, MCP, Unsloth QLoRA, LoRA, vLLM, RAGAS, LangSmith, Ollama
- Experienced in On-Prem LLM Deployment (LLaMA 3.2 70B, 94% GPT-4o parity), Multi-Agent Orchestration, and Context Engineering
- Built production AI systems across healthcare, EdTech, legal, finance, recruitment, and equestrian domains
- Expert in MLOps: MLflow, Docker, DVC, AWS EKS, GitHub Actions, GitLab CI/CD
- Passionate about intelligent system design, enterprise AI integration, and measurable AI outcomes
- Always learning and building scalable AI pipelines with real-world impact
- Built a GPT-4o + LangGraph multi-agent system integrated with Automatiq B2B API for intelligent ticket routing.
- Designed an intent-routing state machine with a regex JSON sanitizer achieving 98% routing accuracy across 320+ test cases including adversarial and E2E journeys.
- Stack: LangGraph, GPT-4o, Automatiq API, Python
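
The regex JSON sanitizer mentioned above could look roughly like the sketch below. It is an illustration only, not the production code: the function names, the `intent` key, and the `human_handoff` fallback are all assumptions. It pulls the first JSON object out of a chatty LLM reply, drops trailing commas, and falls back to a safe route when parsing fails.

```python
import json
import re

# Build the fence marker programmatically so this example stays easy to embed.
_FENCE = "`" * 3
_FENCE_RE = re.compile(_FENCE + r"(?:json)?\s*(.*?)" + _FENCE, re.DOTALL | re.IGNORECASE)
_OBJECT_RE = re.compile(r"\{.*\}", re.DOTALL)
_TRAILING_COMMA_RE = re.compile(r",\s*([}\]])")


def sanitize_json(raw: str) -> dict:
    """Extract a JSON object from a raw LLM reply; return {} if none parses."""
    fenced = _FENCE_RE.search(raw)
    candidate = fenced.group(1) if fenced else raw
    match = _OBJECT_RE.search(candidate)
    if not match:
        return {}
    # Remove trailing commas before a closing brace or bracket.
    # Caveat: this naive pass would also touch ",}" inside string values.
    text = _TRAILING_COMMA_RE.sub(r"\1", match.group(0))
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


def route(raw_reply: str, known_intents: set, fallback: str = "human_handoff") -> str:
    """Return the parsed intent if it is known, otherwise the fallback route."""
    intent = sanitize_json(raw_reply).get("intent")
    if isinstance(intent, str) and intent in known_intents:
        return intent
    return fallback
```

Routing unknown or unparseable replies to a fallback keeps the state machine deterministic even when the model output is malformed.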
- Developed a real-time AI tutoring system with Whisper STT, OpenAI TTS-1-HD, and Weaviate Hybrid RAG (BM25 + Vector search).
- Implemented grade-aware LLM-as-Judge guardrails with WebRTC sub-300ms latency.
- Deployed on Amazon EKS with multi-stage Docker builds.
- Stack: Whisper, OpenAI TTS, Weaviate, WebRTC, AWS EKS, Docker
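
To illustrate how BM25 and vector scores can be blended in a hybrid search, here is a minimal sketch of weighted score fusion. It is not Weaviate's internal implementation; the function names, the min-max normalization, and the `alpha` default are assumptions chosen for clarity.

```python
def _min_max(scores: dict) -> dict:
    """Scale scores to [0, 1]; a constant score set maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (value - lo) / (hi - lo) for doc_id, value in scores.items()}


def hybrid_rank(bm25: dict, vector: dict, alpha: float = 0.5) -> list:
    """Rank doc ids by alpha * vector score + (1 - alpha) * BM25 score.

    alpha=0 is pure keyword search, alpha=1 is pure vector search.
    Documents missing from one result set contribute 0 for that side.
    """
    keyword_scores, vector_scores = _min_max(bm25), _min_max(vector)
    fused = {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
        + (1 - alpha) * keyword_scores.get(doc_id, 0.0)
        for doc_id in set(keyword_scores) | set(vector_scores)
    }
    # Sort by descending fused score, breaking ties by doc id for stability.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))
```

For example, `hybrid_rank({"a": 10, "b": 5}, {"b": 0.9, "c": 0.4})` ranks `b` first because it scores well on both sides.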
- Architected a 3-node LangGraph pipeline: Supervisor → DB Extraction → Self-RAG Vault with dual-layer guardrails (keyword + LLM).
- Achieved 98% routing accuracy and 100% guardrail block rate across 500+ validation tests.
- Stack: LangGraph, Self-RAG, Guardrails AI, Python
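
A dual-layer guardrail of the kind described above can be sketched as a cheap keyword check followed by an LLM judgment. The keyword list, the return labels, and the `llm_judge` callable are hypothetical placeholders; in a real system the judge would wrap a model call.

```python
from typing import Callable

# Illustrative blocklist only; a real deployment would maintain its own.
BLOCKED_KEYWORDS = {"password", "ssn", "drop table"}


def keyword_layer(query: str) -> bool:
    """Return True when the query contains a blocked keyword."""
    lowered = query.lower()
    return any(keyword in lowered for keyword in BLOCKED_KEYWORDS)


def guarded(query: str, llm_judge: Callable[[str], bool]) -> str:
    """Run the keyword layer first, then the LLM judge.

    llm_judge is assumed to return True when the query should be blocked.
    The judge is only called if the keyword layer lets the query through,
    which avoids a model call for obvious violations.
    """
    if keyword_layer(query):
        return "blocked:keyword"
    if llm_judge(query):
        return "blocked:llm"
    return "allowed"
```

Ordering the layers from cheapest to most expensive keeps latency low for the common cases.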
- Built a 7-node LangGraph StateGraph with Qdrant hierarchical memory (7 streams, MD5 dedup) and a 4-layer retrieval cascade.
- Integrated an LLM quality judge (0.75 threshold), improving retry accuracy by 30%.
- Stack: LangGraph, Qdrant, GPT-4o, Python
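
The MD5 dedup and the 0.75 judge threshold can be illustrated with the sketch below. It uses an in-memory set rather than Qdrant, and the class name, the text normalization, and the `generate`/`judge` callables are assumptions made for the example.

```python
import hashlib
from typing import Callable, List, Set, Tuple


class DedupMemory:
    """In-memory stand-in for a memory stream that skips duplicate entries."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.items: List[str] = []

    @staticmethod
    def _fingerprint(text: str) -> str:
        # MD5 serves as a content fingerprint here, not as a security measure.
        normalized = " ".join(text.lower().split())
        return hashlib.md5(normalized.encode("utf-8")).hexdigest()

    def add(self, text: str) -> bool:
        """Store text unless an equivalent entry exists; return True if stored."""
        key = self._fingerprint(text)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.items.append(text)
        return True


def answer_with_retry(
    generate: Callable[[int], str],
    judge: Callable[[str], float],
    threshold: float = 0.75,
    max_attempts: int = 3,
) -> Tuple[str, float]:
    """Regenerate until the judge score reaches the threshold.

    Returns the first passing answer, or the best-scoring one if none pass.
    """
    best_answer, best_score = "", float("-inf")
    for attempt in range(max_attempts):
        answer = generate(attempt)
        score = judge(answer)
        if score >= threshold:
            return answer, score
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer, best_score
```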
- Developed an Advanced Self-RAG chatbot using LangChain, LangGraph, MLflow, and Django.
- Supports multi-modal query handling (text, images, tables, flowcharts) extracted from PDFs.
- Achieved 87% answer accuracy on 100 human-evaluated Q&A pairs; BGE-M3 embeddings (nDCG 0.91); RBAC with 100% enforcement.
- Integrated Web Scraping with Selenium for real-time referenced data.
- Stack: LangChain, LangGraph, BGE-M3, ChromaDB, Django, MLflow
- Eliminated vector DB overhead entirely using JSON/in-memory cache (PageIndex approach).
- Built a dual-LLM pipeline (GPT-4o Vision + GPT-4o Text) with Gmail/IMAP integration for automated invoice extraction.
- Stack: Flask, SQLite, LangChain, GPT-4o Vision, PageIndex
- Designed a LangGraph multi-agent PPTX generator: Master Architect → Quantitative Analyst (Matplotlib charts) → Image Node (DALL-E 3) → Layout Reconciliation.
- Implemented AABB collision detection and dynamic font-scaling for polished slide layouts.
- Stack: LangGraph, GPT-4o, DALL-E 3, python-pptx
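
AABB (axis-aligned bounding box) collision detection and font scaling for slide layout can be sketched as follows. The `Box` type, the point-based units, and the `char_width_ratio` heuristic are illustrative assumptions, not the generator's actual layout code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in slide coordinates (units are arbitrary)."""
    left: float
    top: float
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.left + self.width

    @property
    def bottom(self) -> float:
        return self.top + self.height


def overlaps(a: Box, b: Box) -> bool:
    """True when the two boxes share interior area; touching edges do not count."""
    return (
        a.left < b.right
        and b.left < a.right
        and a.top < b.bottom
        and b.top < a.bottom
    )


def fit_font_size(
    text: str,
    box_width: float,
    start_pt: float = 32.0,
    min_pt: float = 12.0,
    char_width_ratio: float = 0.5,
) -> float:
    """Shrink the font one point at a time until a single line fits the box.

    Text width is estimated as len(text) * size * char_width_ratio, a rough
    heuristic; real rendering would measure glyphs with the actual font.
    """
    size = start_pt
    while size > min_pt and len(text) * size * char_width_ratio > box_width:
        size -= 1.0
    return max(size, min_pt)
```

A layout pass can call `overlaps` on every pair of placed shapes and move or shrink one of them whenever it returns True.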
- Built a SequentialAgent pipeline: FintechAnalyst (Gemini 2.5 Flash) → LogisticsExpert (LLaMA 3.3/Groq) using Agent-to-Agent (A2A) protocol.
- Developed a Kafka KRaft FinBERT trading pipeline with BUY/SELL rules and a WebSocket real-time dashboard.
- Stack: Google ADK, A2A, Gemini 2.5 Flash, LLaMA 3.3, Groq, Kafka, FinBERT, WebSocket
- Deployed LLaMA 3.2 70B with 4-bit NF4 (bitsandbytes + Flash Attention 2) on 80GB GPU; achieved 94% GPT-4o parity while eliminating API costs.
- Built vLLM/Ollama OpenAI-compatible wrapper with GitHub Actions CI/CD for hot-swappable model iterations.
- Fine-tuned SLM (Llama 3.2 3B) via Unsloth QLoRA: 60% GPU memory reduction, 2× training speed; improved nDCG 73% → 82%, MRR 68 → 71.
- Stack: LLaMA 3.2 70B, vLLM, Ollama, Unsloth, QLoRA, bitsandbytes, Flash Attention 2, GitHub Actions
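
For readers unfamiliar with the retrieval metrics quoted above, the sketch below computes MRR and nDCG@k using their standard textbook definitions. How the reported figures were actually measured is not stated here, so the function names and input shapes are assumptions.

```python
import math
from typing import Dict, Iterable, List, Set, Tuple


def reciprocal_rank(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Return 1 / rank of the first relevant result, or 0.0 if none appears."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[List[str], Set[str]]]) -> float:
    """Average the reciprocal rank over (ranking, relevant set) pairs."""
    values = [reciprocal_rank(ranked, relevant) for ranked, relevant in runs]
    return sum(values) / len(values) if values else 0.0


def ndcg_at_k(ranked_ids: List[str], gains: Dict[str, float], k: int) -> float:
    """Normalized discounted cumulative gain over the top k results.

    gains maps each doc id to a graded relevance; missing ids count as 0.
    """
    dcg = sum(
        gains.get(doc_id, 0.0) / math.log2(position + 1)
        for position, doc_id in enumerate(ranked_ids[:k], start=1)
    )
    ideal_gains = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(
        gain / math.log2(position + 1)
        for position, gain in enumerate(ideal_gains, start=1)
    )
    return dcg / idcg if idcg > 0 else 0.0
```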
- Built a Fusion RAG system (Pinecone + FAISS/SerpAPI) with MMR and cross-encoder reranking (ms-marco-MiniLM).
- Reduced report generation time by 60% (270s → 100s) via asyncio.gather; successfully stress-tested for 50 concurrent users.
- Integrated Stripe Connect for multi-tenant billing.
- Stack: Pinecone, FAISS, MiniLM, Django, Celery, Stripe Connect
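
The speed-up from asyncio.gather comes from running independent I/O-bound steps concurrently instead of one after another. The sketch below shows the pattern with simulated delays; the section names and the `build_section` coroutine are hypothetical stand-ins for real retrieval or LLM calls.

```python
import asyncio
from typing import List


async def build_section(name: str, delay: float) -> str:
    """Simulate one I/O-bound step, such as a retrieval or LLM request."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def build_report(sections: List[str], delay: float = 0.05) -> List[str]:
    """Run all section builders concurrently.

    asyncio.gather returns results in the order the awaitables were passed,
    regardless of which one finishes first, so the report stays ordered.
    """
    tasks = [build_section(name, delay) for name in sections]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(build_report(["summary", "risks", "outlook"])))
```

With three sections and the same delay each, total wall time is close to one delay rather than three, as long as the steps do not depend on each other.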
- Built Text-to-Image generation models with Stable Diffusion, Pix2Pix, and Transformer-based architectures.
- Spatio-temporal crowd forecasting using YOLOv8 + CNN-LSTM, Gaussian density maps (ShanghaiTech dataset).
- AI-driven Image Captioning using BLIP2 and vision-language models; real-time object detection with Detectron2.
- U-Net + DLNN brain tumor segmentation/classification web app; ResNet-V3 plant disease detection REST API.
- Stack: YOLO, Detectron2, Stable Diffusion, BLIP2, U-Net, ResNet, PyTorch, OpenCV
- AI-powered meeting analytics system using YOLO (face detection), WhisperX (transcription, WER <8%), T5 (summarization, 92% grammar correction), Llama 3.3 (context analysis).
- Implemented Emotion and Sentiment Detection; automated Report Generation via Google Cloud API.
- Achieved 99% task reliability across 100 pipeline runs using Django Celery + RabbitMQ.
- Stack: YOLO, WhisperX, T5, Llama 3.3, Django, Celery, RabbitMQ, Streamlit
- Developed comprehensive web scraping frameworks using Selenium, BeautifulSoup, and Scrapy.
- Created intelligent data collection pipelines with cleaning and transformation techniques.
- Implemented ethical scraping practices (robots.txt compliance).
- BERT Autoencoder for query-based data extraction (ontology + PCA + WaOA clustering).
- NL-to-SQL pipeline using Seq2SQL + PSO for natural language database querying.
- Neo4j intent classification for graph-based reasoning; improved DL model accuracy 80% β 90%.
- Stack: BERT, RoBERTa, T5, Seq2SQL, Neo4j, PyTorch, Scikit-learn
| Role | Company | Period |
|---|---|---|
| Technical Team Lead, AI/ML | Chetu Inc. | Sep 2025 – Present |
| Sr. Software Developer, AI/ML | Chetu Inc. | Mar 2025 – Sep 2025 |
| Software Developer, AI/ML | Chetu Inc. | Mar 2024 – Mar 2025 |
| ML/AI Engineer & Data Scientist | Research Developers | Sep 2021 – Mar 2024 |
"Transforming Industries with Intelligent AI Solutions"
