Dhruv Sharma dhruvsh1997

I'm Dhruv Sharma. Welcome to my profile 👋

📍 Senior AI Researcher & Developer — 5+ years architecting production-grade AI systems across healthcare, EdTech, legal, finance, and sales intelligence domains. Currently leading 8 concurrent enterprise AI/ML engagements at Chetu Inc.

📊 Dashboard Overview

🔹 About Me

🎓 Senior AI/ML Engineer with deep expertise in Agentic AI, LLM/SLM Fine-Tuning, RAG Pipelines, and MLOps
🏆 Promoted twice in 18 months at Chetu Inc. — Team Member → Senior TM → Technical Team Lead
💻 Skilled in LangChain, LangGraph, Google ADK, A2A, MCP, Unsloth QLoRA, LoRA, vLLM, RAGAS, LangSmith, Ollama
🤖 Experienced in On-Prem LLM Deployment (LLaMA 3.2 70B, 94% GPT-4o parity), Multi-Agent Orchestration, and Context Engineering
🚀 Built production AI systems across healthcare, EdTech, legal, finance, recruitment, and equestrian domains
📈 Expert in MLOps — MLflow, Docker, DVC, AWS EKS, GitHub Actions, GitLab CI/CD
🛠 Passionate about intelligent system design, enterprise AI integration, and measurable AI outcomes
📚 Always learning & building scalable AI pipelines with real-world impact

🏆 Key AI Projects & Contributions

1️⃣ LangGraph Multi-Agent Ticket Concierge (Enterprise — Chetu Inc.)

Built a GPT-4o + LangGraph multi-agent system integrated with the Automatiq B2B API for intelligent ticket routing.
Designed an intent-routing state machine with a regex JSON sanitizer, achieving 98% routing accuracy across 320+ test cases, including adversarial and end-to-end journeys.
Stack: LangGraph, GPT-4o, Automatiq API, Python
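
A minimal sketch of what a regex JSON sanitizer for intent routing could look like. The intent labels, the `fallback` route, and the function name are hypothetical and not taken from the production system; the idea is only to pull a JSON object out of noisy model output and refuse anything that does not validate.

```python
import json
import re

# Hypothetical intent labels; the real routes are not listed in this README.
ALLOWED_INTENTS = {"search_tickets", "order_status", "refund", "fallback"}

# Greedy match from the first "{" to the last "}", so nested objects survive.
JSON_OBJECT = re.compile(r"\{.*\}", re.DOTALL)

FALLBACK = {"intent": "fallback"}


def sanitize_route(raw: str) -> dict:
    """Extract a JSON object from noisy LLM output and validate its intent."""
    match = JSON_OBJECT.search(raw)
    if match is None:
        return dict(FALLBACK)
    try:
        payload = json.loads(match.group(0))
    except json.JSONDecodeError:
        return dict(FALLBACK)
    if not isinstance(payload, dict):
        return dict(FALLBACK)
    intent = payload.get("intent")
    if not isinstance(intent, str) or intent not in ALLOWED_INTENTS:
        return dict(FALLBACK)
    return payload
```

Routing every malformed or unknown payload to a single fallback node keeps the state machine total: each model response maps to exactly one edge.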

2️⃣ AI Tutor with Animated Avatars (Enterprise — EdTech)

Developed a real-time AI tutoring system with Whisper STT, OpenAI TTS-1-HD, and Weaviate Hybrid RAG (BM25 + Vector search).
Implemented grade-aware LLM-as-Judge guardrails and WebRTC streaming with sub-300 ms latency.
Deployed on Amazon EKS with multi-stage Docker builds.
Stack: Whisper, OpenAI TTS, Weaviate, WebRTC, AWS EKS, Docker
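
To illustrate the hybrid (BM25 + vector) idea, here is a small sketch of weighted score fusion. It is not Weaviate's internal implementation; the min-max normalization, the `alpha` weight, and the document IDs are assumptions chosen for the example.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so keyword and vector scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    bm25: dict[str, float], vector: dict[str, float], alpha: float = 0.5
) -> list[tuple[str, float]]:
    """Fuse two result sets; alpha=1 is pure vector, alpha=0 is pure BM25."""
    bm25_n, vec_n = min_max(bm25), min_max(vector)
    fused = {
        doc: alpha * vec_n.get(doc, 0.0) + (1 - alpha) * bm25_n.get(doc, 0.0)
        for doc in set(bm25_n) | set(vec_n)
    }
    # Sort by score descending, then by document ID for a stable order.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

A document missing from one result set simply contributes zero from that side, so items found by both retrievers tend to rise.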

3️⃣ IRS Section 125 Benefits Compliance Agent (Enterprise — Legal/Finance)

Architected a 3-node LangGraph pipeline: Supervisor → DB Extraction → Self-RAG Vault with dual-layer guardrails (keyword + LLM).
Achieved 98% routing accuracy and 100% guardrail block rate across 500+ validation tests.
Stack: LangGraph, Self-RAG, Guardrails AI, Python
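
The dual-layer guardrail (keyword + LLM) can be sketched roughly as below. The blocked phrases and the `llm_check` callable are placeholders; in a real pipeline the second layer would call a model, which is stubbed out here.

```python
from typing import Callable

# Hypothetical blocklist; the production keywords are not published here.
BLOCKED_KEYWORDS = ("ignore previous instructions", "system prompt")


def guard(query: str, llm_check: Callable[[str], bool]) -> tuple[bool, str]:
    """Return (allowed, reason). The cheap keyword layer runs first."""
    lowered = query.lower()
    for keyword in BLOCKED_KEYWORDS:
        if keyword in lowered:
            return False, f"keyword:{keyword}"
    # Only queries that pass the keyword layer reach the (costlier) LLM layer.
    if not llm_check(query):
        return False, "llm"
    return True, "ok"
```

Running the keyword layer first means obviously hostile input never incurs a model call.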

4️⃣ Equestrian AI Companion v2 (Enterprise)

Built a 7-node LangGraph StateGraph with Qdrant hierarchical memory (7 streams, MD5 dedup) and a 4-layer retrieval cascade.
Integrated an LLM quality judge (0.75 threshold), improving retry accuracy by 30%.
Stack: LangGraph, Qdrant, GPT-4o, Python
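
A rough sketch of the two mechanisms mentioned above: MD5-based deduplication of memory entries and a judge-gated retry loop using the 0.75 threshold. The class and function names, the whitespace/case normalization, and the `generate` and `judge` callables are assumptions for illustration; Qdrant and the actual judge model are left out.

```python
import hashlib
from typing import Callable


class MemoryStream:
    """In-memory stand-in for one memory stream with MD5 deduplication."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.items: list[str] = []

    def add(self, text: str) -> bool:
        """Store text unless a normalized duplicate exists. Returns True if stored."""
        normalized = " ".join(text.lower().split())
        digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
        if digest in self._seen:
            return False
        self._seen.add(digest)
        self.items.append(text)
        return True


def answer_with_retry(
    generate: Callable[[int], str],
    judge: Callable[[str], float],
    threshold: float = 0.75,
    max_attempts: int = 3,
) -> tuple[str, float]:
    """Regenerate until the judge score meets the threshold; keep the best draft."""
    best_text, best_score = "", float("-inf")
    for attempt in range(max_attempts):
        text = generate(attempt)
        score = judge(text)
        if score > best_score:
            best_text, best_score = text, score
        if score >= threshold:
            break
    return best_text, best_score
```

MD5 is used here only as a fast content fingerprint, not for security.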

5️⃣ Agentic AI-Based Self-RAG Chatbot with Role-Based Authentication

Developed an advanced Self-RAG chatbot using LangChain, LangGraph, MLflow, and Django.
Supports multi-modal query handling (text, images, tables, flowcharts) extracted from PDFs.
Achieved 87% answer accuracy on 100 human-evaluated Q&A pairs; BGE-M3 embeddings (nDCG 0.91); RBAC with 100% enforcement.
Integrated Selenium-based web scraping to pull in real-time reference data.
Stack: LangChain, LangGraph, BGE-M3, ChromaDB, Django, MLflow

6️⃣ InvoiceIQ — PageIndex Vectorless RAG (PoC)

Eliminated vector DB overhead entirely using JSON/in-memory cache (PageIndex approach).
Built a dual-LLM pipeline (GPT-4o Vision + GPT-4o Text) with Gmail/IMAP integration for automated invoice extraction.
Stack: Flask, SQLite, LangChain, GPT-4o Vision, PageIndex

7️⃣ Agentic PPT Architect (PoC)

Designed a LangGraph multi-agent PPTX generator: Master Architect → Quantitative Analyst (Matplotlib charts) → Image Node (DALL-E 3) → Layout Reconciliation.
Implemented AABB collision detection and dynamic font-scaling for polished slide layouts.
Stack: LangGraph, GPT-4o, DALL-E 3, python-pptx
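
For readers unfamiliar with the layout step, here is a minimal sketch of AABB (axis-aligned bounding box) overlap testing and a simple font-scaling loop. The `Box` type, the character-width and line-height ratios, and the point-size limits are illustrative assumptions, not values from the actual generator or from python-pptx.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    left: float
    top: float
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.left + self.width

    @property
    def bottom(self) -> float:
        return self.top + self.height


def overlaps(a: Box, b: Box) -> bool:
    """AABB test: boxes collide only if they overlap on both axes."""
    return a.left < b.right and b.left < a.right and a.top < b.bottom and b.top < a.bottom


def fit_font_size(
    text: str,
    box_width: float,
    box_height: float,
    max_pt: int = 32,
    min_pt: int = 10,
    char_width_ratio: float = 0.5,   # assumed average glyph width / font size
    line_height_ratio: float = 1.2,  # assumed line height / font size
) -> int:
    """Return the largest point size at which the wrapped text fits the box."""
    for pt in range(max_pt, min_pt - 1, -1):
        chars_per_line = max(1, int(box_width // (pt * char_width_ratio)))
        lines = -(-len(text) // chars_per_line)  # ceiling division
        if lines * pt * line_height_ratio <= box_height:
            return pt
    return min_pt  # nothing fits; fall back to the minimum size
```

Boxes that merely share an edge are treated as non-overlapping, which lets adjacent shapes sit flush against each other.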

8️⃣ Google ADK A2A Multi-Agent Pipeline (PoC)

Built a SequentialAgent pipeline: FintechAnalyst (Gemini 2.5 Flash) → LogisticsExpert (LLaMA 3.3/Groq) using Agent-to-Agent (A2A) protocol.
Developed a Kafka KRaft FinBERT trading pipeline with BUY/SELL rules and a WebSocket real-time dashboard.
Stack: Google ADK, A2A, Gemini 2.5 Flash, LLaMA 3.3, Groq, Kafka, FinBERT, WebSocket

9️⃣ On-Prem LLM Infrastructure — LLaMA 3.2 70B (Enterprise — Chetu Inc.)

Deployed LLaMA 3.2 70B with 4-bit NF4 quantization (bitsandbytes + Flash Attention 2) on an 80 GB GPU; achieved 94% GPT-4o parity while eliminating API costs.
Built an OpenAI-compatible wrapper over vLLM/Ollama with GitHub Actions CI/CD for hot-swappable model iterations.
Fine-tuned an SLM (Llama 3.2 3B) via Unsloth QLoRA: 60% GPU memory reduction, 2× training speed; improved nDCG from 73% to 82% and MRR from 68% to 71%.
Stack: LLaMA 3.2 70B, vLLM, Ollama, Unsloth, QLoRA, bitsandbytes, Flash Attention 2, GitHub Actions
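
The retrieval metrics quoted above (nDCG and MRR) can be computed along these lines. This is a generic sketch using the linear-gain form of DCG; the exact variant and cutoff used in the original evaluation are not stated, so treat those as assumptions.

```python
import math


def dcg(relevances: list[float]) -> float:
    """Discounted cumulative gain with linear gain and log2 position discount."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances: list[float]) -> float:
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def mrr(ranked_hits: list[list[int]]) -> float:
    """Mean reciprocal rank; each inner list marks relevant results with 1."""
    if not ranked_hits:
        return 0.0
    total = 0.0
    for hits in ranked_hits:
        for rank, hit in enumerate(hits, start=1):
            if hit:
                total += 1.0 / rank
                break  # only the first relevant result counts
    return total / len(ranked_hits)
```

Both functions return values in [0, 1]; multiplying by 100 gives percentages like those reported above.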

🔟 Multi-Tenant Veterinary SaaS (Enterprise)

Built a Fusion RAG system (Pinecone + FAISS/SerpAPI) with MMR and cross-encoder reranking (ms-marco-MiniLM).
Reduced report generation time by 60% (270s → 100s) via asyncio.gather; successfully stress-tested for 50 concurrent users.
Integrated Stripe Connect for multi-tenant billing.
Stack: Pinecone, FAISS, MiniLM, Django, Celery, Stripe Connect
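
The speedup from asyncio.gather comes from running independent I/O-bound steps concurrently rather than one after another. A small sketch, with `asyncio.sleep` standing in for the real LLM and retrieval calls and with hypothetical section names:

```python
import asyncio
import time


async def build_section(name: str, delay: float) -> str:
    """Stand-in for one I/O-bound step such as an LLM or retrieval call."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def build_report(sections: dict[str, float]) -> list[str]:
    # gather() runs all coroutines concurrently and preserves input order.
    results = await asyncio.gather(
        *(build_section(name, delay) for name, delay in sections.items())
    )
    return list(results)


def timed_report(sections: dict[str, float]) -> tuple[list[str], float]:
    """Run the report and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(build_report(sections))
    return results, time.perf_counter() - start
```

With three sections of equal duration, wall-clock time is close to one section's duration rather than the sum of all three.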

1️⃣1️⃣ Computer Vision & Generative AI Models

Built Text-to-Image generation models with Stable Diffusion, Pix2Pix, and Transformer-based architectures.
Built spatio-temporal crowd forecasting using YOLOv8 + CNN-LSTM and Gaussian density maps (ShanghaiTech dataset).
Developed AI-driven image captioning using BLIP2 and vision-language models, and real-time object detection with Detectron2.
Built a U-Net + DLNN brain tumor segmentation/classification web app and a ResNet-V3 plant disease detection REST API.
Stack: YOLO, Detectron2, Stable Diffusion, BLIP2, U-Net, ResNet, PyTorch, OpenCV

1️⃣2️⃣ Meeting Video Processing & AI-Based Report Generation

Built an AI-powered meeting analytics system using YOLO (face detection), WhisperX (transcription, WER < 8%), T5 (summarization, 92% grammar correction), and Llama 3.3 (context analysis).
Implemented emotion and sentiment detection; automated report generation via Google Cloud API.
Achieved 99% task reliability across 100 pipeline runs using Django Celery + RabbitMQ.
Stack: YOLO, WhisperX, T5, Llama 3.3, Django, Celery, RabbitMQ, Streamlit
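
For context on the WER figure, word error rate is the word-level edit distance between a reference transcript and the system output, divided by the reference length. A minimal sketch, assuming simple whitespace tokenization and no text normalization:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # Rolling-row Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

A WER below 8% means fewer than 8 word-level errors per 100 reference words under whatever normalization the evaluation applied.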

1️⃣3️⃣ Web Scraping & Data Intelligence

Developed comprehensive web scraping frameworks using Selenium, BeautifulSoup, and Scrapy.
Created intelligent data collection pipelines with cleaning and transformation techniques.
Implemented ethical scraping practices (robots.txt compliance).
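
A robots.txt check can be done with Python's standard library alone. The sketch below parses rules from a string so it runs offline; the user agent name and URLs are placeholders, and a real scraper would fetch the site's robots.txt before crawling.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, url: str, user_agent: str = "example-bot") -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)
```

Calling `allowed(...)` before each request lets the crawler skip disallowed paths instead of fetching and discarding them.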

1️⃣4️⃣ NLP Research & Deep Learning Models

BERT Autoencoder for query-based data extraction (ontology + PCA + WaOA clustering).
NL-to-SQL pipeline using Seq2SQL + PSO for natural language database querying.
Neo4j intent classification for graph-based reasoning; improved DL model accuracy 80% → 90%.
Stack: BERT, RoBERTa, T5, Seq2SQL, Neo4j, PyTorch, Scikit-learn

🛠 Tech Stack & Tools

Category	Technologies
Languages
Agentic AI & Orchestration
LLMs & Generative AI
Fine-Tuning & Embeddings
RAG & Vector Stores
Computer Vision
ML & Deep Learning
Web Frameworks
Messaging & Streaming
Databases
Data Processing
MLOps & Deployment
Cloud

💼 Professional Experience

Role	Company	Period
Technical Team Lead — AI/ML	Chetu Inc.	Sep 2025 – Present
Sr. Software Developer — AI/ML	Chetu Inc.	Mar 2025 – Sep 2025
Software Developer — AI/ML	Chetu Inc.	Mar 2024 – Mar 2025
ML/AI Engineer & Data Scientist	Research Developers	Sep 2021 – Mar 2024

📊 GitHub Contribution Activity

📫 Connect with Me

🎯 "Transforming Industries with Intelligent AI Solutions" 🚀
