Dhruv Garg DhruvGarg111

Dhruv Garg

AI / ML Engineer • Computer Vision • Generative AI

Building practical AI systems, one focused iteration at a time.

Building intelligent systems that see, understand, and create.

🔬 Engineering Profile

I am a Machine Learning Engineer focused on Computer Vision and Agentic AI, with a strong foundation in scalable backend systems. My engineering philosophy revolves around translating complex research papers into optimized, production-ready code.

🎯 Focus: Bypassing computational bottlenecks in high-resolution (4K) object detection using Explainable AI (XAI).
🤖 AI Engineering: Building local LLM agents that interact with third-party ecosystems such as Google APIs.
⚙️ Infrastructure: Architecting robust database migrations and building backend profilers.
💡 Goal: Build systems that are not just intelligent, but fast, scalable, and resilient.

🚀 Featured Projects

🌟 Flagship Projects

⚡ PixelQueue

Vision Intelligence Infrastructure: A high-performance, async control panel for human-in-the-loop AI annotation.

A dark-themed control panel built around decoupled ML microservices and task queues, so the interface stays responsive while models run in the background.

Key Innovations:

🚀 Asynchronous ML: Non-blocking AI auto-labeling via PyTorch, LayerCAM & YOLO.
⚡ Low-Latency UI: Hardware-accelerated React-Konva staging canvas.
🔄 Decoupled Workers: Horizontally scalable Celery workers behind a message broker.
🔒 Isolated Workspaces: Role-Based Access Control (RBAC) enforced per workspace.

🔦 The Searchlight Protocol

"Finding the needle in the haystack, from 400ft above."

A novel coarse-to-fine computer vision pipeline designed for efficient small object detection in high-resolution (2K/4K) aerial imagery. Tackles the critical trade-off between resolution and latency in drone forensics.

Key Innovations:

Uses LayerCAM to identify semantic "hotspots" before processing.
Slices and zooms into regions of interest, skipping 80%+ of empty background.
Outperforms blind sliding-window approaches (SAHI) in both speed and accuracy.
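
The coarse-to-fine idea above can be sketched as a tile filter over a coarse saliency map. This is a simplified illustration under stated assumptions, not the project's implementation: the heatmap is a plain list of lists (a real pipeline would use a LayerCAM tensor), and the function name hot_tiles and the tile, threshold, and scale parameters are hypothetical.

```python
def hot_tiles(heatmap, tile, threshold, scale=1):
    """Return full-resolution boxes (x0, y0, x1, y1) for salient tiles.

    heatmap   -- 2D list of activations in [0, 1] (assumed coarse saliency map)
    tile      -- tile edge length, in heatmap cells
    threshold -- keep a tile if its peak activation is at least this value
    scale     -- pixels per heatmap cell, used to map back to the image
    """
    rows, cols = len(heatmap), len(heatmap[0])
    boxes = []
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            r1, c1 = min(r0 + tile, rows), min(c0 + tile, cols)
            peak = max(max(row[c0:c1]) for row in heatmap[r0:r1])
            if peak >= threshold:
                # Only these regions are cropped and sent to the detector.
                boxes.append((c0 * scale, r0 * scale, c1 * scale, r1 * scale))
    return boxes


heatmap = [
    [0.0, 0.1, 0.0, 0.9],
    [0.0, 0.0, 0.1, 0.2],
    [0.1, 0.0, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.1],
]
print(hot_tiles(heatmap, tile=2, threshold=0.5, scale=100))
# → [(200, 0, 400, 200), (0, 200, 200, 400)]
```

In this toy input two of the four tiles are kept, so the detector runs on half the image; a blind sliding window would process all four.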

🎨 Neural Canvas

Transform any image into a masterpiece — in real-time.

A fast neural style transfer implementation that generates stylized images using a feed-forward CNN trained with perceptual loss. Performs instant stylization in a single forward pass.

Key Features:

🚀 Real-time inference with a custom residual architecture.
🧠 Perceptual content & style loss using a pretrained VGG-16 network.
🔁 Instance Normalization integrated for high-quality, artifact-free outputs.
📦 ONNX export supported, ready for edge deployment.
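
The style term of a perceptual loss is commonly computed from Gram matrices of feature maps. The sketch below shows that computation in plain Python to stay dependency-free; it is an illustration, not the repository's code. In practice the inputs would be VGG-16 activations handled as tensors, and the normalization by channels times spatial size is one common convention assumed here.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of a feature map.

    features -- list of C channels, each a flat list of H*W activations.
    Normalized by C * H * W (an assumed, commonly used convention).
    """
    c, n = len(features), len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / (c * n) for fj in features]
        for fi in features
    ]


def style_loss(generated, style):
    """Squared Frobenius distance between the two Gram matrices."""
    g, s = gram_matrix(generated), gram_matrix(style)
    return sum((x - y) ** 2 for rg, rs in zip(g, s) for x, y in zip(rg, rs))


feats_a = [[1.0, 0.0], [0.0, 1.0]]
feats_b = [[1.0, 1.0], [1.0, 1.0]]
print(style_loss(feats_a, feats_a))  # → 0.0
print(style_loss(feats_a, feats_b))  # → 0.625
```

Because the Gram matrix discards spatial layout, two images with the same texture statistics score a low style loss even when their content differs, which is why a separate content term is needed.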

📦 More Projects

🧭 pygog (Google CLI Agent)
A powerful CLI for Google services (Gmail, Drive, Calendar). Features a built-in natural language AI agent supporting Gemini, DeepSeek, & OpenAI.
Python • Google APIs • LLM Agents

📐 Depth Estimation + Semantic Segmentation
Multi-modal depth completion using RGB + sparse depth + semantic maps. Features a DepthNet-style encoder-decoder trained on NYU Depth v2 with multi-scale supervision.
PyTorch • NYU-Depth-v2 • Encoder-Decoder

🛠️ Stack Matrix

🌐 Open Source Contributions

I actively contribute to the broader developer ecosystem, with recent merged work spanning agent frameworks, AI infrastructure, developer tooling, and performance-focused ML apps:

🤖 SynapseKit/SynapseKit: Shipped 22 merged PRs covering native observability, VoiceAgent audio pipelines, graph-builder tooling, benchmark suites, CronTrigger scheduling, self-healing cost-aware agents, persistent agent memory, multimodal RAG ingestion, and new cloud/data loaders plus local/self-hosted model integrations.
⚡ DhruvGarg111/PixelQueue: Landed 10 merged PRs focused on annotation-platform performance, including Zustand history optimizations, React memoization boundaries, callback stabilization, faster YOLO export formatting, and database/Celery fixes that remove N+1 insert bottlenecks.
🎨 DhruvGarg111/Neural-Style-Transfer: Merged 5 improvements that sharpen both UX and inference efficiency, including clearer style-selection flows, in-place tensor operations, reflect-padding convolution simplifications, and modern Pillow compatibility fixes.
🧭 DhruvGarg111/py-goog-cli: Added safer and more reliable Google Workspace CLI workflows with a Drive query-injection fix, --dry-run support for destructive actions, stronger config/output test coverage, and targeted cleanup.
🔎 lancedb/lancedb: Updated LanceDB’s Python Gemini embedding provider to the newer google-genai SDK, keeping vector-search integrations aligned with Google’s latest API stack.
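
As an illustration of the kind of Drive query-injection issue mentioned for py-goog-cli, the sketch below escapes user input before it is placed inside a Drive search query string. It is not the project's actual fix: the helper names escape_drive_value and name_query are hypothetical, and escaping backslashes before quotes is a defensive assumption.

```python
def escape_drive_value(value: str) -> str:
    """Escape a user-supplied string for use inside a quoted Drive query value.

    Backslashes are escaped first so the escapes added for quotes stay intact.
    """
    return value.replace("\\", "\\\\").replace("'", "\\'")


def name_query(filename: str) -> str:
    """Build a search query matching an exact file name (hypothetical helper)."""
    return f"name = '{escape_drive_value(filename)}' and trashed = false"


print(name_query("Valentine's Day"))
# → name = 'Valentine\'s Day' and trashed = false

# Without escaping, this input would close the quote and add its own clause.
print(name_query("x' or name contains '"))
# → name = 'x\' or name contains \'' and trashed = false
```

The resulting string would be passed as the q parameter of a file-listing request; keeping the escaping in one helper makes it straightforward to unit test.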

📊 Telemetry

🔗 Connect & Explore

Website • Searchlight Live App • Email Me

Built by DhruvGarg111
