DhruvGarg111/README.md

Dhruv Garg

AI / ML Engineer • Computer Vision • Generative AI

Building practical AI systems, one focused iteration at a time.

Roles

Computer Vision • Deep Learning • Backend Systems

[Image: Dhruv Garg tech ecosystem]

Building intelligent systems that see, understand, and create.


🔬 Engineering Profile

I am a Machine Learning Engineer focused on Computer Vision and Agentic AI, with a strong foundation in scalable backend systems. My engineering philosophy revolves around translating complex research papers into optimized, production-ready code.

  • 🎯 Focus: Bypassing computational bottlenecks in high-resolution (4K) object detection using Explainable AI (XAI).
  • 🤖 AI Engineering: Building local LLM agents that seamlessly interact with third-party ecosystems (Google APIs, etc.).
  • ⚙️ Infrastructure: Architecting robust database migrations and building backend profilers.
  • 💡 Goal: I build systems that are not just intelligent, but fast, scalable, and resilient.

🚀 Featured Projects

🌟 Flagship Projects

⚡ PixelQueue

Vision Intelligence Infrastructure: A high-performance, async control panel for human-in-the-loop AI annotation.

A dark-themed control panel built on decoupled ML microservices and task queues, so that slow model inference never blocks the annotation UI.

Key Innovations:

  • 🚀 Asynchronous ML: Non-blocking AI auto-labeling via PyTorch, LayerCAM & YOLO.
  • ⚡ Low-Latency UI: Hardware-accelerated React-Konva staging canvas.
  • 🔄 Decoupled Workers: Horizontal scaling with Celery workers behind a message broker.
  • 🔒 Isolated Workspaces: Workspace isolation enforced through Role-Based Access Control (RBAC).

The Searchlight Protocol

"Finding the needle in the haystack, from 400ft above."

A novel coarse-to-fine computer vision pipeline designed for efficient small object detection in high-resolution (2K/4K) aerial imagery. Tackles the critical trade-off between resolution and latency in drone forensics.

Key Innovations:

  • Uses LayerCAM to identify semantic "hotspots" before processing.
  • Slices and zooms into regions of interest only, skipping more than 80% of the empty background.
  • Outperforms blind sliding-window approaches (SAHI) in both speed and accuracy.
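
The coarse-to-fine idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the heatmap is a small hand-written grid standing in for a downsampled LayerCAM map, and `hotspot_boxes` and the 0.5 threshold are hypothetical.

```python
def hotspot_boxes(heatmap, image_w, image_h, threshold=0.5):
    """Map coarse heatmap cells at or above `threshold` to full-resolution crops.

    heatmap: list of rows with values in [0, 1].
    Returns (boxes, skipped_fraction), boxes as (x0, y0, x1, y1) in pixels.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if heatmap[r][c] >= threshold:
                # Scale the cell's grid coordinates up to image pixels.
                x0 = c * image_w // cols
                y0 = r * image_h // rows
                x1 = (c + 1) * image_w // cols
                y1 = (r + 1) * image_h // rows
                boxes.append((x0, y0, x1, y1))
    skipped = 1 - len(boxes) / (rows * cols)
    return boxes, skipped


# Illustrative values only: two "hot" cells in a 3 x 4 grid over a 4K frame.
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.9, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.7],
]
boxes, skipped = hotspot_boxes(heatmap, image_w=3840, image_h=2160)
```

Only the returned boxes would be passed to the fine-grained detector. In this toy grid that is 2 of 12 cells, so the rest of the frame is never processed at full resolution.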

🎨 Neural Canvas

Transform any image into a masterpiece, in real time.

A fast neural style transfer implementation that generates stylized images using a feed-forward CNN trained with perceptual loss. Performs instant stylization in a single forward pass.

Key Features:

  • 🚀 Real-time inference with a custom residual architecture.
  • 🧠 Perceptual content & style loss using a pretrained VGG-16 network.
  • Instance Normalization integrated for high-quality, artifact-free outputs.
  • 📦 ONNX export supported, ready for edge deployment.
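
Two of the building blocks named above, instance normalization and the Gram matrix used in style loss, can be written out in plain Python to show what they compute. This is a didactic sketch on nested lists, not the repository's PyTorch code. The function names and the choice to normalize the Gram matrix by C * N are assumptions.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Normalize one channel of one image to zero mean and unit variance.

    channel: 2D list (H x W). Unlike batch norm, the statistics come from
    this single feature map only.
    """
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in channel]


def gram_matrix(features):
    """Channel-by-channel correlations of flattened feature maps.

    features: C lists of N activations each. Returns a C x C matrix
    normalized by C * N (an assumed convention).
    """
    c, n = len(features), len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / (c * n) for fj in features]
        for fi in features
    ]
```

A style loss then compares the Gram matrices of the stylized image and the style image at several VGG layers, while the content loss compares the feature maps directly.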

📦 More Projects

🧭 pygog (Google CLI Agent)
A powerful CLI for Google services (Gmail, Drive, Calendar). Features a built-in natural language AI agent supporting Gemini, DeepSeek, & OpenAI.
<Python> <Google APIs> <LLM Agents>

Depth Estimation + Semantic Segmentation
Multi-modal depth completion using RGB + sparse depth + semantic maps. Features a DepthNet-style encoder-decoder trained on NYU Depth v2 with multi-scale supervision.
<PyTorch> <NYU-Depth-v2> <Encoder-Decoder>


🛠️ Stack Matrix

[Image: stack icons for vision, modeling, serving, and interface]

Open Source Contributions

I actively contribute to the broader developer ecosystem, with recent merged work spanning agent frameworks, AI infrastructure, developer tooling, and performance-focused ML apps:

  • 🤖 SynapseKit/SynapseKit: Shipped 22 merged PRs covering native observability, VoiceAgent audio pipelines, graph-builder tooling, benchmark suites, CronTrigger scheduling, self-healing cost-aware agents, persistent agent memory, multimodal RAG ingestion, and new cloud/data loaders plus local/self-hosted model integrations.
  • ⚡ DhruvGarg111/PixelQueue: Landed 10 merged PRs focused on annotation-platform performance, including Zustand history optimizations, React memoization boundaries, callback stabilization, faster YOLO export formatting, and database/Celery fixes that remove N+1 insert bottlenecks.
  • 🎨 DhruvGarg111/Neural-Style-Transfer: Merged 5 improvements that sharpen both UX and inference efficiency, including clearer style-selection flows, in-place tensor operations, reflect-padding convolution simplifications, and modern Pillow compatibility fixes.
  • 🧭 DhruvGarg111/py-goog-cli: Added safer and more reliable Google Workspace CLI workflows with a Drive query-injection fix, --dry-run support for destructive actions, stronger config/output test coverage, and targeted cleanup.
  • 🔎 lancedb/lancedb: Updated LanceDB’s Python Gemini embedding provider to the newer google-genai SDK, keeping vector-search integrations aligned with Google’s latest API stack.
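
As an illustration of what removing an N+1 insert bottleneck usually means, the sketch below writes all rows in one batched statement inside a single transaction instead of issuing one insert and commit per row. It uses the standard-library `sqlite3` module so that it runs anywhere. The `annotations` table, its columns, and `save_annotations` are hypothetical and not taken from PixelQueue.

```python
import sqlite3


def save_annotations(conn, rows):
    """Insert all annotation rows in one batched statement and one transaction."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany(
            "INSERT INTO annotations (image_id, label, x, y, w, h) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE annotations "
    "(image_id TEXT, label TEXT, x REAL, y REAL, w REAL, h REAL)"
)
# Illustrative rows: normalized center-x, center-y, width, height.
rows = [
    ("img-1", "car", 0.5, 0.5, 0.2, 0.1),
    ("img-1", "person", 0.3, 0.6, 0.05, 0.1),
]
save_annotations(conn, rows)
```

With an ORM the equivalent is typically a bulk-insert call. The point is the same: one round trip and one commit per batch, not per row.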

📊 Telemetry

[Images: GitHub stats, streak, most-committed language, activity graph]

🔗 Connect & Explore

Website • Searchlight Live App • Email Me


Built by DhruvGarg111

Pinned

  1. Neural-Style-Transfer (Public, Python)
  2. The-Searchlight-Protocol (Public, Python)