Software Engineer · Data Engineer · Distributed Systems Enthusiast
MS Computer Science @ University of Southern California (GPA: 3.57)
2+ years building scalable pipelines, cloud-native backends, and ML-powered systems
I'm a Computer Science graduate student at USC with a background in building high-throughput data pipelines, distributed backend systems, and AI-integrated applications. I've processed 100M+ records, reduced system latency by 80%+, and shipped containerized microservices in production environments.
I'm passionate about the intersection of systems engineering, data at scale, and applied ML — and I love building things that are both technically rigorous and practically useful.
- 🔬 Currently: Research Assistant in Bioinformatics @ USC — building ETL pipelines for RNA-seq data
- 🏎️ Latest project: F1 Podium Prediction Model with 82.35% true positive rate & 0.98 AUC
- 📖 Coursework: Distributed Systems, High-Performance Computing, Deep Learning, Advanced Computer Vision
- 🌏 Previously: Full-stack & automation engineering @ IMFS, Mumbai
LangChain · RAG · Apache Kafka · WebSockets · Docker · Next.js · Unsloth
A distributed, production-grade AI interview system built on an event-driven architecture. Uses Retrieval-Augmented Generation (RAG) for context-aware responses and Kafka for asynchronous messaging across containerized microservices. Designed for fault tolerance, low latency, and deployment at scale.
Python · R · Apache Spark · Parallel Processing · Bioinformatics Pipeline
End-to-end bioinformatics pipeline processing 100M+ RNA-seq records with parallelization and chunking strategies. Achieved 76% sequence alignment accuracy with cross-tool benchmarking and profiling. Automated data download (ENA), caching layers, and reproducible analysis scripts for clinical interpretation.
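The chunking and parallelization idea can be illustrated with a minimal, standard-library-only sketch. It is not the pipeline's actual code: `chunked`, `gc_count`, and `process_in_parallel` are hypothetical names, and counting G/C bases stands in for whatever per-chunk work the real pipeline performs.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(records: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def gc_count(chunk: List[str]) -> int:
    """Hypothetical per-chunk work: count G and C bases across the reads."""
    return sum(read.count("G") + read.count("C") for read in chunk)


def process_in_parallel(records: Iterable[str], chunk_size: int = 2, workers: int = 4) -> int:
    """Split records into chunks, process the chunks concurrently, and sum the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(gc_count, chunked(records, chunk_size)))


if __name__ == "__main__":
    reads = ["GATTACA", "CCGG", "ATAT"]
    print(process_in_parallel(reads, chunk_size=2))
```

Threads keep the sketch simple; for CPU-bound work in Python, swapping in `ProcessPoolExecutor` (same `map` interface) is the usual choice, at the cost of pickling each chunk.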
Python · SQL · AWS SageMaker · Scikit-learn · Power BI · statsmodels
Pre- and post-qualifying race outcome predictor using Logistic Regression on historical F1 datasets. Achieved 82.35% true positive rate and 0.98 AUC. Built reusable data pipelines, structured data models, and an interactive Power BI dashboard for race analysis.
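For readers unfamiliar with the two reported metrics, here is a minimal sketch of how true positive rate and AUC are defined. It uses toy labels and scores, not the project's data, and the function names are hypothetical; in practice a library such as scikit-learn would normally compute these.

```python
from typing import Sequence


def true_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """TPR (recall) = TP / (TP + FN), with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p != 1)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive examples")
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data for illustration only.
    labels = [1, 1, 1, 1, 0, 0]
    preds = [1, 1, 1, 0, 0, 1]
    scores = [0.9, 0.8, 0.7, 0.4, 0.5, 0.2]
    print(true_positive_rate(labels, preds))
    print(roc_auc(labels, scores))
```

The pairwise AUC above is quadratic in the number of examples, which is fine for illustration but not for large datasets.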
Python · XGBoost · Random Forest · AWS SageMaker · FastAPI · Node.js
Ensemble ML pipeline (Random Forest + XGBoost) for multi-class admissions classification. Achieved ROC AUC of 0.81 (train) / 0.78 (test). Deployed via FastAPI with REST endpoints for real-time inference and cloud-based model serving on AWS SageMaker.
Solidity · C++ · Truffle · Remix · Embedded Systems (Cortex-A53) · Distributed Systems
Peer-reviewed research turned into a working prototype for P2P renewable energy trading using smart contracts and decentralized architectures. Designed real-time IoT sensor integration and developed optimization algorithms for distributed energy system orchestration. Published research: "Bridging Energy Gaps: Blockchain-Enabled P2P Trading for Renewable Energy" (2024).
| Role | Organization | Period | Key Impact |
|---|---|---|---|
| Bioinformatics Research Assistant | USC | Sep 2025 – Mar 2026 | Processed 100M+ RNA-seq records; built scalable ETL pipelines |
| Software Developer (Full-Time) | IMFS, Mumbai | Jul 2024 – May 2025 | Reduced system latency by 83% via Docker/K8s containerization |
| Web Developer Intern | Mabella SkinCare | Jun 2023 – May 2024 | Built CRM handling 1000+ req/hr; developed OCR recommendation feature |
M.S. Computer Science — University of Southern California (Aug 2025 – May 2027)
GPA: 3.57 | Algorithms · HPC · Distributed Systems · Deep Learning · Advanced Computer Vision
B.Tech Information Technology (Blockchain Honors) — University of Mumbai (Aug 2020 – May 2024)
GPA: 4.0 (Magna Cum Laude Equivalent) | DSA · OS · Cloud Computing · IoT · Blockchain
Bridging Energy Gaps: Blockchain-Enabled P2P Trading for Renewable Energy (2024)
Peer-reviewed research on decentralized energy trading with distributed smart contracts and IoT sensor integration. Developed optimization algorithms for real-time energy distribution system architectures.

