I'm a Generative AI Engineer with 2.6+ years building production LLM systems — from continued pre-training on 125 GB corpora to vLLM-based serving and inference benchmarking. Currently a Senior GenAI & NLP Engineer at BharatGen (IIT Bombay × IIM Indore), India's Sovereign AI Initiative, where I lead the training of AyurParam 2, a 17B Mixture-of-Experts model running on a 6-node, 48-GPU NVIDIA A6000 cluster.
I've contributed code to Meta's pytorch/examples (22k ⭐), co-authored 3 arXiv papers evaluating 20+ frontier models across Indic languages, and published honeypotllm — an open-source LLM security SDK — on PyPI.
vivek = {
"role" : "Senior GenAI & NLP Engineer @ BharatGen (IIT Bombay × IIM Indore)",
"research" : ["IndicParam", "ParamBench", "SectEval"], # 3 arXiv papers
"training" : "AyurParam 2 — 17B MoE on 48× NVIDIA A6000",
"open_source": ["honeypotllm (PyPI)", "pytorch/examples (Meta AI)"],
"location" : "Mumbai / Bengaluru / Remote",
"languages" : ["Python", "Hindi", "Gujarati", "English"],
}

Co-authored 3 papers at BharatGen benchmarking frontier models on Indic languages and AI safety.
Fine-tuned LLaMA-3.2-3B on 130k curated Ayurvedic QA pairs using QLoRA + Unsloth. Achieved 41.91% on BhashaBench-Ayur (+1.17 points over the base model). Deployed via vLLM with full TTFT and throughput benchmarking.
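For a sense of what that setup looks like, here is a minimal sketch assuming Unsloth's `FastLanguageModel` API and TRL's `SFTTrainer`; the checkpoint name, dataset path, and hyperparameters are illustrative placeholders, not the actual training config.

```python
# Minimal QLoRA fine-tuning sketch with Unsloth + TRL (TRL <=0.11-style args).
# Checkpoint, dataset path, and hyperparameters are illustrative only.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,  # QLoRA: 4-bit quantized base weights, trainable LoRA adapters
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16, lora_alpha=16, lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("json", data_files="ayurveda_qa.jsonl", split="train")  # hypothetical path

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=1,
        bf16=True,
        output_dir="vaidhllama-qlora",
    ),
)
trainer.train()
```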
🔐 honeypotllm — pip install honeypotllm
Open-source Python SDK that defends LLM APIs against model extraction and data theft by embedding forensic watermarks, so that stolen training data itself becomes verifiable evidence of the attack.
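The SDK's actual API isn't reproduced here; the sketch below only illustrates the underlying idea with hypothetical names: derive a per-caller canary from a secret key, embed it invisibly in API responses, and later test a suspect corpus for it.

```python
# Concept illustration of forensic watermarking, NOT honeypotllm's real API.
# All names (SECRET_KEY, watermark, detect) are hypothetical.
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"  # hypothetical server-side key

# Zero-width characters survive copy-paste but are invisible to readers.
ZW = ["\u200b", "\u200c"]  # encode one bit per character

def canary_bits(api_key: str, n_bits: int = 32) -> str:
    # Deterministic per-caller canary: HMAC of the caller's API key.
    digest = hmac.new(SECRET_KEY, api_key.encode(), hashlib.sha256).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)[:n_bits]
    return "".join(ZW[int(b)] for b in bits)

def watermark(response: str, api_key: str) -> str:
    # Append the invisible per-caller canary to the model response.
    return response + canary_bits(api_key)

def detect(text: str, api_key: str) -> bool:
    # If a caller's canary shows up in scraped training data, that data
    # is verifiable evidence of extraction by that caller.
    return canary_bits(api_key) in text

stolen = watermark("Ashwagandha is an adaptogenic herb...", api_key="attacker-123")
assert detect(stolen, "attacker-123")
```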
🧬 Param2-Clinical-17B-MoE — CPT Training Stack
Custom multi-GPU CPT pipeline for a 17B MoE model trained on a 125 GB Ayurvedic and clinical corpus. Full stack: DeepSpeed ZeRO-2, Flash Attention 2, NUMA-tuned torchrun launch scripts, and a fault-tolerant distributed checkpoint manager.
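A minimal sketch of how ZeRO-2 wires into such a pipeline, using DeepSpeed's dict-config path through `deepspeed.initialize`; the batch sizes, flags, and launch command are placeholders rather than the cluster's actual settings.

```python
# Sketch of the CPT launch path: DeepSpeed ZeRO-2 with bf16.
# Config values are placeholders, not the actual cluster settings.
import deepspeed
import torch

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 16,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,                    # shard optimizer state + gradients
        "overlap_comm": True,          # overlap gradient reduction with backward
        "contiguous_gradients": True,
    },
    "gradient_clipping": 1.0,
}

model = torch.nn.Linear(4096, 4096)    # stand-in for the 17B MoE model

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
# Launched per node via NUMA-pinned torchrun, e.g.:
#   numactl --cpunodebind=0 --membind=0 \
#       torchrun --nnodes=6 --nproc_per_node=8 train.py
```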
Domain-specific BPE tokenizer trained on 125 GB of Sanskrit and clinical text. Achieves a 40%+ fertility (tokens-per-word) improvement over the LLaMA-3 tokenizer on Ayurvedic terminology, reducing inference cost and improving the model's comprehension of classical medical terms.
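A sketch of the train-and-measure loop using the Hugging Face `tokenizers` library; the corpus path, vocabulary size, and sample sentence are assumptions for illustration. Fertility here means subword tokens per whitespace word, so lower is better.

```python
# Sketch: train a domain BPE tokenizer and measure fertility
# (tokens per whitespace word). Paths and vocab size are illustrative.
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.BPE(unk_token="<unk>"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

trainer = trainers.BpeTrainer(
    vocab_size=64_000,
    special_tokens=["<unk>", "<s>", "</s>"],
)
tokenizer.train(files=["ayurvedic_corpus.txt"], trainer=trainer)  # hypothetical corpus
tokenizer.save("ayur_bpe.json")

def fertility(tok: Tokenizer, text: str) -> float:
    # Lower is better: fewer subword tokens per word means cheaper inference.
    return len(tok.encode(text).ids) / max(1, len(text.split()))

sample = "वातपित्तकफदोषाणां सम्यक् परीक्षा कर्तव्या"
print(fertility(tokenizer, sample))
```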
End-to-end VLM pipeline using Qwen-VL to extract structured records (name, religion, address, voter ID) from scanned government PDFs. An LLM post-correction layer fixes OCR errors and normalises names at scale.
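A condensed sketch of a single extraction call, assuming the Qwen2-VL checkpoint on Hugging Face and an illustrative prompt and schema; the real pipeline also adds PDF rasterisation, batching, and the LLM post-correction pass.

```python
# Sketch of one extraction call: prompt Qwen2-VL to emit a JSON record
# for a scanned page. Checkpoint, prompt, and schema are illustrative.
import json
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

page = Image.open("voter_page_001.png")  # hypothetical scanned page
prompt = (
    "Extract every record on this page as JSON with keys "
    "name, religion, address, voter_id. Return only JSON."
)
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": prompt},
]}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[page], return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=512)
raw = processor.batch_decode(out[:, inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True)[0]
records = json.loads(raw)  # downstream LLM pass then fixes OCR slips and name variants
```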
⚡ PyTorch Forward-Forward Algorithm — Merged to pytorch/examples
Implemented Hinton's Forward-Forward Algorithm in Meta AI's official pytorch/examples repository (22k ⭐). Collaborated directly with Soumith Chintala (PyTorch founder) on GPU optimisation and test coverage.
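The merged code lives in pytorch/examples; the sketch below only condenses the core idea: each layer trains locally to push its "goodness" (sum of squared activations) above a threshold for positive data and below it for negative data, with no gradients flowing between layers.

```python
# Condensed sketch of Forward-Forward layer-local training (Hinton, 2022).
# Not the merged pytorch/examples code, just the goodness objective.
import torch
import torch.nn.functional as F

class FFLayer(torch.nn.Module):
    def __init__(self, d_in, d_out, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = torch.nn.Linear(d_in, d_out)
        self.threshold = threshold
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # Normalise input length so a layer can't inherit goodness from below.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return F.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)  # goodness, positive pass
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)  # goodness, negative pass
        # Push positive goodness above the threshold and negative below it.
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        return loss.item()

# Toy driver: train layer by layer, passing detached activations forward.
layers = [FFLayer(784, 500), FFLayer(500, 500)]
x_pos, x_neg = torch.randn(64, 784), torch.randn(64, 784)  # stand-in data
for layer in layers:
    for _ in range(50):
        layer.train_step(x_pos, x_neg)
    with torch.no_grad():
        x_pos, x_neg = layer(x_pos), layer(x_neg)
```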
Training paradigms: CPT · SFT · Instruction Tuning · RLHF · DPO / ORPO · MoE Architecture
Techniques: Speculative Decoding · Quantization (GPTQ / AWQ / BnB) · TTFT & Throughput Benchmarking · RAG · Knowledge Graphs
Multimodal: Vision Language Models · Document Understanding · OCR · Structured Extraction
Infrastructure: DeepSpeed ZeRO-2/3 · torchrun · NUMA Topology Tuning · CI/CD · DVC
- Under the Hood: VaidhLlama Architecture & Training Pipeline — Deep dive into fine-tuning a 3B Ayurvedic LLM
- From a Dumb Student to a PyTorch Contributor — How great teachers changed my trajectory
- Python & Data Science tutorials on dev.to/vivekcodes
2021 ─── B.Tech AI & Data Science, Uka Tarsadia University (First Class with Distinction)
2023 ─── Research Intern @ Goa Institute of Management (Big Data Analytics)
2023 ─── PyTorch Contributor — pytorch/examples merged (Meta AI, 22k ⭐)
2023 ─── Backend Intern @ Axelor, Surat
2024 ─── SDE-I NLP & ML @ NowFloats by Reliance Industries, Hyderabad (130k+ daily users)
2025 ─── Senior GenAI & NLP Engineer @ BharatGen (IIT Bombay × IIM Indore)
↳ AyurParam 2: 17B MoE CPT on 48-GPU A6000 cluster
↳ 3 arXiv papers: IndicParam · ParamBench · SectEval
↳ honeypotllm on PyPI
↳ 8 years on GitHub and still shipping 🚀