
LinkedIn HuggingFace Portfolio Dev.to PyPI Email



🧠 About Me

I'm a Generative AI Engineer with 2.6+ years building production LLM systems — from continued pre-training on 125 GB corpora to vLLM-based serving and inference benchmarking. Currently a Senior GenAI & NLP Engineer at BharatGen (IIT Bombay × IIM Indore), India's Sovereign AI Initiative, where I lead the training of AyurParam 2, a 17B Mixture-of-Experts model, on a 6-node, 48-GPU NVIDIA A6000 cluster.

I've contributed code to Meta's pytorch/examples (22k ⭐), co-authored 3 arXiv papers evaluating 20+ frontier models across Indic languages, and published honeypotllm — an open-source LLM security SDK — on PyPI.

```python
vivek = {
    "role"       : "Senior GenAI & NLP Engineer @ BharatGen (IIT Bombay × IIM Indore)",
    "research"   : ["IndicParam", "ParamBench", "SectEval"],  # 3 arXiv papers
    "training"   : "AyurParam 2 — 17B MoE on 48× NVIDIA A6000",
    "open_source": ["honeypotllm (PyPI)", "pytorch/examples (Meta AI)"],
    "location"   : "Mumbai / Bengaluru / Remote",
    "languages"  : ["Python", "Hindi", "Gujarati", "English"],
}
```

🔬 arXiv Research

Co-authored 3 papers at BharatGen benchmarking frontier models on Indic languages and AI safety.

| Paper | Description | Links |
|-------|-------------|-------|
| IndicParam | LLM benchmark for 11 low-resource Indic languages — 13k+ MCQs, 20 models evaluated (GPT-5, Gemini 2.5, DeepSeek) | arXiv · HF |
| ParamBench | Graduate-level Hindi benchmark — 17k questions across 21 culturally grounded Indian subjects | arXiv · HF |
| SectEval | First study on latent sectarian bias in LLMs — 88 bilingual questions, 15 models including GPT-4o and Claude 3.5 | arXiv |

🚀 Highlight Projects

Fine-tuned LLaMA-3.2-3B on 130k curated Ayurvedic QA pairs using QLoRA + Unsloth. Achieved 41.91% on BhashaBench-Ayur (+1.17% over base). Deployed via vLLM with full TTFT and throughput benchmarking.
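QLoRA's efficiency comes from training only a low-rank delta on top of frozen (quantized) weights. A minimal pure-Python sketch of that low-rank update math — names and shapes here are illustrative, not the actual Unsloth or PEFT API:

```python
# (Q)LoRA leaves the base weight W frozen and learns a rank-r delta:
# W' = W + (alpha / r) * B @ A, where A is (r x cols) and B is (rows x r).
# Only A and B receive gradients, which is why adapter training is cheap.

def lora_update(W, A, B, alpha, r):
    """Apply a rank-r LoRA delta to weight matrix W (lists of lists)."""
    scale = alpha / r
    rows, cols = len(W), len(W[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            delta = sum(B[i][k] * A[k][j] for k in range(r))  # (B @ A)[i][j]
            row.append(W[i][j] + scale * delta)
        out.append(row)
    return out

# rank-1 adapter on a 2x2 identity weight
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]          # r x cols
B = [[0.5], [0.0]]        # rows x r
W2 = lora_update(W, A, B, alpha=2.0, r=1)  # → [[2.0, 1.0], [0.0, 1.0]]
```

In a real run, frameworks like Unsloth fuse this delta into the 4-bit quantized forward pass rather than materialising W'.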

🔐 honeypotllm (`pip install honeypotllm`)

Open-source Python SDK that defends LLM APIs from model extraction and data theft by embedding forensic watermarks — making stolen training data the verifiable evidence of an attack.
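The core idea — per-client canaries that later identify whose served outputs leaked into a stolen model's training data — can be sketched in a few lines. This is a hypothetical illustration of the concept, not the honeypotllm API; all function names below are invented:

```python
# Hypothetical sketch of forensic watermarking for an LLM API honeypot:
# each API key gets a unique, hard-to-guess canary token; if a suspect
# model later reproduces that canary, the leak attributes to that key.
import hashlib

def make_canary(api_key: str, secret: str) -> str:
    """Derive a deterministic per-client canary from a server secret."""
    digest = hashlib.sha256((secret + api_key).encode()).hexdigest()[:12]
    return f"zx-{digest}"

def embed_canary(response: str, canary: str) -> str:
    """Attach the canary so it travels with every served completion."""
    return f"{response} {canary}"

def attribute_leak(suspect_text: str, api_keys, secret: str):
    """Return the API keys whose canaries appear in suspect output."""
    return [k for k in api_keys if make_canary(k, secret) in suspect_text]
```

A real SDK would embed watermarks less conspicuously (e.g. in phrasing or token choices), but the attribution logic is the same: stolen data becomes verifiable evidence.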

🧬 Param2-Clinical-17B-MoE — CPT Training Stack

Custom multi-GPU CPT pipeline for a 17B MoE model trained on a 125 GB Ayurvedic and clinical corpus. Full stack: DeepSpeed ZeRO-2, Flash Attention 2, NUMA-tuned torchrun launch scripts, and a fault-tolerant distributed checkpoint manager.
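For context, a ZeRO-2 setup like this is driven by a DeepSpeed JSON config; the fragment below uses real DeepSpeed config keys, but every value is illustrative rather than taken from the actual training stack:

```json
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 8,
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "reduce_bucket_size": 5e8
  },
  "gradient_clipping": 1.0
}
```

Stage 2 shards optimizer states and gradients across ranks (parameters stay replicated), which is usually the sweet spot when the model fits in memory but optimizer states do not.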

🗣️ Ayurvedic Domain Tokenizer

Domain-specific BPE tokenizer trained on 125 GB of Sanskrit and clinical text. Achieves 40%+ fertility improvement over LLaMA-3 tokenizer on Ayurvedic terminology, reducing inference cost and improving model comprehension of classical medical terms.
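Fertility here means the average number of subword tokens emitted per word — lower fertility on domain text directly cuts inference cost. A minimal sketch of the metric with a pluggable tokenize function (real measurements would plug in the trained BPE and LLaMA-3 tokenizers):

```python
# Tokenizer fertility: tokens produced per whitespace-delimited word,
# averaged over a corpus. A domain tokenizer that keeps Ayurvedic terms
# whole scores lower fertility than a general-purpose one.

def fertility(tokenize, texts):
    """Average tokens per word over a list of texts."""
    total_tokens = sum(len(tokenize(t)) for t in texts)
    total_words = sum(len(t.split()) for t in texts)
    return total_tokens / total_words

# toy stand-ins: word-level vs. character-level segmentation
word_level = str.split
char_level = lambda t: [c for c in t if not c.isspace()]

texts = ["vata pitta kapha"]
# word_level → 1.0 tokens/word; char_level → 14/3 ≈ 4.67 tokens/word
```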

👁️ Multimodal Government Document Extraction

End-to-end VLM pipeline using Qwen-VL to extract structured records (name, religion, address, voter ID) from scanned government PDFs. LLM post-correction layer resolves OCR errors and name normalisation at scale.
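The post-correction layer boils down to validating and normalising each extracted field before accepting it. A hypothetical sketch for the voter-ID field — the 3-letters-plus-7-digits pattern and the OCR confusion table below are assumptions for illustration, not the pipeline's actual rules:

```python
# Post-OCR field correction: strip noise, repair likely OCR confusions
# in the alphabetic prefix, then accept only strings matching the
# assumed voter-ID shape (3 letters + 7 digits).
import re

VOTER_ID = re.compile(r"^[A-Z]{3}[0-9]{7}$")

# digits commonly misread for letters by OCR (assumed mapping)
OCR_FIXES = str.maketrans({"0": "O", "1": "I", "5": "S"})

def normalise_voter_id(raw: str):
    """Return a cleaned voter ID, or None if it cannot be validated."""
    s = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    if len(s) == 10:
        # only the first three characters should be alphabetic
        s = s[:3].translate(OCR_FIXES) + s[3:]
    return s if VOTER_ID.fullmatch(s) else None
```

Fields that fail validation are the ones worth routing to the LLM correction step rather than silently accepting.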

🔥 PyTorch Forward-Forward Algorithm — Merged to pytorch/examples

Implemented Hinton's Forward-Forward Algorithm in Meta AI's official pytorch/examples repository (22k ⭐). Collaborated directly with Soumith Chintala (PyTorch founder) on GPU optimisation and test coverage.
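Forward-Forward replaces backprop with a local objective per layer: a layer's "goodness" (sum of squared activations) is pushed above a threshold for positive data and below it for negative data. A pure-Python sketch of that objective — the merged pytorch/examples version implements the same idea with tensors and per-layer optimizers:

```python
# Forward-Forward in miniature: each layer is trained locally to make
# goodness(h) = sum(h_i^2) high for real inputs and low for negatives.
import math

def goodness(activations):
    """Sum of squared activations for one layer's output."""
    return sum(a * a for a in activations)

def ff_probability(activations, threshold=2.0):
    """Sigmoid of (goodness - threshold): P(sample is positive)."""
    return 1.0 / (1.0 + math.exp(-(goodness(activations) - threshold)))
```

The per-layer loss is then just binary cross-entropy on `ff_probability`, with no gradient flowing between layers.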


🛠️ Tech Stack

LLM Training & Alignment

PyTorch HuggingFace DeepSpeed LoRA Unsloth Axolotl LLaMA-Factory NVIDIA NeMo Flash Attention

Training paradigms: CPT · SFT · Instruction Tuning · RLHF · DPO / ORPO · MoE Architecture

Inference & Serving

vLLM LangChain FastAPI FAISS

Techniques: Speculative Decoding · Quantization (GPTQ / AWQ / BnB) · TTFT & Throughput Benchmarking · RAG · Knowledge Graphs
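TTFT and throughput are measured from a streamed reply: TTFT is the wait before the first token, throughput the decode rate after it. A minimal sketch against any token iterator — the fake stream below is a stand-in, not a vLLM client:

```python
# Benchmark a streaming generation: time-to-first-token (TTFT) and
# steady-state decode throughput (tokens/second after the first token).
import time

def benchmark_stream(token_iter):
    """Return (ttft_seconds, tokens_per_second) for one streamed reply."""
    start = time.perf_counter()
    first = None
    count = 0
    for _ in token_iter:
        now = time.perf_counter()
        if first is None:
            first = now  # first token observed
        count += 1
    end = time.perf_counter()
    ttft = first - start
    decode = end - first
    tps = (count - 1) / decode if count > 1 and decode > 0 else float("inf")
    return ttft, tps

def fake_stream(n=5, delay=0.001):
    """Stand-in generator that emits n tokens with a fixed delay."""
    for i in range(n):
        time.sleep(delay)
        yield f"tok{i}"
```

Excluding the first token from the throughput denominator keeps prefill cost (TTFT) from polluting the decode-rate number.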

Multimodal

Qwen-VL · Vision Language Models · Document Understanding · OCR · Structured Extraction

MLOps & Infrastructure

Slurm NVIDIA Weights & Biases Docker AWS Azure

DeepSpeed ZeRO-2/3 · torchrun · NUMA Topology Tuning · CI/CD · DVC


📊 GitHub Stats


✍️ Technical Writing


🌐 Experience Timeline

2021 ─── B.Tech AI & Data Science, Uka Tarsadia University (First Class with Distinction)
2023 ─── Research Intern @ Goa Institute of Management (Big Data Analytics)
2023 ─── PyTorch Contributor — pytorch/examples merged (Meta AI, 22k ⭐)
2023 ─── Backend Intern @ Axelor, Surat
2024 ─── SDE-I NLP & ML @ NowFloats by Reliance Industries, Hyderabad (130k+ daily users)
2025 ─── Senior GenAI & NLP Engineer @ BharatGen (IIT Bombay × IIM Indore)
         ↳ AyurParam 2: 17B MoE CPT on 48-GPU A6000 cluster
         ↳ 3 arXiv papers: IndicParam · ParamBench · SectEval
         ↳ honeypotllm on PyPI
         ↳ 8 years on GitHub and still shipping 🚀

Building the future of Indic AI — one token at a time.

Open to collaborations on Indic NLP, LLM safety, and domain-specific foundation models.

LinkedIn Email
