"30% theory, 70% practice. Every project here is a concept I actually understand β built from scratch, no shortcuts."
Sanket Talekar β Final Year B.Tech CSE @ D.Y. Patil College of Engineering & Technology, Kolhapur
AI Automation Engineer & Full Stack Developer (AI Focus)
This repo is my structured, self-driven AI Engineering practice ground. After completing the full AI Engineering 2025 book by Akshay Pachaar & Avi Chawla (DailyDoseofDS), I shifted from theory-first to practice-first learning.
Every project here is:
- Standalone: independent codebase, independent domain, no copy-paste between projects
- Progressive: each project introduces new concepts while building on previous ones
- From scratch: no LangChain abstractions until I understand what's being abstracted
- Job-signal focused: built to demonstrate real AI engineering skills to product-first companies
| # | Project | Concepts Covered | Stack | Status |
|---|---|---|---|---|
| 01 | AI Study Buddy | LLM params, generation strategies, prompt engineering, JSON output, CoT, session memory | Node.js | 🟢 Complete |
| 02 | Study Buddy v2: Stateful Tutor | Multi-turn memory, G-eval, LLM-as-judge, self-evaluation loop, component evals | Node.js | 🟢 Complete |
| 03 | News Research Agent | ReAct pattern, tool use, agent loop, agentic memory, context engineering | Node.js | 🟡 In Progress |
| 04 | Research Agent + Knowledge Base | RAG from scratch, chunking strategies, ChromaDB, HyDE, Agentic RAG | Node.js + Python | ⚪ Planned |
| 05 | MCP Server + Observability | MCP architecture, MCP tools/resources, LLM tracing, Opik, multi-turn evals | Python | ⚪ Planned |
| 06 | AI Code Review System | Multi-agent orchestration, A2A protocol, red teaming, FastAPI deployment | Python | ⚪ Planned |
| 07 | LoRA Fine-tuning Experiment | LoRA, IFT dataset generation, SFT, FT vs RAG vs prompting comparison | Python | ⚪ Planned |
Legend: 🟢 Complete · 🟡 In Progress · 🔴 Blocked · ⚪ Planned
ai-engineering-practice/
│
├── README.md                     ← You are here
├── .gitignore                    ← Covers all projects (node_modules, .env, logs)
│
├── 01 Study Buddy/               ← Node.js · LLM fundamentals
│   ├── README.md
│   ├── .env.example
│   ├── package.json
│   ├── index.js
│   └── src/
│       ├── geminiClient.js
│       ├── promptBuilder.js
│       ├── explain.js
│       ├── quiz.js
│       ├── params.js
│       ├── strategies.js
│       ├── memory.js
│       └── logger.js
│
├── 02 Study Buddy v2/            ← Node.js · Evaluation & memory
├── 03 News Research Agent/       ← Node.js · Agents from scratch
├── 04 Agent + Knowledge Base/    ← Node.js + Python · RAG pipeline
├── 05 MCP Server/                ← Python · MCP + Observability
├── 06 Code Review Agents/        ← Python · Multi-agent systems
└── 07 LoRA Finetuning/           ← Python · Fine-tuning experiments
The foundation. Direct LLM interaction: no frameworks, no abstractions.
Domain: Education / CLI tool
Language: Node.js (JavaScript)
Duration: 2 weeks
Status: 🟢 Complete
A command-line tool where you type any technical topic and the system generates an explanation, an interactive quiz, and comparative outputs across different generation strategies. Every API call is written by hand, with no wrapper libraries hiding what's happening.
| Command | What it does |
|---|---|
| `explain <topic>` | Plain-text explanation with LLM |
| `quiz <topic>` | JSON-structured interactive quiz |
| `compare <topic>` | Same prompt, 3 different temperature configs |
| `strategies <topic>` | Greedy vs sampling vs CoT comparison |
| `history` | Show topics asked this session |
| `exit` | Save session log and quit |
- `temperature`, `top_p`, `max_tokens`: what they actually do to output
- Greedy decoding vs nucleus sampling vs beam search
- Zero-shot, few-shot, chain-of-thought prompting
- Forcing JSON output and parsing it reliably
- Verbalized sampling technique
- In-memory session state (no database)
- LLM self-evaluation as a first eval loop
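To make the effect of `temperature` and `top_p` concrete, here is a small toy sketch that applies them to a hand-written list of logits. It is an illustration only, not code from this project: hosted models apply these parameters server-side, and the logit values below are made up.

```javascript
// Toy illustration of temperature and top_p on a hand-written logit list.
// Nothing here calls Gemini; the numbers are hypothetical.

// Softmax with temperature: lower values sharpen the distribution,
// higher values flatten it.
function softmaxWithTemperature(logits, temperature) {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Nucleus (top-p) filtering: keep the smallest set of tokens whose
// cumulative probability reaches topP, then renormalize.
function topPFilter(probs, topP) {
  const ranked = probs
    .map((p, index) => ({ p, index }))
    .sort((a, b) => b.p - a.p);
  const kept = [];
  let cumulative = 0;
  for (const item of ranked) {
    kept.push(item);
    cumulative += item.p;
    if (cumulative >= topP) break;
  }
  const total = kept.reduce((acc, item) => acc + item.p, 0);
  return kept.map((item) => ({ index: item.index, p: item.p / total }));
}

// Greedy decoding is simply argmax over the probabilities.
function greedyPick(probs) {
  return probs.indexOf(Math.max(...probs));
}

const logits = [2.0, 1.0, 0.1]; // hypothetical scores for three tokens
const cold = softmaxWithTemperature(logits, 0.5);
const hot = softmaxWithTemperature(logits, 2.0);
console.log("T=0.5:", cold.map((p) => p.toFixed(3)));
console.log("T=2.0:", hot.map((p) => p.toFixed(3)));
console.log("top_p=0.8:", topPFilter(softmaxWithTemperature(logits, 1.0), 0.8));
```

At the lower temperature the top token takes most of the probability mass, which is why low-temperature output looks close to greedy decoding.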
| File | Responsibility |
|---|---|
| `geminiClient.js` | All Gemini API calls (single source of truth) |
| `promptBuilder.js` | Prompt templates: explain, quiz, CoT |
| `strategies.js` | Generation strategy comparison via `Promise.all()` |
| `quiz.js` | JSON output forcing + safe parsing with retry |
| `memory.js` | In-memory session store (plain JS object) |
| `logger.js` | Writes `session-[timestamp].json` on exit |
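The "safe parsing with retry" idea behind `quiz.js` can be sketched roughly as follows. This is a hedged illustration rather than the actual file: the quiz shape (`questions`, `question`, `options`) and the injected `generate` function are assumptions made for the example.

```javascript
// Pull the first JSON object out of a model reply, tolerating code fences
// and chatter around it. Returns null instead of throwing.
function extractJson(text) {
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(text.slice(start, end + 1));
  } catch {
    return null;
  }
}

// Hypothetical shape check for a quiz payload; the real schema may differ.
function isValidQuiz(quiz) {
  return (
    quiz !== null &&
    Array.isArray(quiz.questions) &&
    quiz.questions.every(
      (q) => typeof q.question === "string" && Array.isArray(q.options)
    )
  );
}

// Retry wrapper. `generate` is any async function (prompt) => string,
// for example a thin wrapper around the Gemini client.
async function generateQuiz(generate, prompt, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const quiz = extractJson(await generate(prompt));
    if (isValidQuiz(quiz)) return quiz;
  }
  throw new Error(`No valid quiz JSON after ${maxAttempts} attempts`);
}
```

Passing `generate` in as a parameter keeps the parsing logic testable without an API key.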
| Book Chapter | Implemented In |
|---|---|
| 7 LLM Generation Parameters | params.js, geminiClient.js |
| 4 LLM Text Generation Strategies | strategies.js |
| What is Prompt Engineering | promptBuilder.js |
| 3 Prompting Techniques for Reasoning | strategies.js (CoT) |
| JSON Prompting for LLMs | quiz.js |
| Verbalized Sampling | Day 8 experiment |
Same project, upgraded. Evaluation and persistent memory layer added.
Domain: Education / CLI tool
Language: Node.js (JavaScript)
Duration: 1 week
Status: 🟢 Complete
- Full multi-turn conversation history passed to every API call
- G-eval style self-scoring: after every explanation, the LLM grades itself on accuracy, clarity, and completeness (1 to 5 each) and returns structured JSON
- Auto-regeneration: if any score is below 3, retry with a chain-of-thought prompt
- Running score average displayed in terminal
- Session data includes scores alongside responses in `session.json`
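The scoring and regeneration rules above can be expressed as a few small functions. This is a sketch under assumptions, not the project's code: it assumes the judge returns a JSON object with integer `accuracy`, `clarity`, and `completeness` fields.

```javascript
const CRITERIA = ["accuracy", "clarity", "completeness"];

// Validate the judge's JSON: every criterion must be an integer from 1 to 5.
function parseScores(raw) {
  const scores = JSON.parse(raw);
  for (const key of CRITERIA) {
    const value = scores[key];
    if (!Number.isInteger(value) || value < 1 || value > 5) {
      throw new Error(`Invalid score for ${key}: ${value}`);
    }
  }
  return scores;
}

// Regenerate when any single criterion falls below the threshold.
function needsRegeneration(scores, threshold = 3) {
  return CRITERIA.some((key) => scores[key] < threshold);
}

// Running average across all explanations scored so far in the session.
function runningAverage(history) {
  if (history.length === 0) return 0;
  const total = history.reduce(
    (sum, s) => sum + CRITERIA.reduce((acc, key) => acc + s[key], 0),
    0
  );
  return total / (history.length * CRITERIA.length);
}
```

Validating the judge's output before trusting it matters here, since a malformed score would otherwise silently skip the regeneration step.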
- Multi-turn conversation management
- G-eval evaluation framework
- LLM-as-judge pattern
- Component-level evaluation
- Iterative self-improvement loop
First real agent. ReAct loop built entirely by hand, no LangChain.
Domain: News research / automated reporting
Language: Node.js (JavaScript)
Duration: 3 weeks
Status: 🟡 In Progress
Takes a research question. Autonomously decides which tools to call. Reasons step-by-step using the ReAct pattern (Thought → Action → Observation → repeat). Produces a structured JSON report with sources.
- `web_search(query)`: searches for current information
- `summarize_text(text)`: condenses long content
- `check_claim(claim)`: verifies a statement against search results
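A hand-written ReAct loop can be sketched in a few dozen lines. The version below is an illustration under assumptions, not this project's implementation: the `Action: tool(input)` and `Final Answer:` text format is one common convention chosen for the example, and `llm` and `tools` are injected so the loop can run against stubs.

```javascript
// Parse one model turn. The "Action: tool(input)" / "Final Answer: ..."
// format is an assumption for this sketch, not a fixed standard.
function parseStep(text) {
  const final = text.match(/Final Answer:\s*([\s\S]+)/);
  if (final) return { type: "final", answer: final[1].trim() };
  const action = text.match(/Action:\s*(\w+)\(([\s\S]*?)\)\s*$/m);
  if (action) {
    return { type: "action", tool: action[1], input: action[2].trim() };
  }
  return { type: "invalid" };
}

// `llm` is any async (transcript) => string.
// `tools` maps tool names to async (input) => string functions.
async function runAgent(question, llm, tools, maxSteps = 5) {
  const scratchpad = [`Question: ${question}`];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await llm(scratchpad.join("\n"));
    scratchpad.push(reply);
    const parsed = parseStep(reply);
    if (parsed.type === "final") {
      return { answer: parsed.answer, scratchpad };
    }
    const known =
      parsed.type === "action" &&
      Object.prototype.hasOwnProperty.call(tools, parsed.tool);
    if (known) {
      const observation = await tools[parsed.tool](parsed.input);
      scratchpad.push(`Observation: ${observation}`);
    } else {
      // Feed the failure back so the model can correct itself.
      scratchpad.push("Observation: no valid action found, try again.");
    }
  }
  return { answer: null, scratchpad }; // step budget exhausted
}
```

The `maxSteps` cap is the simplest guard against an agent that never produces a final answer.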
- ReAct agent loop implemented from scratch
- Tool definition and tool-call parsing
- Agent working memory (scratchpad)
- Context window management (trimming old observations)
- Agentic design patterns
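Trimming old observations is one way to keep the scratchpad inside the context window. The sketch below is a rough illustration: the four-characters-per-token estimate is a heuristic assumption, and a real implementation would count tokens with the model's tokenizer.

```javascript
// Rough token estimate (about 4 characters per token for English text).
// This is a heuristic assumption; a real tokenizer would be more accurate.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keep the first entry (the question) and as many of the most recent
// entries as fit in the budget; older entries in between are dropped.
function trimScratchpad(entries, maxTokens) {
  if (entries.length === 0) return [];
  const [question, ...rest] = entries;
  let used = estimateTokens(question);
  const kept = [];
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i]);
    if (used + cost > maxTokens) break;
    kept.unshift(rest[i]);
    used += cost;
  }
  const dropped = rest.length - kept.length;
  const marker = dropped > 0 ? [`[${dropped} earlier steps omitted]`] : [];
  return [question, ...marker, ...kept];
}
```

Keeping the original question pinned at the top means the agent never loses sight of its goal, even when early observations are dropped.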
RAG pipeline from scratch. The agent gets a long-term memory.
Domain: Personal knowledge base + research
Language: Node.js + Python
Duration: 3 weeks
Status: ⚪ Planned
- 4th tool added: `knowledge_base_search(query)`, which searches a local vector store
- Chunking pipeline: fixed-size, sentence-based, and semantic chunking, compared side by side
- Embeddings via Gemini `embedding-001`
- Local vector store: ChromaDB
- HyDE (Hypothetical Document Embeddings): generate a hypothetical answer, embed it, then retrieve; compared against direct query retrieval
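Two of the chunking strategies named above are simple enough to sketch directly. This is an illustrative example rather than the planned pipeline: sizes are measured in characters for simplicity, and semantic chunking is left out because it needs an embedding model.

```javascript
// Fixed-size chunking with overlap, measured in characters.
function chunkFixed(text, size, overlap = 0) {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

// Sentence-based chunking: split after sentence-ending punctuation, then
// pack whole sentences into chunks of at most maxChars where possible.
// A single sentence longer than maxChars becomes its own oversized chunk.
function chunkBySentence(text, maxChars) {
  const sentences = text.trim().split(/(?<=[.!?])\s+/).filter(Boolean);
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (current && candidate.length > maxChars) {
      chunks.push(current);
      current = sentence;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

console.log(chunkFixed("abcdefghij", 4, 1));
console.log(chunkBySentence("One. Two! Three? Four", 10));
```

Fixed-size chunks are predictable but can cut a sentence in half; sentence-based chunks keep sentences whole at the cost of uneven sizes.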
- RAG architecture from scratch
- 5 chunking strategies and their tradeoffs
- Vector databases (ChromaDB locally)
- HyDE vs standard retrieval
- Agentic RAG vs traditional RAG
- When to use RAG vs prompting vs fine-tuning
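The difference between HyDE and direct retrieval comes down to which text gets embedded. The sketch below shows both paths over an in-memory store; it is an assumption-laden illustration in which `embed` and `generate` are injected stand-ins for the embedding and LLM calls, and the plain array stands in for ChromaDB.

```javascript
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Rank stored chunks ({ text, vector }) against a query vector.
function topK(queryVector, store, k) {
  return store
    .map((item) => ({
      text: item.text,
      score: cosineSimilarity(queryVector, item.vector),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Direct retrieval embeds the question itself.
async function retrieveDirect(question, embed, store, k = 3) {
  return topK(await embed(question), store, k);
}

// HyDE embeds a generated hypothetical answer instead of the question.
async function retrieveHyde(question, generate, embed, store, k = 3) {
  const hypothetical = await generate(
    `Write a short passage answering: ${question}`
  );
  return topK(await embed(hypothetical), store, k);
}
```

Because both functions share `topK`, comparing them on the same questions isolates the effect of the hypothetical answer.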
Expose the agent as a service. Add full tracing so every call is visible.
Domain: Developer tooling
Language: Python
Duration: 2 weeks
Status: ⚪ Planned
Wraps the Research Agent from Project 04 as an MCP (Model Context Protocol) server. Any MCP-compatible client (Claude Desktop, etc.) can call it. Every LLM call, tool call, and retrieval step is traced and logged using Opik.
- `research(question)`: runs the full agent
- `add_to_kb(pdf_path)`: ingests a document into the knowledge base
- `get_report(topic)`: returns a cached report
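MCP messages are JSON-RPC 2.0, so the three tools above can be pictured as entries in a registry served through `tools/list` and `tools/call`. The sketch below only illustrates those message shapes. It is not a working MCP server (no initialization handshake, no transport), the handlers and schemas are placeholders, and it is written in JavaScript for consistency with the other examples even though this project is planned in Python. A real server would normally be built on an official MCP SDK.

```javascript
// Tool registry. The tool names come from the project plan; the schemas
// and handlers are placeholders, not the real implementation.
const stringArg = (name) => ({
  type: "object",
  properties: { [name]: { type: "string" } },
  required: [name],
});

const tools = {
  research: {
    description: "Run the full research agent on a question",
    inputSchema: stringArg("question"),
    handler: async ({ question }) => `report for: ${question}`,
  },
  add_to_kb: {
    description: "Ingest a document into the knowledge base",
    inputSchema: stringArg("pdf_path"),
    handler: async ({ pdf_path }) => `ingested: ${pdf_path}`,
  },
  get_report: {
    description: "Return a cached report for a topic",
    inputSchema: stringArg("topic"),
    handler: async ({ topic }) => `cached report: ${topic}`,
  },
};

// Handle one JSON-RPC request object and return the response object.
async function handleMessage(message) {
  const { id, method, params = {} } = message;
  const ok = (result) => ({ jsonrpc: "2.0", id, result });
  const fail = (code, text) => ({
    jsonrpc: "2.0",
    id,
    error: { code, message: text },
  });

  if (method === "tools/list") {
    const list = Object.entries(tools).map(([name, t]) => ({
      name,
      description: t.description,
      inputSchema: t.inputSchema,
    }));
    return ok({ tools: list });
  }

  if (method === "tools/call") {
    const exists = Object.prototype.hasOwnProperty.call(tools, params.name);
    if (!exists) return fail(-32602, `Unknown tool: ${params.name}`);
    try {
      const text = await tools[params.name].handler(params.arguments || {});
      return ok({ content: [{ type: "text", text }], isError: false });
    } catch (err) {
      // Tool execution failures are reported inside the result.
      return ok({ content: [{ type: "text", text: String(err) }], isError: true });
    }
  }

  return fail(-32601, `Method not found: ${method}`);
}
```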
- MCP architecture (host, client, server)
- MCP primitives: tools, resources, prompts
- LLM observability vs evaluation
- Trace instrumentation with Opik
- Latency, cost, and quality metrics per call
Multi-agent. An orchestrator spawns specialist sub-agents in parallel.
Domain: Developer tooling / code quality
Language: Python
Duration: 4 weeks
Status: ⚪ Planned
GitHub PR diff
↓
Orchestrator Agent
├── Agent 1: Logic reviewer → finds bugs
├── Agent 2: Security reviewer → finds vulnerabilities
└── Agent 3: Style reviewer → checks naming, complexity
↓
Synthesized final review (JSON + Markdown)
↓
FastAPI endpoint
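The fan-out and synthesis steps in the flow above can be sketched with stub reviewers. This is an illustration only: the project is planned in Python, the example uses JavaScript for consistency with the other sketches, and the reviewer rules are trivial string checks standing in for role-specific LLM calls.

```javascript
// Placeholder reviewers: each takes a diff string and resolves to findings.
// Real versions would call an LLM with a role-specific prompt.
const reviewers = {
  logic: async (diff) =>
    /[^=!]==[^=]/.test(diff) ? ["Loose equality; consider ==="] : [],
  security: async (diff) =>
    diff.includes("eval(") ? ["eval() on dynamic input"] : [],
  style: async (diff) =>
    /\bvar\b/.test(diff) ? ["Prefer const/let over var"] : [],
};

// Run all reviewers in parallel, then merge their findings.
async function reviewDiff(diff, agents = reviewers) {
  const names = Object.keys(agents);
  // allSettled so one failing reviewer does not sink the whole review.
  const settled = await Promise.allSettled(
    names.map((name) => agents[name](diff))
  );
  const sections = {};
  const failed = [];
  settled.forEach((result, i) => {
    if (result.status === "fulfilled") sections[names[i]] = result.value;
    else failed.push(names[i]);
  });
  const markdown = Object.entries(sections)
    .map(([name, findings]) => {
      const body = findings.length
        ? findings.map((f) => `- ${f}`).join("\n")
        : "- No findings";
      return `## ${name}\n${body}`;
    })
    .join("\n\n");
  return { sections, failed, markdown };
}
```

Returning both a structured `sections` object and a `markdown` string mirrors the "JSON + Markdown" output described in the flow.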
- Multi-agent orchestration pattern
- Agent-to-Agent (A2A) communication protocol
- 7 patterns in multi-agent systems
- Red teaming LLM apps
- FastAPI deployment of agent systems
- 5 levels of agentic AI systems
Fine-tuning last, not first. Because now I have an evaluator to measure whether it actually helps.
Domain: Code review (same as Project 06, for direct comparison)
Language: Python
Duration: 4 weeks
Status: ⚪ Planned
Fine-tunes Phi-3-mini using LoRA on a dataset of (code, review) pairs. Runs the same evaluation harness from Projects 02 and 06 on both the base model and the fine-tuned model. Documents where fine-tuning wins, where RAG wins, and where prompting wins.
- LoRA (Low-Rank Adaptation), understood from scratch
- IFT (Instruction Fine-Tuning) dataset generation
- SFT vs RFT
- Full fine-tuning vs LoRA vs RAG: a practical comparison
- Hugging Face `transformers` + `peft`
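The core appeal of LoRA is the drop in trainable parameters. For a frozen weight matrix of shape d × k, LoRA trains two small matrices B (d × r) and A (r × k), and the adapted weight is W + (alpha / r) · B · A. The arithmetic can be checked with a short function; the dimensions below are chosen purely for illustration and are not taken from any model config.

```javascript
// Trainable parameters for one d x k weight matrix.
// Full fine-tuning updates all d * k entries.
// LoRA freezes W and trains B (d x r) and A (r x k): r * (d + k) entries.
function loraParamCount(d, k, r) {
  const full = d * k;
  const lora = r * (d + k);
  return { full, lora, ratio: lora / full };
}

// Illustrative dimensions: a 3072 x 3072 projection with rank 8.
const counts = loraParamCount(3072, 3072, 8);
console.log(counts.full, counts.lora, (counts.ratio * 100).toFixed(2) + "%");
```

With these numbers the adapter trains about half a percent of the parameters that full fine-tuning would update for the same matrix.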
| Layer | Projects 01–03 | Projects 04–07 |
|---|---|---|
| Language | Node.js (JS) | Python |
| LLM | Gemini 1.5 Flash (free tier) | Gemini 1.5 Flash (free tier) |
| Vector DB | - | ChromaDB (local) |
| Agent Framework | None (from scratch) | None (from scratch) |
| Observability | - | Opik |
| Serving | - | FastAPI |
| Fine-tuning | - | HuggingFace + PEFT |
| Concept Area | Theory | Practice | Project |
|---|---|---|---|
| LLM fundamentals & generation | ✅ | 🟡 | 01 |
| Prompt engineering | ✅ | 🟡 | 01 |
| LLM evaluation | ✅ | ⚪ | 02 |
| AI Agents (ReAct) | ✅ | ⚪ | 03 |
| Context engineering | ✅ | ⚪ | 03 |
| RAG pipeline | ✅ | ⚪ | 04 |
| MCP protocol | ✅ | ⚪ | 05 |
| LLM observability | ✅ | ⚪ | 05 |
| Multi-agent systems | ✅ | ⚪ | 06 |
| Fine-tuning (LoRA) | ✅ | ⚪ | 07 |
Each project is self-contained. Navigate into any project folder and follow its own README.md.
General pattern for JS projects (01–03):
cd "01 Study Buddy"
npm install
cp .env.example .env
# Add your GEMINI_API_KEY to .env
node index.js

General pattern for Python projects (04–07):
cd "04 Agent + Knowledge Base"
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
# Add your GEMINI_API_KEY to .env
python main.py
⚠️ Never commit your `.env` file. Always use `.env.example` to share required variable names.
Book: AI Engineering 2025 Edition by Akshay Pachaar & Avi Chawla (DailyDoseofDS.com)
Topics covered in the book (all studied before starting this practice): LLMs · Prompt Engineering · Fine-tuning · RAG · Context Engineering · AI Agents · MCP · LLM Optimization · LLM Evaluation · LLM Deployment · LLM Observability
If you're a recruiter, hiring manager, or fellow builder, feel free to reach out.
Sanket Talekar
B.Tech CSE · D.Y. Patil College of Engineering & Technology, Kolhapur
Graduating: June 2026 · CGPA: 8.32
Experience: Junior Software Developer at The Business Legacy, Pune, Maharashtra (Jan 2026 - Present)
Targeting: AI Automation Engineer · Full Stack Developer (AI Focus)
LinkedIn · Email · Portfolio · GitHub