Building a coding agent using LangChain and Google Gemini with a ReACT (Reasoning and Acting) architecture. The project will follow an iterative, minimal implementation approach.
Create a basic ReACT agent that can read and write files using Google Gemini as the LLM.
- Read File Tool: Reads content from a specified file path
  - Input: file path (string)
  - Output: file content (string) with each line prefixed by its line number, or an error message
  - Error handling: file not found, permission errors
- Write File Tool: Writes content to a specified file path
  - Input: file path (string), new content (string), line range in which the new content will be placed
  - Output: success confirmation or error message
  - Error handling: permission errors, directory creation if needed
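The two tools described above can be sketched as plain functions. This is a minimal sketch under assumptions, not the project's implementation: the line range is assumed to be two 1-based integers (`start_line` and `end_line` are hypothetical parameter names), line numbers are rendered as `N: text`, and the LangChain `@tool` wrapper is left out so the logic runs on its own.

```python
from pathlib import Path


def read_file(file_path: str) -> str:
    """Return file content with each line prefixed by its 1-based number."""
    try:
        text = Path(file_path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return f"Error: file not found: {file_path}"
    except PermissionError:
        return f"Error: permission denied: {file_path}"
    return "\n".join(f"{n}: {line}" for n, line in enumerate(text.splitlines(), start=1))


def write_file(file_path: str, content: str, start_line: int, end_line: int) -> str:
    """Replace lines start_line..end_line (1-based, inclusive) with content.

    Assumption: end_line == start_line - 1 means "insert before start_line",
    so the range (1, 0) on a new file simply writes the content.
    """
    path = Path(file_path)
    try:
        path.parent.mkdir(parents=True, exist_ok=True)  # create directories if needed
        lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
        if start_line < 1 or end_line < start_line - 1 or start_line > len(lines) + 1:
            return f"Error: invalid line range {start_line}-{end_line}"
        lines[start_line - 1:end_line] = content.splitlines()
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    except PermissionError:
        return f"Error: permission denied: {file_path}"
    return f"Wrote {file_path} ({len(lines)} lines)"
```

Returning error strings instead of raising keeps failures visible to the LLM as ordinary observations it can react to.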
- Use LangChain's ReACT agent implementation
- Integrate Google Gemini via langchain-google-genai
- Agent should be able to:
  - Reason about what actions to take
  - Use the read/write tools appropriately
  - Provide thought process in responses
- Environment variables for API keys (GOOGLE_API_KEY), use .env for this
- Basic agent configuration (model name, temperature, etc.)
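A possible shape for the configuration step, sketched with the standard library only. Everything except `GOOGLE_API_KEY` is an assumption: `GEMINI_MODEL`, `GEMINI_TEMPERATURE`, the `AgentConfig` class, and the default model name are hypothetical placeholders, and the small `.env` reader stands in for a library such as python-dotenv.

```python
import os
from dataclasses import dataclass
from pathlib import Path


def load_dotenv_minimal(path: str = ".env") -> None:
    """Copy KEY=VALUE lines from a .env file into os.environ (existing vars win)."""
    env_file = Path(path)
    if not env_file.exists():
        return
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip().strip("\"'"))


@dataclass(frozen=True)
class AgentConfig:
    api_key: str
    model_name: str
    temperature: float


def load_config(env_path: str = ".env") -> AgentConfig:
    """Build the agent configuration from the environment and an optional .env file."""
    load_dotenv_minimal(env_path)
    api_key = os.environ.get("GOOGLE_API_KEY")
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is not set (expected in the environment or .env)")
    return AgentConfig(
        api_key=api_key,
        # Hypothetical variable names and placeholder defaults
        model_name=os.environ.get("GEMINI_MODEL", "gemini-2.0-flash"),
        temperature=float(os.environ.get("GEMINI_TEMPERATURE", "0")),
    )
```

Failing fast on a missing key gives a clearer error than letting the first LLM call fail.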
- Setup Environment ✅
  - Initialize Python 3.12 venv with uv
  - Install dependencies (langchain, langchain-google-genai, pylint, astroid)
  - Configure Git repository
- Create Tool Implementations ✅
  - Implement read_file tool as LangChain tool
  - Implement write_file tool as LangChain tool
  - Add basic validation and error handling
  - Add home directory support for sandboxing
- Initialize Gemini LLM ✅
  - Configure Google Gemini model
  - Set up API key management
  - Test basic LLM functionality
- Build ReACT Agent ✅
  - Create ReACT agent with Gemini + tools (use the default from langchain/langgraph)
  - Configure agent parameters
  - Add prompt engineering for coding tasks
- Create Main Interface ✅
  - Simple CLI or Python script to interact with agent
  - Accept user prompts
  - Display agent reasoning and actions
  - Format output for better readability
- Agent can successfully read files from the file system ✅
- Agent can successfully write files to the file system ✅
- Agent demonstrates ReACT pattern (thought → action → observation) ✅
- Agent can complete simple coding tasks (e.g., "read config.py and create a summary in summary.txt") ✅
- Agent supports home directory for sandboxed operations ✅
- Clean formatted output ✅
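The home-directory sandboxing mentioned above could work roughly as follows. This is a sketch under the assumption that every tool resolves its path argument against the home directory before touching the file system; `resolve_in_home` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_in_home(home_directory: str, requested_path: str) -> Path:
    """Resolve requested_path inside home_directory or raise ValueError."""
    home = Path(home_directory).resolve()
    # Joining with an absolute path discards `home`, so absolute paths
    # outside the sandbox are caught by the containment check below.
    candidate = (home / requested_path).resolve()
    if candidate != home and home not in candidate.parents:
        raise ValueError(f"Path escapes home directory: {requested_path}")
    return candidate
```

Resolving before comparing means `..` segments and symlinks are normalized first, so the containment check sees the real target.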
Add short-term memory to the agent so it can remember previous messages in the conversation and maintain context across multiple exchanges.
Currently, if the agent creates a plan and asks for confirmation, when the user replies "yes", the agent has no memory of what it was asking about. We need conversation history.
- Use `ConversationBufferMemory` from LangChain (simplest option)
- Stores all messages in the current session
- No persistence needed (memory clears when program exits)
- Modify `create_coding_agent()` in `react_agent.py`
- Pass memory to the agent via checkpointer
- Use LangGraph's `MemorySaver` for state persistence during session
- Add Memory to Agent ✅
  - Import `MemorySaver` from langgraph.checkpoint.memory
  - Create memory checkpointer
  - Pass checkpointer to `create_react_agent()`
- Update Main Loop ✅
  - Maintain a `thread_id` for the conversation session
  - Pass `thread_id` in config when invoking agent
  - All messages in same session will share memory
- Test Memory ✅
  - Agent asks question → user confirms → agent remembers context
```python
# In react_agent.py
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent


def create_coding_agent(home_directory=None):
    llm = ChatGoogleGenerativeAI(...)
    tools = [read_file_func, write_file_func]
    # Add memory: the checkpointer stores conversation state per thread_id
    memory = MemorySaver()
    agent = create_react_agent(llm, tools, checkpointer=memory)
    return agent


# In main.py
thread_id = "default_session"  # Single session for CLI
result = agent.invoke(
    {"messages": [("user", user_input)]},
    config={"configurable": {"thread_id": thread_id}}
)
```
- Agent remembers previous messages in the conversation ✅
- User can confirm/deny without re-explaining context ✅
- Memory persists throughout CLI session ✅
- Memory clears when program restarts (expected behavior) ✅
- Persistent memory (save to disk/database)
- Summary-based memory (for long conversations)
- Multiple conversation threads
- Memory trimming (keep last N messages)
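Two of the ideas above, multiple conversation threads and trimming to the last N messages, can be illustrated with a toy store. This is not how LangGraph's `MemorySaver` works internally; `SessionMemory` is a hypothetical class that only shows the behavior of history keyed by `thread_id` with a bounded length.

```python
from collections import defaultdict, deque


class SessionMemory:
    """Toy per-thread message store that keeps only the last max_messages."""

    def __init__(self, max_messages: int = 20):
        # Each thread_id gets its own bounded queue; old messages fall off the front.
        self._threads = defaultdict(lambda: deque(maxlen=max_messages))

    def append(self, thread_id: str, role: str, content: str) -> None:
        self._threads[thread_id].append((role, content))

    def history(self, thread_id: str) -> list:
        return list(self._threads[thread_id])


mem = SessionMemory(max_messages=2)
mem.append("default_session", "user", "Refactor config.py")
mem.append("default_session", "assistant", "Here is my plan. Proceed?")
mem.append("default_session", "user", "yes")
print(mem.history("default_session"))
```

With a limit of 2, the original request is dropped while the agent's question and the user's "yes" are kept, which is the trade-off trimming introduces.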
Add pylint validation as a tool so the agent can check if the Python code it generates is properly linted and follows best practices.
The agent can create and modify Python files, but has no way to validate that the code it generates:
- Has no syntax errors
- Follows Python best practices
- Has no linting issues (unused imports, undefined variables, etc.)
Adding pylint as a tool allows the agent to self-validate its code generation.
- New tool: `validate_code` or `lint_file`
- Input: file path to Python file
- Output: Linting results (errors, warnings, info)
- Uses pylint's programmatic API (already installed)
- Run pylint on a specific Python file
- Return structured output: score, errors, warnings
- Format output to be readable for the LLM
- Allow agent to fix issues based on lint results
- Add new tool to `src/tools/` directory
- Register tool with agent in `react_agent.py`
- Agent can use tool as part of its workflow
- Create Lint Tool ✅
  - Create `src/tools/lint_tools.py`
  - Implement `lint_file` tool using pylint API
  - Parse pylint output into structured format
  - Handle errors gracefully
- Integrate with Agent ✅
  - Import lint tool in `react_agent.py`
  - Add to tools list
  - Agent now has access to validation
- Add System Prompt ✅
  - Create comprehensive system prompt
  - Instructs agent to always validate code
  - Defines best practices and workflow
- Test Integration ✅
  - Agent creates Python file
  - Agent runs lint on file
  - Agent sees issues and can fix them
```python
from langchain_core.tools import tool


@tool
def lint_file(file_path: str) -> str:
    """Run pylint on a Python file and return linting results.

    Args:
        file_path: Path to Python file to lint

    Returns:
        Formatted linting results with score, errors, and warnings
    """
    # Run pylint
    # Parse results
    # Return formatted output
```
```
Pylint Score: 8.5/10
Errors:
  Line 5: undefined-variable - Undefined variable 'x'
Warnings:
  Line 10: unused-import - Unused import 'sys'
  Line 15: line-too-long - Line too long (120/100)
Info:
  No style issues found
```
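The "parse results" and "return formatted output" steps could look roughly like this. The sketch assumes the messages arrive as the list of dictionaries produced by pylint's JSON reporter (with `type`, `line`, `symbol`, and `message` keys) and that the score is obtained separately and passed in. `format_lint_results` is a hypothetical helper name. Note that pylint classifies `line-too-long` as a convention message, so this sketch lists it under Info rather than Warnings.

```python
import json


def format_lint_results(messages_json: str, score: float) -> str:
    """Group pylint JSON messages into Errors / Warnings / Info sections."""
    sections = {"Errors": [], "Warnings": [], "Info": []}
    for msg in json.loads(messages_json):
        line = f"Line {msg['line']}: {msg['symbol']} - {msg['message']}"
        if msg["type"] in ("error", "fatal"):
            sections["Errors"].append(line)
        elif msg["type"] == "warning":
            sections["Warnings"].append(line)
        else:  # convention, refactor, info
            sections["Info"].append(line)
    out = [f"Pylint Score: {score:.1f}/10"]
    for title, lines in sections.items():
        out.append(f"{title}:")
        out.extend(lines or ["None"])
    return "\n".join(out)
```

Keeping the report short and sectioned gives the LLM a predictable structure to act on when it decides what to fix.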
- Tool can successfully run pylint on Python files ✅
- Returns clear, actionable feedback ✅
- Agent can use tool to validate its own code ✅
- Agent can iterate: create → lint → fix → lint again ✅
- Works with home directory restrictions ✅
- System prompt guides agent to use linting automatically ✅
You: Create a Python script with a function to calculate factorial
Agent: [Creates file]
Agent: [Uses lint_file tool to validate]
Agent: "I've created factorial.py with a pylint score of 9.5/10"
- Auto-fix mode (pylint itself has no --fix option, so this would rely on a separate fixer such as autopep8 or ruff)
- Configurable pylint rules (.pylintrc)
- Code formatting with black/autopep8
- Type checking with mypy
- Security scanning with bandit
- Memory and conversation history → Phase 2 ✅
- Code validation with pylint → Phase 3 ✅
- Multi-agent system → Phase 4 🔄
- Additional tools (search, execute code, git operations)
- Multi-file operations
- Code analysis tools (using astroid)
- Improved error handling and recovery
- Structured output and logging
hb-coding-agent/
├── .venv/ # Virtual environment
├── .gitignore
├── README.md
├── requirements.txt
├── plan.md # This file
├── .env # API keys (not committed)
├── src/
│ ├── __init__.py
│ ├── tools/
│ │ ├── __init__.py
│ │ ├── file_tools.py # Read/write file tools
│ ├── agent/
│ │ ├── __init__.py
│ │ ├── react_agent.py # ReACT agent implementation
│ ├── config.py # Configuration management
│ └── main.py # Entry point
- Create project structure (src/, tests/ directories) ✅
- Implement file tools (read_file, write_file) ✅
- Set up Gemini LLM configuration ✅
- Build ReACT agent ✅
- Create simple CLI interface ✅
- Add conversation memory (Phase 2) ✅
- Add pylint validation tool (Phase 3) ✅
- Implement multi-agent system (Phase 4) 🔄
Transform the single ReACT agent into a collaborative multi-agent system where specialized agents work together to solve coding tasks. This creates a more robust system with separation of concerns: planning, implementation, and validation.
Instead of one agent doing everything, we'll have three specialized agents that communicate and collaborate:
- Planning Agent (Readonly): Analyzes the problem, explores the codebase, creates execution plan
- Implementation Agent (Read/Write): Executes the plan, modifies files, uses linting
- Validator Agent (Readonly + Git): Validates changes using git diff and linting
User Request
↓
Planning Agent
→ Explores codebase (read files, grep search, bash commands)
→ Analyzes requirements
→ Creates execution plan (structured format)
↓
Implementation Agent
→ Receives execution plan
→ Implements changes (create/edit files)
→ Runs linting on changes
→ Reports implementation status
↓
Validator Agent
→ Reviews changes (git diff)
→ Validates code quality (linting, AST)
→ Provides approval or requests fixes
↓
[If fixes needed: Implementation Agent → Validator Agent loop]
↓
Final Result to User
Role: Understand requirements and create actionable plan
Tools (Read-only):
- `read_file`: Read existing files
- `run_bash_command`: Explore directory structure, search files
- `grep_search`: NEW - Multi-file text search tool
Responsibilities:
- Analyze user request
- Explore codebase to understand structure
- Search for relevant files and code patterns
- Create detailed execution plan with:
  - Files to create/modify
  - Changes to make
  - Order of operations
  - Dependencies and considerations
Output Format: Structured plan (JSON or Markdown)
```json
{
  "analysis": "Summary of the problem",
  "files_to_modify": ["file1.py", "file2.py"],
  "files_to_create": ["new_file.py"],
  "steps": [
    {"action": "create", "file": "new_file.py", "description": "..."},
    {"action": "modify", "file": "file1.py", "description": "..."}
  ],
  "considerations": ["edge case 1", "dependency on X"]
}
```
Role: Execute the plan and make code changes
Tools (Read/Write):
- `read_file`: Read files to understand context
- `write_file`: Create and modify files
- `lint_file`: Validate Python code quality
- `run_bash_command`: Limited bash operations
Responsibilities:
- Parse execution plan from Planning Agent
- Implement each step in order
- Run linting after each Python file change
- Fix linting issues automatically
- Report implementation progress
Input: Execution plan from Planning Agent
Output: Implementation report with:
- Files created/modified
- Linting scores
- Any issues encountered
- Status (success/partial/failed)
Role: Validate implementation quality and correctness
Tools (Read-only + Git):
- `git_diff`: NEW - Show changes made (git diff)
- `git_status`: NEW - Show file change status
- `lint_file`: Validate code quality
- `read_file`: Review final state of files
Responsibilities:
- Review git diff to see all changes
- Validate code quality (linting, AST parsing)
- Check that implementation matches plan
- Provide approval or request specific fixes
- Generate validation report
Output: Validation report with:
- Changes summary (from git diff)
- Code quality assessment
- Issues found (if any)
- Approval status (approved/needs-fixes)
- Specific fixes required
Purpose: Multi-file text search across codebase
```python
from langchain_core.tools import tool


@tool
def grep_search(pattern: str, file_pattern: str = "*.py", case_sensitive: bool = False) -> str:
    """Search for text pattern across multiple files.

    Args:
        pattern: Text or regex pattern to search for
        file_pattern: File glob pattern (e.g., "*.py", "src/**/*.py")
        case_sensitive: Whether search is case-sensitive

    Returns:
        Formatted search results with file paths and line numbers
    """
```
Use Cases:
- Find all usages of a function
- Search for TODO comments
- Find import statements
- Locate class definitions
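One way the search body could be written in native Python rather than by shelling out to grep. This is a sketch under assumptions: the extra `root` parameter is hypothetical (the real tool would presumably search the agent's home directory), and results are formatted as `path:line: text`.

```python
import re
from pathlib import Path


def grep_search(pattern: str, file_pattern: str = "*.py",
                case_sensitive: bool = False, root: str = ".") -> str:
    """Search files under root that match file_pattern for a regex pattern."""
    try:
        regex = re.compile(pattern, 0 if case_sensitive else re.IGNORECASE)
    except re.error as exc:
        return f"Error: invalid pattern: {exc}"
    hits = []
    for path in sorted(Path(root).rglob(file_pattern)):
        if not path.is_file():
            continue
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(lines, start=1):
            if regex.search(line):
                hits.append(f"{path.relative_to(root)}:{number}: {line.strip()}")
    return "\n".join(hits) if hits else f"No matches for '{pattern}'"
```

For example, `grep_search("TODO", root="src")` would cover the "search for TODO comments" use case, and `grep_search(r"^class \w+", case_sensitive=True)` the class-definition one.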
Purpose: Show changes made to files
```python
from langchain_core.tools import tool


@tool
def git_diff(file_path: str = "") -> str:
    """Show git diff of changes.

    Args:
        file_path: Optional specific file to diff (empty = all changes)

    Returns:
        Git diff output showing changes
    """
```
Purpose: Show status of working directory
```python
@tool
def git_status() -> str:
    """Show git status of working directory.

    Returns:
        List of modified, created, deleted files
    """
```
A simple orchestrator function that:
- Takes user request
- Invokes Planning Agent → gets plan
- Invokes Implementation Agent with plan → gets implementation report
- Invokes Validator Agent → gets validation report
- If validation fails: loop (Implementation Agent → Validator Agent)
- Returns final result to user
Implementation Approach: Simple sequential function (not another agent)
```python
from typing import Optional


def final_text(result: dict) -> str:
    """Return the text of the last message in an agent's result state."""
    return str(result["messages"][-1].content)


def orchestrate_multi_agent(user_request: str, home_directory: Optional[str] = None):
    """Orchestrate multiple agents to handle a coding request.

    is_approved() and extract_fixes() are helpers, still to be written,
    that interpret the validator's report.
    """
    # Phase 1: Planning (pass on only the agent's final answer, not the whole state)
    planning_agent = create_planning_agent(home_directory)
    plan = final_text(planning_agent.invoke({"messages": [("user", user_request)]}))

    # Phase 2: Implementation
    implementation_agent = create_implementation_agent(home_directory)
    impl_report = final_text(implementation_agent.invoke({
        "messages": [("user", f"Execute this plan:\n{plan}")]
    }))

    # Phase 3: Validation
    validator_agent = create_validator_agent(home_directory)
    validation = final_text(validator_agent.invoke({
        "messages": [("user", f"Validate this implementation:\n{impl_report}")]
    }))

    # Phase 4: Fix loop (if needed)
    max_iterations = 3
    iteration = 0
    while not is_approved(validation) and iteration < max_iterations:
        fix_request = extract_fixes(validation)
        impl_report = final_text(implementation_agent.invoke({
            "messages": [("user", f"Fix these issues:\n{fix_request}")]
        }))
        validation = final_text(validator_agent.invoke({
            "messages": [("user", f"Re-validate:\n{impl_report}")]
        }))
        iteration += 1

    return {
        "plan": plan,
        "implementation": impl_report,
        "validation": validation,
        "status": "approved" if is_approved(validation) else "needs_review"
    }
```
src/
├── tools/
│ ├── file_tools.py # Existing
│ ├── bash_tools.py # Existing
│ ├── lint_tools.py # Existing
│ ├── git_tools.py # NEW - git diff, git status
│ └── search_tools.py # NEW - grep_search
├── agent/
│ ├── react_agent.py # Existing (might refactor)
│ ├── planning_agent.py # NEW
│ ├── implementation_agent.py # NEW
│ ├── validator_agent.py # NEW
│ └── orchestrator.py # NEW - coordinates agents
├── main.py # Updated to use orchestrator
└── config.py # Existing
- git_tools.py: Implement git_diff and git_status
  - Use subprocess to run git commands
  - Parse and format output
  - Handle errors (no git repo, no changes, etc.)
- search_tools.py: Implement grep_search
  - Use subprocess to run grep or native Python search
  - Format results with file paths and line numbers
  - Support glob patterns for file filtering
- planning_agent.py:
  - Create agent with readonly tools only
  - System prompt: "You are a planning agent..."
  - Tools: read_file, run_bash_command, grep_search
- implementation_agent.py:
  - Create agent with read/write tools
  - System prompt: "You are an implementation agent..."
  - Tools: read_file, write_file, lint_file, run_bash_command
- validator_agent.py:
  - Create agent with validation tools
  - System prompt: "You are a validator agent..."
  - Tools: git_diff, git_status, lint_file, read_file
- orchestrator.py:
  - Simple function-based orchestration (not agent-based)
  - Sequential execution: Planning → Implementation → Validation
  - Simple fix loop with max iterations
  - Structured output format
- main.py:
  - Add flag for multi-agent mode: `--multi-agent`
  - When enabled, use orchestrator instead of single agent
  - Display progress for each phase
  - Show intermediate results (plan, implementation, validation)
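The git_tools.py item in the list above (subprocess plus error handling) might be implemented along these lines. This is a sketch: `run_git` and the `cwd` parameter are hypothetical additions, and the `@tool` decorators are omitted so the functions can be run directly.

```python
import subprocess


def run_git(args: list, cwd: str = ".") -> str:
    """Run a git command and return its output or a readable error message."""
    try:
        proc = subprocess.run(["git", *args], cwd=cwd, capture_output=True,
                              text=True, timeout=30, check=False)
    except FileNotFoundError:
        return "Error: git executable (or working directory) not found"
    except subprocess.TimeoutExpired:
        return "Error: git command timed out"
    if proc.returncode != 0:
        # Covers the "not a git repository" case, among others
        return f"Error: {proc.stderr.strip() or 'git command failed'}"
    return proc.stdout.strip()


def git_diff(file_path: str = "", cwd: str = ".") -> str:
    """Show unstaged changes, optionally limited to one file."""
    args = ["diff", "--", file_path] if file_path else ["diff"]
    return run_git(args, cwd) or "No changes"


def git_status(cwd: str = ".") -> str:
    """Show the short status listing of the working tree."""
    return run_git(["status", "--short"], cwd) or "Working tree clean"
```

One caveat for the validator: plain `git diff` does not show untracked files, so newly created files appear in `git_status` but not in `git_diff` until they are added.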
You are an expert planning agent. Your role is to analyze coding requests and create detailed execution plans.
You have READ-ONLY access to the codebase. Use your tools to:
- Explore directory structure
- Read existing files
- Search for patterns across files
- Understand current code organization
Your output should be a structured plan with:
1. Analysis: Summary of the problem
2. Files to create: List with descriptions
3. Files to modify: List with descriptions
4. Step-by-step actions: Ordered list of what to do
5. Considerations: Edge cases, dependencies, risks
Be thorough and specific. The implementation agent will follow your plan exactly.
You are an expert implementation agent. Your role is to execute coding plans.
You will receive a plan from the planning agent. Follow it step by step:
1. Create or modify files as specified
2. After each Python file change, run lint_file
3. Fix any linting issues immediately
4. Report progress and results
Be precise and careful. Aim for clean, well-tested code with pylint scores of 8.0+.
You are an expert validation agent. Your role is to review and validate code changes.
Use your tools to:
1. Check git diff to see all changes
2. Validate code quality with linting
3. Verify implementation matches the plan
4. Check for edge cases and issues
Provide a validation report with:
- Summary of changes
- Code quality assessment
- Issues found (be specific)
- Approval status (approved/needs-fixes)
- If fixes needed: specific instructions
Be thorough but fair. Focus on correctness and quality.
Agents communicate via structured messages that the orchestrator passes from one agent to the next (there is no shared memory):
```python
# Planning Agent Output
{
    "type": "plan",
    "analysis": "User wants to create a REST API endpoint...",
    "files_to_create": [
        {"path": "api/users.py", "purpose": "User endpoint handlers"}
    ],
    "files_to_modify": [
        {"path": "api/__init__.py", "purpose": "Register new routes"}
    ],
    "steps": [
        {"seq": 1, "action": "create", "file": "api/users.py", "details": "..."},
        {"seq": 2, "action": "modify", "file": "api/__init__.py", "details": "..."}
    ]
}

# Implementation Agent Output
{
    "type": "implementation_report",
    "status": "success",
    "files_created": ["api/users.py"],
    "files_modified": ["api/__init__.py"],
    "linting_results": {
        "api/users.py": {"score": 9.5, "issues": []},
        "api/__init__.py": {"score": 10.0, "issues": []}
    }
}

# Validator Agent Output
{
    "type": "validation_report",
    "status": "approved",  # or "needs_fixes"
    "changes_summary": "Created users.py with GET/POST handlers...",
    "quality_score": 9.5,
    "issues": [],
    "approval": True,
    "fix_instructions": []  # populated if approval is False
}
```
- Create `src/tools/git_tools.py` with git_diff and git_status
- Create `src/tools/search_tools.py` with grep_search
- Create `src/agent/planning_agent.py` with readonly tools
- Create `src/agent/implementation_agent.py` with read/write tools
- Create `src/agent/validator_agent.py` with validation tools
- Create `src/agent/orchestrator.py` with sequential coordination
- Update `src/main.py` to support `--multi-agent` flag
- Test with simple task: "Create a hello world function"
- Test with complex task: "Add REST endpoint for users"
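The `--multi-agent` flag from the checklist above could be wired into main.py with `argparse`. This is only a sketch: `parse_args` and `pick_mode` are hypothetical names, and the actual dispatch to the agent or orchestrator is reduced to returning a label.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the CLI flags used by main.py."""
    parser = argparse.ArgumentParser(description="Coding agent CLI")
    parser.add_argument("--home", default=None,
                        help="home directory the agent is restricted to")
    parser.add_argument("--multi-agent", action="store_true",
                        help="use the orchestrator instead of the single agent")
    return parser.parse_args(argv)


def pick_mode(args: argparse.Namespace) -> str:
    """Return which code path main.py would take."""
    return "orchestrator" if args.multi_agent else "single_agent"


print(pick_mode(parse_args(["--home", "./test-project", "--multi-agent"])))
```

Because the flag defaults to off, the existing single-agent behavior stays unchanged unless the user opts in.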
✅ Planning Agent:
- Can explore codebase with readonly tools
- Produces structured, actionable plans
- Uses grep_search to find relevant code
✅ Implementation Agent:
- Follows plans from planning agent
- Creates/modifies files correctly
- Runs linting automatically
- Fixes issues iteratively
✅ Validator Agent:
- Reviews changes with git diff
- Validates code quality
- Provides clear feedback
- Approves or requests specific fixes
✅ Orchestration:
- Sequential flow works: Planning → Implementation → Validation
- Fix loop works: Implementation ↔ Validator (max 3 iterations)
- User sees progress for each phase
- Final result includes all phase outputs
✅ End-to-End:
- User gives request → system produces validated implementation
- Changes are committed to git (or ready to commit)
- All Python files pass linting
- System provides comprehensive report
```
# Single agent mode (existing)
python src/main.py --home ./test-project
You: Create a factorial function

# Multi-agent mode (new)
python src/main.py --home ./test-project --multi-agent
You: Create a factorial function

# Output:
# [PLANNING] Analyzing request...
# [PLANNING] Creating execution plan...
# Plan: Create factorial.py with factorial function and tests
#
# [IMPLEMENTATION] Executing plan...
# [IMPLEMENTATION] Created factorial.py
# [IMPLEMENTATION] Linting: 9.5/10
#
# [VALIDATION] Reviewing changes...
# [VALIDATION] Git diff: +25 lines in factorial.py
# [VALIDATION] Code quality: Excellent
# [VALIDATION] ✅ APPROVED
#
# Final Result: Successfully created factorial function with validation
```
- Agent Independence: Each agent is an independent ReACT agent with its own tools and prompt
- No Shared State: Agents communicate via messages/outputs, not shared memory
- Sequential Execution: Simple coordinator function, not parallel (easier to implement)
- Error Handling: Each agent can fail independently; orchestrator handles failures
- Iteration Limit: Max 3 fix iterations to prevent infinite loops
- Git Requirements: Assumes working directory is a git repo (or initialize one)
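The fix loop and its iteration limit can be exercised without any LLM by treating each agent as a plain callable from prompt text to report text. The sketch below assumes the validator replies with the JSON validation report described earlier; `is_approved`, `extract_fixes`, and `run_fix_loop` are hypothetical implementations, not the project's code.

```python
import json
from typing import Callable


def is_approved(validation_report: str) -> bool:
    """True if the validator's JSON report has status 'approved'."""
    try:
        return json.loads(validation_report).get("status") == "approved"
    except (json.JSONDecodeError, AttributeError):
        return False  # unparseable reports are treated as not approved


def extract_fixes(validation_report: str) -> list:
    """Return the report's fix instructions as a list of strings."""
    try:
        fixes = json.loads(validation_report).get("fix_instructions", [])
    except (json.JSONDecodeError, AttributeError):
        return []
    return [str(fix) for fix in fixes]


def run_fix_loop(implement: Callable[[str], str], validate: Callable[[str], str],
                 plan: str, max_iterations: int = 3) -> dict:
    """Implement once, then alternate fix/validate until approved or the limit is hit."""
    report = implement(f"Execute this plan:\n{plan}")
    validation = validate(report)
    iterations = 0
    while not is_approved(validation) and iterations < max_iterations:
        fixes = "\n".join(extract_fixes(validation))
        report = implement(f"Fix these issues:\n{fixes}")
        validation = validate(report)
        iterations += 1
    status = "approved" if is_approved(validation) else "needs_review"
    return {"status": status, "fix_iterations": iterations, "validation": validation}
```

Treating an unparseable report as "not approved" means a malformed validator reply consumes fix iterations instead of silently passing, and the limit still guarantees termination.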
- Parallel agent execution where possible
- Human-in-the-loop approval points
- More sophisticated agent communication (message bus)
- Agent specialization (testing agent, documentation agent)
- Learning from validation feedback
- Cost tracking per agent
- Agent performance metrics