Skip to content

Latest commit

 

History

History
250 lines (192 loc) · 8.11 KB

File metadata and controls

250 lines (192 loc) · 8.11 KB

Claude Code Autonomous Demo

🤖 Autonomous Refactoring with Full Feedback Loop

This demo showcases a complete autonomous code refactoring workflow using Claude Code with comprehensive validation and iterative improvement.

🔄 Workflow Pipeline

code2llm → Claude Code → vallm → Runtime Tests → code2llm (loop)
    ↓           ↓           ↓            ↓           ↓
 Analysis   Refactoring  Validation   Functionality  Re-analysis

📋 Complete Process

  1. code2llm Analysis - Deep structural analysis and code smell detection
  2. Claude Code - Advanced refactoring with architectural improvements
  3. vallm Validation - Multi-criteria quality assessment (syntax, security, performance, semantics)
  4. Runtime Tests - Automated functionality validation with pytest
  5. code2llm Re-analysis - Verify improvements and detect remaining issues
  6. Iterative Loop - Continue until perfect code achieved (max 5 iterations)

🎯 Key Features

🔍 Comprehensive Analysis

  • Code Structure: Functions, classes, modules, entry points
  • Code Smells: Complexity, duplication, dead code, coupling issues
  • Security: Vulnerabilities, hardcoded credentials, injection risks
  • Performance: Inefficient algorithms, memory usage patterns
  • Architecture: SOLID violations, tight coupling, separation of concerns

🧠 Claude Code Integration

  • Advanced Reasoning: Superior code understanding and refactoring
  • Architectural Patterns: Design patterns, clean architecture principles
  • Security Focus: Proactive vulnerability detection and fixing
  • Performance Optimization: Algorithmic improvements and best practices

✅ Multi-Layer Validation

  • vallm: Syntax, imports, complexity, security, semantics, performance
  • Runtime Tests: Functional validation with auto-generated test suites
  • Quality Metrics: Code quality scoring and improvement tracking

🔄 Autonomous Loop

  • Feedback Integration: Validation failures guide next iteration
  • Progressive Improvement: Each iteration builds on previous results
  • Best Version Tracking: Automatically saves the best refactored version
  • Termination Criteria: Stops when perfect code achieved or max iterations reached

🚀 Quick Start

Prerequisites

  • Docker and Docker Compose
  • ANTHROPIC_API_KEY environment variable
  • Local Ollama (optional, for comparison)

Setup

# Set your Anthropic API key
export ANTHROPIC_API_KEY='your-anthropic-api-key'

# Run the demo
./run.sh

📁 Project Structure

11_claude_code_autonomous/
├── claude_autonomous_demo.py    # Main autonomous workflow
├── legacy_code/
│   └── data_processor.py        # Legacy code with multiple issues
├── requirements.txt              # Python dependencies
├── Dockerfile                    # Container configuration
├── docker-entrypoint.sh          # Container startup script
├── run.sh                        # Demo runner script
└── README.md                     # This file

🔧 Legacy Code Issues

The demo refactors a complex data processing system with:

🚨 Security Issues

  • SQL Injection: Direct string concatenation in queries
  • Command Injection: Unsafe subprocess calls
  • Pickle Deserialization: Unsafe object serialization
  • Hardcoded Credentials: API keys in source code
  • eval() Usage: Dangerous code execution

🏗️ Architecture Problems

  • SRP Violations: DataProcessor has too many responsibilities
  • Tight Coupling: Direct access to internal state
  • Global Variables: Shared state management issues
  • No Error Handling: Missing exception management

🐉 Code Smells

  • Deep Nesting: Complex conditional structures
  • Duplicate Code: Repeated validation functions
  • Dead Code: Unused functions and variables
  • Long Parameter Lists: Maintainability issues
  • Magic Numbers: Hardcoded values

⚡ Performance Issues

  • Inefficient Algorithms: O(n²) nested loops
  • Memory Leaks: Unbounded cache growth
  • Poor Data Structures: Suboptimal choices

📊 Validation Criteria

vallm Validation

  • Syntax: Python syntax correctness
  • Imports: Module import validation
  • Complexity: Cyclomatic complexity analysis
  • Security: Vulnerability detection
  • Semantics: Code meaning and intent
  • Performance: Efficiency analysis

Runtime Tests

  • Functionality: Behavioral correctness
  • Edge Cases: Boundary condition testing
  • Error Handling: Exception management
  • Integration: Component interaction

Quality Metrics

  • Code Smells: Reduction in detected issues
  • Coupling: Improved module independence
  • Cohesion: Better relatedness of code elements
  • Maintainability: Enhanced readability and structure

🎯 Expected Improvements

After autonomous refactoring, the code should have:

✅ Security Fixes

  • Parameterized queries to prevent SQL injection
  • Safe subprocess usage with proper validation
  • Secure serialization methods
  • Environment-based configuration
  • Input sanitization and validation

🏗️ Architecture Improvements

  • Single Responsibility Principle compliance
  • Dependency injection implementation
  • Proper abstraction layers
  • Interface segregation
  • Loose coupling design

🧹 Code Quality

  • Eliminated code smells
  • Reduced complexity
  • Improved naming conventions
  • Better error handling
  • Comprehensive documentation

⚡ Performance Gains

  • Optimized algorithms
  • Efficient data structures
  • Proper memory management
  • Reduced computational complexity

📈 Output Files

The demo generates several output files:

  • best_refactored.py - Final optimized version
  • refactored_v{1-5}.py - Iteration snapshots
  • claude_autonomous.log - Detailed workflow logs
  • claude-autonomous-output.log - Console output
  • test_*.py - Auto-generated test files

🔄 Iteration Process

Each iteration includes:

  1. Analysis: code2llm examines current state
  2. Refactoring: Claude Code generates improvements
  3. Validation: vallm assesses code quality
  4. Testing: Runtime validation of functionality
  5. Scoring: Overall quality assessment
  6. Decision: Continue if not perfect

The loop continues until:

  • All validation criteria pass ✅
  • All runtime tests pass ✅
  • No code smells detected ✅
  • Maximum iterations reached (5) ⚠️

🎮 Advanced Features

🧠 Intelligent Feedback

  • Validation errors guide specific fixes
  • Test failures inform corrections
  • Code analysis targets remaining issues
  • Progressive improvement strategy

📊 Quality Tracking

  • Iteration-by-iteration scoring
  • Best version automatic selection
  • Improvement metrics visualization
  • Comparative analysis

🔧 Customizable Parameters

  • Maximum iterations limit
  • Validation weight adjustments
  • Test coverage requirements
  • Quality thresholds

🚀 Use Cases

This autonomous workflow is ideal for:

  • Legacy Modernization: Updating old codebases
  • Security Hardening: Vulnerability remediation
  • Performance Optimization: Algorithmic improvements
  • Code Quality: Maintaining high standards
  • Technical Debt: Systematic reduction

🎓 Learning Outcomes

Running this demo demonstrates:

  • Autonomous Refactoring: AI-driven code improvement
  • Multi-Tool Integration: Combining specialized tools
  • Quality Assurance: Comprehensive validation strategies
  • Iterative Development: Progressive enhancement approach
  • Best Practices: Modern software engineering principles

🔮 Future Enhancements

Potential improvements to the workflow:

  • Multi-Model Support: Integration with other LLMs
  • Custom Validation: Domain-specific rules
  • Performance Benchmarks: Quantitative measurements
  • Code Review Integration: Team collaboration features
  • CI/CD Pipeline: Automated deployment integration

📞 Support

For questions or issues with this demo:

  1. Check the generated log files
  2. Verify ANTHROPIC_API_KEY is set correctly
  3. Ensure Docker is running properly
  4. Review the troubleshooting section in logs

Enjoy the autonomous refactoring experience! 🚀