@lambertmt

Summary

This PR adds an autonomous agent capability that executes tools internally, keeping raw data away from Claude's context window. Real-world testing shows 40-95% Claude token savings on analysis tasks.

Key Features

  • Autonomous agent loop - the local LLM decides which tools to call, executes them, and analyzes the results
  • Built-in SSH execution - the agent can run commands on configured hosts
  • GPG-encrypted credentials - secure storage for SSH passwords
  • Strict output formatting - clean JSON tool calls, plain-text answers

Real-World Results

| Task | Claude Direct (tokens) | Claude w/ Agent (tokens) | Local LLM (free) | Savings |
|---|---|---|---|---|
| Debugging workflow (7 calls) | ~56,000 | ~4,100 | ~35,000 | 93% |
| Security audit | ~11,800 | ~800 | ~11,000 | 93% |
| Docker logs analysis | ~10,500 | ~500 | ~10,000 | 95% |
| System health check | ~5,500 | ~1,500 | ~4,000 | 73% |
| Log analysis (journalctl) | ~4,000 | ~800 | ~3,200 | 80% |
| Code gen (w/ exploration) | ~2,700 | ~1,700 | ~1,000 | 37% |
| Disk analysis | ~1,500 | ~500 | ~1,000 | 65% |
| Code gen (small input) | ~1,550 | ~1,600 | ~1,500 | 0% |
| Simple query | ~500 | ~300 | ~200 | 40% |

Note: The 0% case matters - when raw data is small, there's no benefit. This shines on data-heavy tasks.
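
The Savings column is consistent with counting Claude-side tokens only, i.e. roughly 1 - (Claude w/ Agent / Claude Direct); for example, Docker logs analysis: 1 - 500 / 10,500 ≈ 95%. Local LLM tokens are not counted against savings because they are free.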

How It Works

Claude sends task → Agent (local LLM) executes SSH internally → 
Agent analyzes 40K chars locally → Returns 800-token summary to Claude

Claude never sees the raw output. Tokens shift from paid (Claude) to free (local LLM).
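
A minimal sketch of what such a loop can look like (TypeScript; the helper names `callLocalLlm`, `runTool`, and `tryParseToolCall` are placeholders, not the actual code in this PR):

```ts
// Sketch of the agentic loop: the local LLM picks a tool, the server runs it,
// and only the final plain-text answer is returned to Claude.
type ToolCall = { tool: string; arguments: Record<string, unknown> };

async function runAgentTask(
  task: string,
  callLocalLlm: (prompt: string) => Promise<string>, // assumed helper
  runTool: (call: ToolCall) => Promise<string>,      // assumed helper (e.g. ssh_exec)
  maxIterations = 10,
): Promise<{ answer: string; toolsExecuted: string[] }> {
  const toolsExecuted: string[] = [];
  let prompt = task;

  for (let i = 0; i < maxIterations; i++) {
    const reply = await callLocalLlm(prompt);
    const call = tryParseToolCall(reply);
    if (!call) {
      // Plain text (no JSON tool call) is treated as the final answer.
      return { answer: reply, toolsExecuted };
    }
    const output = await runTool(call); // raw output stays local, never sent to Claude
    toolsExecuted.push(call.tool);
    prompt = `Tool ${call.tool} returned:\n${output}\nContinue, or give the final answer.`;
  }
  return { answer: "Stopped at max_iterations without a final answer.", toolsExecuted };
}

// Naive parse: assumes a tool call arrives as a bare JSON object.
function tryParseToolCall(text: string): ToolCall | null {
  try {
    const parsed = JSON.parse(text.trim());
    return typeof parsed?.tool === "string" ? (parsed as ToolCall) : null;
  } catch {
    return null;
  }
}
```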

Test Plan

  • Health check tool working
  • SSH execution with GPG-encrypted credentials
  • Agent loop with auto_execute=true
  • Multi-iteration tool calls
  • Token measurements validated

🤖 Generated with Claude Code

Michael Lambert and others added 8 commits January 12, 2026 16:30
Features:
- New agent_chat tool with stateful conversation management
- Tool definition schema for describing available tools
- Few-shot prompt format for reliable JSON tool call output
- Multi-strategy JSON parsing (code blocks, inline, permissive)
- Conversation continuation with tool results
- Automatic conversation cleanup after 30 minutes
- list_conversations debug tool

Enables Claude to delegate tasks to local LLMs while maintaining
control of tool execution - a cost-effective hybrid approach.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
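
As an illustration of the pieces this commit describes (the shapes below are assumptions, not the actual schema), a tool definition and a conversation record with the 30-minute cleanup might look like:

```ts
// Illustrative shapes only; the real schema in this PR may differ.
interface ToolDefinition {
  name: string;        // e.g. "ssh_exec"
  description: string; // included in the few-shot prompt for the local LLM
  parameters: Record<string, { type: string; description: string }>;
}

interface Conversation {
  id: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  lastUsed: number;    // epoch ms, used for TTL cleanup
}

const conversations = new Map<string, Conversation>();
const TTL_MS = 30 * 60 * 1000; // 30-minute cleanup, per the commit

// Drop conversations that have been idle longer than the TTL.
function cleanupConversations(now = Date.now()): void {
  for (const [id, convo] of conversations) {
    if (now - convo.lastUsed > TTL_MS) conversations.delete(id);
  }
}
```
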
The parseToolCall method now properly handles:
- Nested JSON objects (e.g., {"tool": "x", "arguments": {...}})
- Braces inside string values (e.g., "command": "awk '{print}'")

Replaced regex-based extraction with balanced brace tracking that
respects string escaping, enabling reliable tool call detection
from LLM output.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
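
A sketch of what balanced-brace extraction with string and escape awareness can look like (illustrative only, not the PR's parseToolCall):

```ts
// Return the first balanced {...} block in the text, tracking string boundaries
// and backslash escapes so braces inside string values are ignored.
function extractFirstJsonObject(text: string): string | null {
  let depth = 0;
  let start = -1;
  let inString = false;
  let escaped = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') { inString = true; continue; }
    if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}") {
      if (depth > 0 && --depth === 0) return text.slice(start, i + 1);
    }
  }
  return null; // no balanced object found
}
```

The extracted slice can then be handed to `JSON.parse`, which is where handling input like `{"command": "awk '{print}'"}` pays off.
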
- Remove dangerous single-quote replacement in tryParseToolJson that
  broke JSON containing single quotes in string values (e.g., awk '$3')
- Add DEBUG_MCP env var to enable detailed logging of:
  - parseToolCall input/output and strategies
  - agent_chat conversation flow and LLM responses
- Try JSON parsing strategies in order: as-is, trailing comma fix

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
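
A hedged sketch of trying parse strategies in order (the strategy list and the DEBUG_MCP gate here are illustrative):

```ts
// Try progressively more permissive parses; log failed attempts when DEBUG_MCP is set.
function tryParseToolJson(raw: string): unknown | null {
  const strategies: [string, (s: string) => string][] = [
    ["as-is", (s) => s],
    // Naive trailing-comma fix; a real implementation should avoid touching strings.
    ["trailing-comma fix", (s) => s.replace(/,\s*([}\]])/g, "$1")],
  ];
  for (const [name, transform] of strategies) {
    try {
      return JSON.parse(transform(raw));
    } catch (err) {
      if (process.env.DEBUG_MCP) console.error(`parse strategy "${name}" failed:`, err);
    }
  }
  return null;
}
```
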
Major changes:
- agent_chat now auto-executes ssh_exec internally (no CC middleman)
- Agentic loop runs until final_answer or max_iterations
- Strict prompt guidelines for clean JSON/text output formatting
- Remove max_tokens limit (local tokens are free with 128K context)
- ssh_exec added as built-in tool automatically
- Reports tools_executed in response for transparency

This enables massive CC token savings - raw tool output (e.g., logs)
never touches Claude's context; only the final analysis is returned.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Documents real-world testing showing 70-90% Claude token savings:
- Log analysis: 15,000 tokens → 1,500 tokens (90% reduction)
- System health check: 15,000 tokens → 4,500 tokens (70% reduction)

Includes architecture diagrams, usage examples, and configuration
guide for the autonomous agent with internal tool execution.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add "How the Token Math Works" section explaining token breakdown
- Correct savings percentages: 40-80% (was 70-90%)
- Add Local Tokens column to comparison tables
- Update video script with consistent numbers and explanations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Show tokens shifting from Claude (paid) to Local LLM (free)
- Add security audit (93%) and Docker logs (95%) as top examples
- Update all claims to "up to 95%" based on actual testing
- Include 8 test cases sorted by Claude Direct tokens
- Add test-scripts/health_check.sh generated by agent

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Acknowledge CC Token Saver, Ollama Claude, Rubber Duck MCP as prior art
- Position as infrastructure-focused implementation, not novel invention
- Add comparison tables: when to use this vs other options
- Add PORTABILITY.md for others evaluating the tool
- Add posts/ with Reddit and Substack drafts
- Update video script with honest intro and metadata
- Add Nextcloud debugging use case to stats table

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>