MCP Resource Limits Configuration Proposal
Problem Statement
Docker MCP Gateway / cagent currently has no built-in mechanism to limit:
- Maximum concurrent MCP instances running in parallel
- Memory per instance
- Total memory consumption
- CPU allocation per tool
- Instance lifecycle (timeout, cleanup)
This leads to resource exhaustion on local development machines (especially WSL 2), causing:
- 90%+ CPU usage from uncontrolled docker-mcp.exe spawning
- Memory bloat
- System lockups
- Daemon crashes
Example of Current Issue
When adding multiple MCP tools (exa, fetch, filesystem, clickhouse, playwright, etc.), each tool call can spawn new instances without limits, resulting in 100+ orphaned processes consuming resources indefinitely.
Proposed Solution
Add a .mcp-limits.yaml configuration file that allows users to define resource constraints:
```yaml
# .mcp-limits.yaml
mcp:
  global:
    max_concurrent_instances: 10   # Max total instances across all tools
    max_total_memory: 2048         # MB - total memory cap
    max_total_cpu: 80              # Percentage (0-100)
    cleanup_orphans: true
    orphan_detection_interval: 30  # seconds
    instance_timeout: 600          # seconds (10 minutes)

  tools:
    exa:
      max_instances: 2
      max_memory: 256  # MB per instance
      max_cpu: 25      # Percentage per instance
      timeout: 120     # seconds
    fetch:
      max_instances: 3
      max_memory: 512
      max_cpu: 50
      timeout: 180
    playwright:
      max_instances: 1
      max_memory: 1024
      max_cpu: 50
      timeout: 300
    clickhouse:
      max_instances: 1
      max_memory: 512
      max_cpu: 40
      timeout: 600

  # Fallback for tools not explicitly configured
  default:
    max_instances: 2
    max_memory: 256
    max_cpu: 30
    timeout: 300
```
Implementation Details
1. Configuration Loading
```
# Locations checked in order:
~/.mcp-limits.yaml              # User home
.mcp-limits.yaml                # Project root
$CAGENT_CONFIG_DIR/limits.yaml  # Environment variable
/etc/cagent/mcp-limits.yaml     # System-wide (Linux/Mac)
```
2. Instance Manager Enhancement
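The instance manager below consumes these files via load_config. A minimal sketch of the layered lookup, assuming later files override earlier ones key-by-key; the helper names (candidate_paths, deep_merge) and that precedence order are assumptions of this sketch, not the final cagent API:

```python
import os
from pathlib import Path

def candidate_paths():
    """Limits files in the order they are checked (earliest first)."""
    yield Path.home() / ".mcp-limits.yaml"       # user home
    yield Path.cwd() / ".mcp-limits.yaml"        # project root
    config_dir = os.environ.get("CAGENT_CONFIG_DIR")
    if config_dir:
        yield Path(config_dir) / "limits.yaml"   # environment variable
    yield Path("/etc/cagent/mcp-limits.yaml")    # system-wide (Linux/Mac)

def deep_merge(base, override):
    """Merge two parsed config dicts; `override` wins, recursing into dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```

load_config would then parse each existing candidate file as YAML and fold the results together with deep_merge, so a project-root file only needs to state the keys it changes.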
```python
import time

class MCPInstanceManager:
    def __init__(self, config_path):
        self.config = load_config(config_path)
        self.active_instances = {}
        self.start_orphan_cleanup_task()

    def spawn_instance(self, tool_name):
        """Spawn with resource limits enforced"""
        # Check global limits ("global" is a Python keyword, so the loader
        # exposes the YAML `global:` section as `global_`)
        if len(self.active_instances) >= self.config.global_.max_concurrent_instances:
            raise MCPLimitExceeded("Max concurrent instances reached")

        tool_config = self.config.tools.get(tool_name, self.config.default)

        # Check tool-specific limits
        tool_instances = len([i for i in self.active_instances.values()
                              if i.tool == tool_name])
        if tool_instances >= tool_config.max_instances:
            raise MCPLimitExceeded(f"Max instances for {tool_name} reached")

        # Spawn with cgroup/ulimit constraints
        instance = self._spawn_docker_container(
            tool_name,
            memory_limit=f"{tool_config.max_memory}M",
            cpu_limit=tool_config.max_cpu,  # percentage, not a cpuset mask
        )
        self.active_instances[instance.id] = instance
        return instance

    def cleanup_orphans(self):
        """Periodically remove dead/timed-out instances"""
        now = time.time()
        to_remove = []
        for instance_id, instance in self.active_instances.items():
            elapsed = now - instance.created_at
            if elapsed > instance.config.timeout:
                instance.terminate()
                to_remove.append(instance_id)
        for instance_id in to_remove:
            del self.active_instances[instance_id]
```
3. Monitoring & Alerts
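On enforcement, the _spawn_docker_container call in the instance manager could translate the per-tool limits into docker run flags. --memory and --cpus are real Docker CLI flags; interpreting max_cpu as a percentage of a single CPU is an assumption of this sketch:

```python
def docker_resource_flags(max_memory_mb, max_cpu_percent):
    """Translate per-tool limits into `docker run` resource flags.

    --memory caps the container's RAM; --cpus caps CPU time, where 0.25
    means a quarter of one CPU. Reading max_cpu as "percent of one CPU"
    is this sketch's assumption, not settled behaviour.
    """
    return [
        f"--memory={max_memory_mb}m",
        f"--cpus={max_cpu_percent / 100:g}",
    ]
```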
```python
class MCPResourceMonitor:
    def check_health(self):
        """Emit warnings when approaching limits"""
        total_mem = sum(i.memory_usage for i in self.instances.values())
        total_cpu = sum(i.cpu_usage for i in self.instances.values())

        mem_limit = self.config.global_.max_total_memory
        if total_mem > mem_limit * 0.8:
            logger.warning(f"Memory usage at {total_mem}MB "
                           f"({total_mem / mem_limit:.0%} of limit)")
        if total_cpu > self.config.global_.max_total_cpu * 0.8:
            logger.warning(f"CPU usage at {total_cpu}%")
```
4. CLI Integration
```bash
# Show current limits
cagent mcp limits show

# Update limits
cagent mcp limits set --max-instances 5 --max-memory 2048

# Monitor usage in real-time
cagent mcp monitor

# Force cleanup
cagent mcp cleanup --force
```
Benefits
- Prevents resource exhaustion: No more 90%+ CPU spikes
- Production-ready: Scales safely in constrained environments
- User-friendly: Zero-config defaults, easy to customize
- Transparent: Monitor actual usage vs limits
- Safe: Graceful degradation instead of crashes
Testing Scenarios
```bash
# Test 1: Spawn 20 tools, should queue/fail gracefully
for i in {1..20}; do cagent call exa --query "test$i" & done

# Test 2: Monitor that cleanup removes orphans
cagent mcp monitor   # Should show instances terminating after timeout

# Test 3: Verify memory cap respected
cagent mcp limits set --max-memory 512
# Large operation should fail or queue
```
Backward Compatibility
- All limits default to unlimited if .mcp-limits.yaml is not found
- Existing configs work unchanged
- Environment variable overrides available for CI/CD
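A sketch of how such CI/CD overrides could be applied on top of the file config; the CAGENT_MCP_* variable names are hypothetical, not documented cagent settings:

```python
import os

# Hypothetical env var -> (config section, key) mapping for CI/CD pipelines.
ENV_OVERRIDES = {
    "CAGENT_MCP_MAX_INSTANCES": ("global", "max_concurrent_instances"),
    "CAGENT_MCP_MAX_MEMORY": ("global", "max_total_memory"),
    "CAGENT_MCP_MAX_CPU": ("global", "max_total_cpu"),
}

def apply_env_overrides(limits, environ=os.environ):
    """Return a copy of `limits` with any CAGENT_MCP_* variables applied."""
    patched = {section: dict(values) for section, values in limits.items()}
    for env_name, (section, key) in ENV_OVERRIDES.items():
        raw = environ.get(env_name)
        if raw is not None:
            patched.setdefault(section, {})[key] = int(raw)
    return patched
```

This keeps the file config untouched, so a pipeline can tighten limits without rewriting .mcp-limits.yaml.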
Related Issues
- Similar pattern used by: Kubernetes (resource requests/limits), Docker (--memory, --cpus), Systemd (MemoryMax, CPUQuota)
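For comparison, the Kubernetes analogue of the per-tool caps proposed here, as a pod spec fragment (values illustrative):

```yaml
# Kubernetes container spec fragment: same pattern as per-tool limits above
resources:
  requests:
    memory: "128Mi"
    cpu: "100m"    # 10% of one CPU
  limits:
    memory: "256Mi"
    cpu: "250m"    # 25% of one CPU
```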
Files to Modify
- cagent/config/limits.py - New config parser
- cagent/mcp/instance_manager.py - Resource enforcement
- cagent/mcp/monitor.py - Health checks
- cagent/cli/commands/mcp.py - New CLI commands
- docs/mcp-configuration.md - Documentation
- examples/.mcp-limits.yaml - Example config
Implementation Priority
- Phase 1: Global instance/memory limits (MVP)
- Phase 2: Per-tool limits + cleanup
- Phase 3: CLI monitoring + auto-tuning
- Phase 4: Integration with Docker Desktop API for WSL 2
Author: aiob3 & Gordon (Security/Performance concern)
Date: 03/12/2026
Severity: Medium (Impacts usability, not security)
Type: Enhancement/Feature Request