A Kubernetes alert remediation system with a web interface and an LLM-powered remediation agent. The platform analyzes monitoring alerts, retrieves live cluster context, and generates executable remediation scripts. The agent gathers cluster information through Kubernetes tools exposed by an MCP server and keeps long-term memory in PostgreSQL, so that historical context can inform diagnosis and decision-making. An optional monitoring and telemetry stack provides deeper visibility into cluster health and performance.
- helm-chart/: Contains the core Helm chart for deploying the agents and their mandatory dependencies.
- README.md: Main setup guide for getting the agents up and running (this file).
- OTEL-setup.md: Optional guide for setting up monitoring (Prometheus, Grafana, Loki, Tempo).
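To get the agents running, complete three mandatory steps: install the MCP server, install the PostgreSQL operator, and install the agents with this Helm chart.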
The MCP (Model Context Protocol) Server provides Kubernetes tools for the agents.
Clone the repository, patch the chart's values schema so that it accepts an observability section, and install the chart with Helm:

git clone https://github.com/Flux159/mcp-server-kubernetes.git

python3 -c "
import json
with open('mcp-server-kubernetes/helm-chart/values.schema.json') as f:
    schema = json.load(f)
schema['properties']['observability'] = {
    'type': 'object',
    'additionalProperties': True
}
with open('mcp-server-kubernetes/helm-chart/values.schema.json', 'w') as f:
    json.dump(schema, f, indent=2)
print('Done')
"

helm install mcp-server ./mcp-server-kubernetes/helm-chart \
  --set kubeconfig.provider=serviceaccount \
  --set transport.mode=http \
  --set transport.service.type=ClusterIP \
  --set security.allowOnlyNonDestructive=false \
  --create-namespace \
  --namespace mcp-system

The agents use PostgreSQL for long-term memory. We use the CrunchyData Operator to manage it.
curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.40.0/install.sh | bash -s v0.40.0
kubectl create -f https://operatorhub.io/install/postgresql.yaml
Install the agents using this Helm chart. This will also automatically create the PostgreSQL cluster.
helm upgrade --install 01cloud-agent ./helm-chart -n 01cloud --create-namespace

Modify values.yaml to configure your agents. You must set MODEL_PROVIDER and MODEL_NAME, and provide the corresponding API key in the secret section.
agents:
  - name: l0
    enabled: true
    image: myregistry/l0-agent:latest
    env:
      MODEL_PROVIDER: deepseek # options: gemini, openai, openrouter, anthropic, deepseek
      MODEL_NAME: deepseek-chat # examples: gemini-2.0-flash, gpt-4o, claude-3-5-sonnet
      MCP_SERVER_URL: http://mcp-server-mcp-server-kubernetes.mcp-system.svc.cluster.local:3001/mcp
      ENABLE_K8S_TOOLS: "true"
      STM_ENABLE_POSTGRES: "true"
    usePostgresql: true
    secret:
      DEEPSEEK_API_KEY: "your-api-key"
      # GOOGLE_API_KEY: "your-api-key"
      # OPENAI_API_KEY: "your-api-key"
      # OPENROUTER_API_KEY: "your-api-key"
      # ANTHROPIC_API_KEY: "your-api-key"

Apply changes:
helm upgrade --install 01cloud-agent ./helm-chart -n 01cloud

Once the agents are deployed, you can access the agent UI by port-forwarding the agent-l0 service:
kubectl port-forward svc/agent-l0 -n 01cloud 3000:3000

Now you can open your browser and go to http://localhost:3000.
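If the UI loads but the agent fails to respond, a mismatch between MODEL_PROVIDER and the API key in the secret section is one possible cause. The sketch below is a hypothetical pre-flight check for one entry under agents in values.yaml. The function name check_agent and the provider-to-key mapping are assumptions: the mapping is inferred from the option list and the commented-out keys in the example configuration (in particular, pairing gemini with GOOGLE_API_KEY is a guess), and it is not taken from the chart itself.

```python
# Hypothetical pre-flight check for one agent entry from values.yaml.
# ASSUMPTION: the provider-to-key mapping below is inferred from the example
# configuration in this README, not read from the Helm chart.
PROVIDER_KEYS = {
    "deepseek": "DEEPSEEK_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}
PLACEHOLDER = "your-api-key"  # placeholder value used in the example config


def check_agent(agent):
    """Return a list of problems found in one already-parsed agent entry."""
    env = agent.get("env") or {}
    secret = agent.get("secret") or {}
    problems = []

    if not env.get("MODEL_NAME"):
        problems.append("MODEL_NAME is not set")

    provider = env.get("MODEL_PROVIDER")
    key_name = PROVIDER_KEYS.get(provider)
    if key_name is None:
        problems.append(f"unknown or missing MODEL_PROVIDER: {provider!r}")
    elif secret.get(key_name) in (None, "", PLACEHOLDER):
        problems.append(f"{key_name} is missing or still a placeholder")

    return problems


if __name__ == "__main__":
    # Mirrors the example configuration, which still holds the placeholder key.
    example = {
        "name": "l0",
        "env": {"MODEL_PROVIDER": "deepseek", "MODEL_NAME": "deepseek-chat"},
        "secret": {"DEEPSEEK_API_KEY": "your-api-key"},
    }
    for problem in check_agent(example):
        print(problem)  # prints: DEEPSEEK_API_KEY is missing or still a placeholder
```

The function takes a plain dictionary so that it has no dependencies. To run it against the real file, the agents list could be loaded with a YAML parser such as PyYAML (yaml.safe_load), assuming that package is installed.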
For advanced monitoring (Grafana, Loki, Tempo, OpenTelemetry), see OTEL-setup.md.
The following services are available via port-forwarding:
# PostgreSQL
kubectl port-forward svc/agents-primary -n 01cloud 5432:5432
# MCP Server
kubectl port-forward svc/mcp-server-mcp-server-kubernetes -n mcp-system 3001:3001
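
To confirm that the port-forwards above are actually listening, a short script can attempt a TCP connection to each local port. This is a minimal sketch: the port numbers come from the commands in this README, while the names FORWARDED_PORTS and port_open are illustrative and not part of the project.

```python
import socket

# Local ports taken from the port-forward commands in this README.
FORWARDED_PORTS = {
    "agent UI": 3000,
    "PostgreSQL": 5432,
    "MCP server": 3001,
}


def port_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in FORWARDED_PORTS.items():
        state = "reachable" if port_open(port) else "not reachable"
        print(f"{name} (localhost:{port}): {state}")
```

A successful connection only shows that kubectl is forwarding the port. It does not verify that the service behind it is healthy.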