✨ aiNOC

Core Version

  • Vendors: Cisco IOS-XE
  • Management: RESTCONF, CLI
  • Integrations: Jira, NetBox, Discord, HashiCorp Vault

On-Request

  • Vendors: Arista EOS, Juniper JunOS, Aruba AOS-CX, SONiC FRR, MikroTik RouterOS, VyOS
  • Management: NETCONF, REST APIs, gNMI, eAPI
  • Integrations: Slack, ServiceNow

Stats

  • Performance: MTTD, MTTR, Cost/Session

🔭 Overview

AI-based network troubleshooting framework for multi-vendor, multi-protocol, multi-area/multi-AS enterprise networks.

▫️ Key characteristics:

  • Multi-vendor support
  • Multi-protocol, L2/L3
  • Multi-area/multi-AS
  • CLI/RESTCONF (Core)
  • NETCONF/REST/gNMI/eAPI
  • 15 MCP tools, 4 skills
  • 12 operational guardrails
  • HITL for any config changes
  • Dashboard for agent monitoring
  • Discord integration
  • Jira integration
  • HashiCorp Vault
  • NetBox

▫️ Core vs. On-Request features:

  • Core:
    • Easy to integrate in Cisco IOS/IOS-XE environments
    • CLI, RESTCONF management, OSPF/BGP troubleshooting
    • Jira, Discord, NetBox, HashiCorp Vault integration
  • On-Request:
    • Custom vendor modules (Arista, Juniper, MikroTik, etc.)
    • Custom management (REST APIs, NETCONF, gNMI, eAPI)
    • Built and adapted per client's network environment

▫️ aiNOC operating mode in v5.0+:

▫️ Important project files:

▫️ Agent guardrails list:

▫️ Supported models:

  • Haiku 4.5
  • Sonnet 4.6
  • Opus 4.6 (default, best reasoning)

⚠️ NOTE: Because troubleshooting sessions are intermittent, it's worth using an advanced model by default. Costs should stay sustainable even when the agent addresses and fixes several issues per day.

▫️ Set your default model:
Create settings.json under .claude/:

{
  "model":"opus",
  "effortLevel":"medium"
}
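For scripted setups, the same file could be generated programmatically. The following is a minimal sketch, assuming it runs against the repository root; the helper name `write_claude_settings` is hypothetical, and only the `.claude/settings.json` path and the two keys come from the snippet above. Existing keys in the file are preserved rather than overwritten.

```python
import json
from pathlib import Path


def write_claude_settings(root: Path, model: str = "opus", effort: str = "medium") -> Path:
    """Create or update <root>/.claude/settings.json (hypothetical helper)."""
    settings_dir = root / ".claude"
    settings_dir.mkdir(parents=True, exist_ok=True)
    settings_path = settings_dir / "settings.json"

    # Keep any keys that are already present in the file.
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))

    settings.update({"model": model, "effortLevel": effort})
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings_path
```

Calling `write_claude_settings(Path.cwd())` from the repository root would produce the file shown above.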

▫️ High-level architecture:

[Image: aiNOC high-level architecture diagram]

Here's a Quick Demo

  • See a DEMO of aiNOC v5.5

⭐ What's New in v5.5

⚒️ Core Tech Stack

Tool
Claude Code ✓
MCP (FastMCP) ✓
Python ✓
Scrapli ✓
Genie ✓
RESTCONF ✓
Jira API ✓
Discord API ✓
HashiCorp Vault ✓
NetBox ✓
Vector ✓
Ubuntu ✓

📋 Supported Vendors

| Vendor | Platform | cli_style | Status |
|---|---|---|---|
| Cisco | IOS-XE | ios | Core |
| Arista | EOS | eos | On-Request |
| Juniper | JunOS | junos | On-Request |
| MikroTik | RouterOS | routeros | On-Request |
| Aruba | AOS-CX | aos | On-Request |
| SONiC | FRR | frr | On-Request |
| VyOS | VyOS | vyos | On-Request |

🚛 Supported Transports

| Management | Devices | Tier | Status |
|---|---|---|---|
| RESTCONF | Cisco IOS-XE | Primary | Core |
| CLI | Cisco IOS-XE | Fallback | Core |
| NETCONF | custom | - | On-Request |
| REST APIs | custom | - | On-Request |
| gNMI | custom | - | On-Request |
| eAPI | Arista | - | On-Request |
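
The Primary and Fallback tiers listed for Cisco IOS-XE suggest a try-in-order pattern: query over RESTCONF first, then fall back to CLI if that fails. The sketch below illustrates the idea only; `TransportError`, `query_with_fallback`, and the two stub transports are hypothetical names and do not reflect aiNOC's actual MCP tool code.

```python
from typing import Callable, Sequence


class TransportError(Exception):
    """Raised when a management transport cannot complete a query."""


def query_with_fallback(
    transports: Sequence[tuple[str, Callable[[str], str]]],
    request: str,
) -> tuple[str, str]:
    """Try each transport in tier order and return (transport_name, response)."""
    errors = []
    for name, send in transports:
        try:
            return name, send(request)
        except TransportError as exc:
            # Remember why this tier failed, then move on to the next one.
            errors.append(f"{name}: {exc}")
    raise TransportError("all transports failed: " + "; ".join(errors))


def restconf_stub(request: str) -> str:
    """Stand-in for a RESTCONF call that is currently failing."""
    raise TransportError("HTTP 503")


def cli_stub(request: str) -> str:
    """Stand-in for an SSH/CLI call that succeeds."""
    return f"cli output for {request!r}"


tiers = [("restconf", restconf_stub), ("cli", cli_stub)]
used, output = query_with_fallback(tiers, "ospf neighbors")
```

Here `used` ends up as `"cli"` because the RESTCONF stub raises. In a real deployment the stubs would be replaced by actual RESTCONF and Scrapli calls.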

🎓 Troubleshooting Scope

aiNOC Core

  • OSPF
  • BGP
  • Redistribution
  • Policy-based routing
  • Route-maps, prefix lists
  • NAT/PAT, access lists

aiNOC On-Request Extensions

  • EIGRP
  • HSRP
  • VRRP
  • etc.

🛠️ Installation & Usage

▫️ Step 1: Clone the repository and install the dependencies in a Python virtual environment:

git clone https://github.com/pdudotdev/aiNOC/
cd aiNOC/
python3 -m venv mcp
source mcp/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

▫️ Step 2: The included CLAUDE.md and skills/* are templates. Customize them with your own troubleshooting methodology, tool descriptions, and operational guidelines.

⚠️ NOTE: There is no one-size-fits-all CLAUDE.md or SKILL.md that works in any network environment. These should be customized for each specific topology, vendor combination, and architecture.

▫️ Step 3:

  • Configure IP SLA (or Connectivity Monitor, Netwatch, etc.) paths in your network
  • Make sure they are tracked and logged remotely to Vector (Syslog)
  • Configure the transforms inside /etc/vector/vector.yaml (see the linked example)
  • aiNOC monitors Vector's /var/log/network.json file for specific logs and parses them per-vendor
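
To illustrate the last bullet, here is a minimal sketch of filtering SLA failures out of Vector's output. It makes several assumptions that should be checked against your own Vector transforms: that the file is newline-delimited JSON, that each event carries `host` and `message` fields, and that Cisco track messages contain text like `ip sla 1 reachability Up -> Down`. The function name `sla_failures` is hypothetical and is not aiNOC's actual parser.

```python
import json
import re
from typing import Iterable, Iterator

# Assumed shape of a Cisco track/IP SLA syslog message (verify on your devices).
SLA_DOWN = re.compile(
    r"ip sla (?P<sla_id>\d+) reachability Up\s*->\s*Down", re.IGNORECASE
)


def sla_failures(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {'host', 'sla_id'} for each line that reports an SLA going down."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(event, dict):
            continue
        match = SLA_DOWN.search(str(event.get("message", "")))
        if match:
            yield {"host": event.get("host"), "sla_id": int(match.group("sla_id"))}


sample = [
    '{"host": "R1", "message": "%TRACK-6-STATE: 1 ip sla 1 reachability Up -> Down"}',
    "not json",
    '{"host": "R2", "message": "%TRACK-6-STATE: 2 ip sla 7 reachability Down -> Up"}',
]
failures = list(sla_failures(sample))
```

With the sample lines, only the R1 event is reported. Other vendors would need their own patterns, which is consistent with the per-vendor parsing mentioned above.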

▫️ Step 4: Run the aiNOC watcher and dashboard services. Claude is invoked non-interactively via tmux + print mode (-p) with a default prompt template. The human operator monitors agent operations via the web dashboard on port 5555 and interacts via Discord (fix approval ✅ or rejection ❌ embeds).

sudo apt install tmux
sudo cp oncall/oncall-watcher.service /etc/systemd/system/
sudo cp dashboard/oncall-dashboard.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now oncall-watcher.service
sudo systemctl enable --now oncall-dashboard.service

Manage with: systemctl start | stop | restart | status oncall-watcher
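
As a rough illustration of the "tmux + print mode" invocation described in this step, the sketch below builds a command line that starts a detached tmux session running `claude -p` with a prompt. The session name and prompt text are hypothetical; the watcher's real prompt template and any additional flags are not shown here.

```python
import shlex


def build_agent_command(session: str, prompt: str) -> list[str]:
    """Return argv for a detached tmux session that runs Claude in print mode."""
    # tmux passes the final argument to a shell, so the prompt must be quoted.
    claude_cmd = f"claude -p {shlex.quote(prompt)}"
    return ["tmux", "new-session", "-d", "-s", session, claude_cmd]


cmd = build_agent_command("oncall-1", "Diagnose SLA 1 failure on R1")
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Quoting the prompt with `shlex.quote` keeps log-derived text from being interpreted by the shell.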

▫️ Step 5: Check that Vector and both aiNOC services are running:

sudo systemctl status vector
sudo systemctl status oncall-watcher.service
sudo systemctl status oncall-dashboard.service
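
The same check can be scripted, for example as a pre-flight step before a lab run. This is a small sketch using `systemctl is-active --quiet`, which exits with status 0 when a unit is active; the function names are hypothetical, and the probe is injectable so the logic can be exercised without systemd.

```python
import subprocess
from typing import Callable, Iterable

UNITS = ("vector", "oncall-watcher.service", "oncall-dashboard.service")


def is_active(unit: str) -> bool:
    """True if `systemctl is-active` reports the unit as active."""
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit], check=False
    )
    return result.returncode == 0


def inactive_units(
    units: Iterable[str] = UNITS,
    probe: Callable[[str], bool] = is_active,
) -> list[str]:
    """Return the units that the probe reports as not running."""
    return [unit for unit in units if not probe(unit)]
```

An empty list from `inactive_units()` means all three units are up.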

🔄 Test Network Topology

▫️ Network diagram:

[Image: test network topology diagram]

▫️ Router configurations:

  • My test lab's config files are in the lab_configs directory
  • They serve as the network's fallback configs when running containerlab redeploy -t AINOC-TOPOLOGY.yml
  • Default credentials: see the .env.example file

📞 aiNOC Service Mode

aiNOC runs as an on-call watcher that monitors Vector's /var/log/network.json for SLA path failures and automatically invokes a Claude agent to diagnose the issue and propose a fix.

How It Works

  1. Network devices track connectivity paths (Cisco IP SLA; extensible to Arista Connectivity Monitor, Juniper RPM probes, MikroTik Netwatch, etc.)
  2. Failures are logged remotely via Syslog → Vector records, parses, and writes them to /var/log/network.json
  3. The oncall-watcher service detects the SLA failure, opens a Jira ticket, and invokes the aiNOC agent session
  4. The web dashboard is activated and displays the agent's work in real time. The user is notified and kept in the loop via Discord.
  5. The agent follows a structured troubleshooting methodology: CLAUDE.md + /skills + MCP tools → identifies root cause(s) → proposes a fix
  6. The agent applies and verifies the fix only after the human operator approves it via Discord; otherwise the issue is documented and the case stays open
  7. Case resolution is logged to Jira and Discord, and the watcher resumes monitoring
  8. The aiNOC agent documents the lesson learned from the case in lessons.md
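
The human-in-the-loop gate in this workflow can be sketched as follows. It is a simplified model, not the watcher's actual code: `Case`, `handle_case`, and the three callables are hypothetical stand-ins for the agent's diagnosis, the Discord approval prompt, and the apply-and-verify step. Jira and Discord calls are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Case:
    device: str
    description: str
    log: list[str] = field(default_factory=list)
    resolved: bool = False


def handle_case(
    case: Case,
    diagnose: Callable[[Case], str],
    approve: Callable[[str], bool],
    apply_fix: Callable[[str], bool],
) -> Case:
    """Diagnose, ask a human for approval, and only then apply the fix."""
    proposal = diagnose(case)
    case.log.append(f"proposed: {proposal}")
    if approve(proposal):
        # apply_fix returns True only if post-change verification passes.
        case.resolved = apply_fix(proposal)
        case.log.append(
            "fix applied and verified" if case.resolved
            else "fix applied but verification failed"
        )
    else:
        # No configuration change is made without approval.
        case.log.append("fix rejected; case stays open")
    return case


applied = []
rejected = handle_case(
    Case("R1", "IP SLA 1 down"),
    diagnose=lambda c: "re-enable OSPF on Gi2",
    approve=lambda proposal: False,
    apply_fix=lambda proposal: applied.append(proposal) or True,
)
```

In the rejected run above, `apply_fix` is never called, so `applied` stays empty and the case remains unresolved. That is the property the approval step is meant to guarantee.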

See Installation & Usage for instructions.

⬆️ Planned Upgrades

  • New protocols supported
  • Performance-based SLAs

♻️ Repository Lifecycle

New features are being added periodically (protocols, integrations, optimizations).

Stay up-to-date:

  • Watch and Star this repository

Current version:

  • aiNOC v5.5

📄 Disclaimer

You are responsible for defining your own troubleshooting methodologies and context files, as well as building your own test environment and meeting the necessary conditions (e.g., RAM/CPU, Claude subscription/API key, etc.).

📜 License

Licensed under the Business Source License 1.1. The source code is available for research, educational, and non-commercial use. Commercial use, SaaS deployment, enterprise automation, or resale of this software is prohibited without explicit written permission from the author.

📧 Collaborations

Interested in collaborating?

About

aiNOC: Network troubleshooting framework for multi-vendor, multi-protocol, multi-area enterprise networks based on Claude Code, FastMCP, Python, Scrapli, RESTCONF, Containerlab, Jira, etc.
