Skip to content

Enterprise-grade NVIDIA GPU monitoring suite with real-time dashboard, time-series analytics, and intelligent alerts. Features React frontend, FastAPI backend, and Supabase for data persistence. Tracks utilization, memory, temperature, power with microsecond precision. Perfect for ML/AI, render farms.

License

Notifications You must be signed in to change notification settings

jackccrawford/gpu-sentinel-pro

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

37 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

GPU Sentinel Pro

"Information should not be displayed all at once; let people gradually become familiar with it." - Edward Tufte

Transform GPU monitoring from complex metrics into intuitive visual patterns. Enterprise-grade NVIDIA GPU monitoring with real-time analytics, intelligent alerts, and historical analysis.

CodeQL X Follow MIT License Built with Codeium Python FastAPI React TypeScript Supabase

Dark Mode Dashboard Real-time GPU metrics visualized for instant comprehension

Quick Start

Prerequisites

  • NVIDIA GPU with compute capability 3.0 or higher
  • NVIDIA Driver 450.80.02 or higher
  • Python 3.8+ and Node.js 16.0+
  • 4GB RAM (8GB recommended)
  • 1GB free disk space

Installation

  1. Clone the repository:
git clone git@github.com:jackccrawford/gpu-sentinel-pro.git
cd gpu-sentinel-pro
  1. Set up the backend:
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate
pip install -r requirements.txt
  1. Set up the frontend:
cd frontend
npm install
  1. Start the services:
# Terminal 1 - Backend
cd backend
python src/service/app.py

# Terminal 2 - Frontend
cd frontend
npm run dev
  1. Access the dashboard at http://localhost:5173

Pro Features

🎯 Enterprise-Grade Monitoring

  • Real-time Visual Dashboard

    • Modern React components with Material UI
    • Responsive design for desktop and mobile
    • Dark/light mode with automatic system preference detection
    • Multi-GPU support with individual monitoring panels
  • Advanced Metrics

    • Temperature and utilization with color-coded ranges
    • Memory usage and bandwidth monitoring
    • Power consumption and efficiency tracking
    • Process-level GPU utilization
    • Custom metric aggregation

πŸ”” Intelligent Alert System

  • Configurable Thresholds
    {
      "temperature": {
        "warning": 75,
        "critical": 85
      },
      "memory": {
        "warning": 85,
        "critical": 95
      }
    }
  • Alert Types
    • Temperature spikes
    • Memory leaks
    • Process crashes
    • Power anomalies
    • Custom conditions

πŸ“Š Analytics & Reporting

  • Historical Data

    • Time-series metrics storage
    • Customizable retention policies
    • Data export in multiple formats
    • Trend analysis and forecasting
  • Performance Insights

    • Workload pattern recognition
    • Resource utilization heatmaps
    • Efficiency recommendations
    • Cost analysis tools

πŸ›  Enterprise Integration

  • API Access

    • RESTful API with OpenAPI documentation
    • Secure authentication
    • Rate limiting and quotas
    • Webhook support
  • Security

    • Role-based access control
    • Audit logging
    • SSL/TLS encryption
    • Regular security updates

Configuration

Backend Settings

# config.py
SETTINGS = {
    'update_interval': 1000,  # ms
    'retention_period': '30d',
    'log_level': 'INFO',
    'enable_analytics': True,
    'alert_cooldown': 300,  # seconds
}

Frontend Configuration

// config.ts
export const CONFIG = {
  API_URL: 'http://localhost:8000',
  REFRESH_RATE: 1000,
  THEME_MODE: 'system',  // 'light' | 'dark' | 'system'
  CHART_HISTORY: 300,    // data points
};

System Architecture

graph TD
    A[Frontend React App] -->|HTTP/WebSocket| B[FastAPI Backend]
    B -->|NVML| C[GPU Hardware]
    B -->|Time Series| D[Supabase]
    B -->|Alerts| E[Notification Service]
Loading

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'feat: add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a pull request

Support

License

This project is licensed under the MIT License - see the LICENSE file for details.


About

Enterprise-grade NVIDIA GPU monitoring suite with real-time dashboard, time-series analytics, and intelligent alerts. Features React frontend, FastAPI backend, and Supabase for data persistence. Tracks utilization, memory, temperature, power with microsecond precision. Perfect for ML/AI, render farms.

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •