[FEATURE]: Implement Retry Queue for Failed Panel Heartbeats #1

@msikorski26

Description

Summary
Add a local queue to the worker so it can retry sending heartbeat/status updates to the panel when a send fails (e.g., the panel is temporarily unavailable). This keeps updates from being lost during short outages.

Motivation
Currently, if the panel is down or unreachable, heartbeats are silently dropped. A retry queue would make the worker more resilient:

  • Buffers updates locally (e.g., in-memory or SQLite).
  • Retries on next successful connection.
  • Prevents data gaps in monitoring history.

Proposed Implementation

  1. Queue Storage: Use a lightweight queue (e.g., in-memory array or persistent SQLite) to store failed payloads with timestamps.
  2. Retry Logic:
    • On send failure: Enqueue the payload with a retry count, capped by retryQueue.maxRetries (default 10).
    • On successful connection: Flush queue oldest-first.
    • Exponential backoff for retries (e.g., 1s → 5s → 30s).
  3. Config Options:
    Option                  Type     Default   Description
    retryQueue.enabled      boolean  true      Enable/disable the queue
    retryQueue.maxSize      number   1000      Max queued items
    retryQueue.persistence  string   'memory'  Storage backend: 'memory' | 'sqlite'
    retryQueue.maxRetries   number   10        Per-item retry limit
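
To make the proposal more concrete, here is a minimal TypeScript sketch of the config shape and an in-memory queue. Only the option names and defaults come from the list above. Everything else (RetryQueueConfig, InMemoryRetryQueue, peek/ack/nack, and the choice to evict the oldest item when the queue is full) is a hypothetical illustration, not existing worker code. The queue does no network I/O itself, so the caller decides when and how to send.

```typescript
// Hypothetical sketch: option names/defaults mirror the proposal, the rest is assumed.
type Persistence = "memory" | "sqlite";

interface RetryQueueConfig {
  enabled: boolean;
  maxSize: number;
  persistence: Persistence;
  maxRetries: number;
}

const DEFAULT_CONFIG: RetryQueueConfig = {
  enabled: true,
  maxSize: 1000,
  persistence: "memory",
  maxRetries: 10,
};

interface QueuedItem<T> {
  timestamp: number; // ms since epoch when the send first failed
  payload: T;
  retry: number; // number of the next retry attempt, starting at 1
}

class InMemoryRetryQueue<T> {
  private items: QueuedItem<T>[] = [];
  private readonly config: RetryQueueConfig;

  constructor(config: RetryQueueConfig = DEFAULT_CONFIG) {
    this.config = config;
  }

  get size(): number {
    return this.items.length;
  }

  // Call after a failed send. Assumption: when full, the oldest item is evicted.
  enqueue(payload: T, timestamp: number = Date.now()): void {
    if (!this.config.enabled) return;
    if (this.items.length >= this.config.maxSize) this.items.shift();
    this.items.push({ timestamp, payload, retry: 1 });
  }

  // Oldest queued item, or undefined when the queue is empty.
  peek(): QueuedItem<T> | undefined {
    return this.items[0];
  }

  // Remove the oldest item after it was delivered successfully.
  ack(): void {
    this.items.shift();
  }

  // Record a failed retry of the oldest item; returns true if it was dropped
  // because it used up all maxRetries attempts.
  nack(): boolean {
    const head = this.items[0];
    if (!head) return false;
    head.retry += 1;
    if (head.retry > this.config.maxRetries) {
      this.items.shift();
      return true;
    }
    return false;
  }
}
```

A SQLite-backed variant could expose the same four methods, which would let retryQueue.persistence select the implementation without changing the caller.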

Example Flow

Worker detects downtime → Enqueue: {timestamp: now, status: {...}, retry: 1}
Panel back online → Flush queue oldest-first → Send in batches → Remove items on success
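
The flow above could look roughly like the following TypeScript sketch. It assumes the queue is an array kept in oldest-first order and that the panel accepts batches. The names backoffDelayMs, flushQueue, SendBatch, and the batch size of 50 are hypothetical. The delay schedule simply reuses the 1s → 5s → 30s example from the proposal and holds at the last value for later retries, so it is a fixed schedule rather than a strict exponential formula.

```typescript
// Hypothetical delay schedule taken from the proposal's example: 1s → 5s → 30s.
const BACKOFF_SCHEDULE_MS: number[] = [1000, 5000, 30000];

// Delay before retry number `retry` (1-based); later retries reuse the last delay.
function backoffDelayMs(retry: number): number {
  const clamped = Math.min(Math.max(retry, 1), BACKOFF_SCHEDULE_MS.length);
  return BACKOFF_SCHEDULE_MS[clamped - 1];
}

interface Queued {
  timestamp: number;
  status: unknown;
  retry: number;
}

// Assumed transport: resolves to true when the panel accepted the whole batch.
type SendBatch = (batch: Queued[]) => Promise<boolean>;

// Sends queued items oldest-first in batches and removes each batch only after
// it was accepted. Stops at the first failure and bumps the retry count of the
// failed batch. Returns the number of items delivered.
async function flushQueue(
  queue: Queued[],
  sendBatch: SendBatch,
  batchSize: number = 50,
): Promise<number> {
  let sent = 0;
  while (queue.length > 0) {
    const batch = queue.slice(0, batchSize);
    const ok = await sendBatch(batch);
    if (!ok) {
      for (const item of batch) item.retry += 1;
      break;
    }
    queue.splice(0, batch.length);
    sent += batch.length;
  }
  return sent;
}
```

After a failed flush, the caller could wait backoffDelayMs(queue[0].retry) milliseconds before trying again, so the wait grows while the panel stays down.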
