A distributed job queue system built with Node.js and Redis, featuring distributed workers, retries, a dead letter queue (DLQ), fault-tolerant recovery, and a live monitoring dashboard.
This project simulates how production-grade background job processing systems work internally.
- Distributed worker architecture
- Concurrent job processing
- Reliable Redis-backed queue
- Visibility timeout handling
- Automatic retries
- Dead Letter Queue (DLQ)
- Fault-tolerant watchdog recovery
- Live monitoring dashboard
- Metrics API
- Horizontal worker scaling
- Dockerized Redis setup
Live monitoring dashboard showing queue state, processing jobs, completed jobs, failures, and DLQ metrics.
Multiple workers consuming jobs concurrently from the Redis queue.
Workers automatically retry failed jobs until the retry limit is exceeded.
Jobs exceeding maximum retry attempts are moved into the Dead Letter Queue for inspection and recovery.
Failed jobs persisted inside Redis after exceeding retry limits.
Metrics endpoint exposing internal queue state and processing statistics.
Client → Producer API → Redis Queue → Distributed Workers → ACK / Retry System → Dead Letter Queue (DLQ)
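
The last two stages of this flow (ACK / Retry, then DLQ) come down to one decision per attempt. The sketch below shows one way that decision could look. The names `MAX_RETRIES` and `settleJob`, the job shape, and the limit of 3 are assumptions for illustration, not taken from this repository.

```javascript
// Hypothetical retry limit; the project's actual value is not stated.
const MAX_RETRIES = 3;

// Decide what happens to a job after one processing attempt.
// `job.attempts` counts failed attempts so far (assumed field name).
function settleJob(job, succeeded, maxRetries = MAX_RETRIES) {
  if (succeeded) {
    // Success: acknowledge and drop the job from the processing queue.
    return { action: "ack", job };
  }
  const updated = { ...job, attempts: job.attempts + 1 };
  // Failure: retry until the limit is exceeded, then send to the DLQ.
  return updated.attempts > maxRetries
    ? { action: "dlq", job: updated }
    : { action: "retry", job: updated };
}
```

With a limit of 3, a job gets its initial attempt plus three retries; the fourth failure moves it to the DLQ.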
- Node.js
- Express.js
- Redis
- Docker
- HTML/CSS/JavaScript
- Producer API receives jobs through HTTP requests.
- Jobs are pushed into a Redis queue.
- Multiple distributed workers consume jobs concurrently.
- Workers move jobs into a processing queue using atomic Redis operations.
- Successfully completed jobs are ACKed and removed.
- Failed jobs are retried automatically.
- Jobs exceeding retry limits are moved to the Dead Letter Queue (DLQ).
- A watchdog continuously checks the processing queue for stale jobs and recovers them so they can be processed again.
- Metrics are exposed through a monitoring dashboard and API.
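
The claim, ACK, and watchdog steps above can be sketched with an in-memory stand-in for the Redis lists. This is a simplified model, not the project's code: the class and method names are hypothetical, and in Redis the claim step is commonly done with an atomic list move such as `LMOVE` (which command this project uses is not stated).

```javascript
// In-memory model of a reliable queue with a visibility timeout.
// All names here are illustrative assumptions.
class ReliableQueue {
  constructor(visibilityTimeoutMs) {
    this.visibilityTimeoutMs = visibilityTimeoutMs;
    this.queued = [];            // jobs waiting for a worker
    this.processing = new Map(); // job id -> { job, claimedAt }
  }

  enqueue(job) {
    this.queued.push(job);
  }

  // Claim the oldest job: it leaves `queued` and enters `processing`
  // in a single step, so it is never lost between the two.
  claim(now = Date.now()) {
    const job = this.queued.shift();
    if (job === undefined) return null;
    this.processing.set(job.id, { job, claimedAt: now });
    return job;
  }

  // ACK: the worker finished successfully, so the job is removed.
  ack(jobId) {
    return this.processing.delete(jobId);
  }

  // Watchdog pass: requeue every job whose claim is older than the timeout.
  recoverStale(now = Date.now()) {
    const recovered = [];
    for (const [id, entry] of this.processing) {
      if (now - entry.claimedAt >= this.visibilityTimeoutMs) {
        this.processing.delete(id);
        this.queued.push(entry.job);
        recovered.push(id);
      }
    }
    return recovered;
  }
}

// Usage: a worker claims a job and crashes; the watchdog later requeues it.
const demoQueue = new ReliableQueue(30000);
demoQueue.enqueue({ id: "job-1" });
demoQueue.claim(0);
demoQueue.recoverStale(30000); // "job-1" is back in `queued`
```

Passing `now` explicitly keeps the timeout logic deterministic and easy to test; a real watchdog would call `recoverStale()` on an interval.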
git clone https://github.com/CHINMOYSHARMA-debug/distributed_job_queue.git
cd distributed_job_queue
npm install
docker compose up
npm run dev

GET /metrics

Returns:
{
"queued": 3,
"processing": 2,
"deadLetter": 1,
"completed": 8,
"failed": 2
}

The producer API and monitoring dashboard are deployed publicly on Render. Distributed workers and watchdog processes are demonstrated locally due to free-tier limitations on long-running background services.
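
As a rough illustration of how the `GET /metrics` response shown earlier could be assembled, the sketch below builds the same five fields from a plain state object. The function name `metricsSnapshot` and the state shape are assumptions; in a Redis-backed setup the queue sizes would more likely come from list-length lookups such as `LLEN`.

```javascript
// Build a metrics object with the same fields as the /metrics response.
// `state` is a hypothetical in-memory shape, not the project's data model.
function metricsSnapshot(state) {
  return {
    queued: state.queued.length,         // jobs waiting for a worker
    processing: state.processing.length, // jobs currently claimed
    deadLetter: state.deadLetter.length, // jobs that exhausted their retries
    completed: state.completedCount,     // running total of ACKed jobs
    failed: state.failedCount,           // running total of failed attempts
  };
}

// Usage with made-up state:
const snapshot = metricsSnapshot({
  queued: ["j1", "j2", "j3"],
  processing: ["j4", "j5"],
  deadLetter: ["j6"],
  completedCount: 8,
  failedCount: 2,
});
```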





