22 changes: 19 additions & 3 deletions CLAUDE.md
Original file line number Diff line number Diff line change
@@ -87,7 +87,7 @@ IoTHub is a high-performance MQTT server implemented in Rust using Tokio. The ar
5. **Event-driven**: tokio::select! for responsive packet and shutdown handling
6. **Session Management**: SessionId starts as `__anon_$uuid`, becomes `__client_$clientId` after CONNECT

### Current Status (Milestone 3 - Completed)
### Current Status (Milestone 5 - Completed)

**✅ Milestone 1 Completed:**
- Event-driven architecture with tokio::select!
@@ -121,6 +121,22 @@ IoTHub is a high-performance MQTT server implemented in Rust using Tokio. The ar
- Atomic session state save (all-or-nothing)
- Config-based storage backend selection

**✅ Milestone 4 Completed:**
- TLS/SSL encryption via `tls://` listener prefix
- Multiple simultaneous listeners (TCP + TLS)
- Username/password authentication (file-based)
- Topic-based ACLs for publish/subscribe access control
- Pluggable auth and ACL backends (`allowall`, `file`)

**✅ Milestone 5 Completed:**
- QoS=2 "exactly once" delivery guarantee
- PUBREC/PUBREL/PUBCOMP four-step handshake
- QoS=2 state machine (AwaitingPubRec, AwaitingPubComp)
- Inbound and outbound QoS=2 message handling
- QoS=2 retransmission (PUBLISH and PUBREL retry)
- QoS=2 state persistence across restarts
- Comprehensive QoS=2 test coverage
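
The retransmission item above (PUBLISH and PUBREL retry) can be illustrated with a small decision function. This is only a sketch under stated assumptions, not the project's code: `RetryPolicy`, `Retry`, and `next_retry` are hypothetical names, the two policy fields mirror the `max_retransmission_limit` and `retransmission_interval_ms` settings documented in `docs/documentation.html`, and what the broker actually does once the limit is reached is not stated in this change, so `GiveUp` is a placeholder.

```rust
use std::time::{Duration, Instant};

/// Outbound QoS=2 handshake position (state names taken from the milestone notes).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Qos2State {
    AwaitingPubRec,
    AwaitingPubComp,
}

/// Hypothetical result of a retransmission check.
#[derive(Debug, PartialEq, Eq)]
pub enum Retry {
    /// The retransmission interval has not elapsed yet.
    Wait,
    /// Resend the original PUBLISH (still waiting for PUBREC).
    ResendPublish,
    /// Resend PUBREL (PUBREC was received, still waiting for PUBCOMP).
    ResendPubRel,
    /// Attempt limit reached; the follow-up action is an assumption left open here.
    GiveUp,
}

/// Hypothetical policy mirroring the two documented config settings.
pub struct RetryPolicy {
    pub max_retransmission_limit: u32,
    pub retransmission_interval: Duration,
}

/// Decide what to do for one in-flight QoS=2 message at time `now`.
pub fn next_retry(
    policy: &RetryPolicy,
    state: Qos2State,
    attempts: u32,
    last_sent: Instant,
    now: Instant,
) -> Retry {
    // Nothing to do until a full interval has passed since the last send.
    if now.saturating_duration_since(last_sent) < policy.retransmission_interval {
        return Retry::Wait;
    }
    // Stop once the configured number of attempts has been used up.
    if attempts >= policy.max_retransmission_limit {
        return Retry::GiveUp;
    }
    // The packet to resend depends on how far the handshake has progressed.
    match state {
        Qos2State::AwaitingPubRec => Retry::ResendPublish,
        Qos2State::AwaitingPubComp => Retry::ResendPubRel,
    }
}
```

With the documented defaults (limit 10, interval 5000 ms), a message still in `AwaitingPubRec` would be resent as PUBLISH once five seconds have passed, while one in `AwaitingPubComp` would trigger a PUBREL resend instead.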

### Project Structure
- `src/auth/` - Authentication and authorization (Milestone 4+)
- `src/protocol/` - MQTT protocol implementation (v3.1.1 in progress)
@@ -134,8 +150,8 @@ IoTHub is a high-performance MQTT server implemented in Rust using Tokio. The ar
**Milestone 1** ✅: Full MQTTv3 Server (QoS=0, no persistency/auth)
**Milestone 2** ✅: QoS=1 Support
**Milestone 3** ✅: Persistence Layer
**Milestone 4**: Security (TLS, Auth, ACLs)
**Milestone 5**: QoS=2 Support
**Milestone 4** ✅: Security (TLS, Auth, ACLs)
**Milestone 5** ✅: QoS=2 Support
**Milestone 6**: Observability (Prometheus, Grafana)
**Milestone 7**: Flow Control & Production Features
**v1.0**: Production Ready
22 changes: 16 additions & 6 deletions README.md
@@ -15,7 +15,7 @@ A high-performance MQTT server daemon implementation in Rust using Tokio, design
## Features

- **MQTT v3.1.1 protocol support** with all packet types
- **QoS=0 and QoS=1** message delivery with retransmission and DUP detection
- **QoS=0, QoS=1, and QoS=2** message delivery with full protocol support
- **Message routing** with full MQTT wildcard support (`+` single-level, `#` multi-level)
- **Clean session** with session takeover and proper cleanup
- **Keep-alive mechanism** with configurable timeouts
@@ -30,13 +30,13 @@ A high-performance MQTT server daemon implementation in Rust using Tokio, design
- **Race-condition-free shutdown** using CancellationToken
- **UNIX signal handling** (SIGINT graceful, SIGTERM immediate)
- **Comprehensive configuration** with TOML support
- **Extensive test coverage** with 100+ tests validating all functionality
- **Extensive test coverage** with 136+ tests validating all functionality

## Current Status

**IoTD has completed Milestone 4 - Security! 🎉**
**IoTD has completed Milestone 5 - QoS=2! 🎉**

The project now has full security support including TLS encryption, username/password authentication, and topic-based access control lists.
The project now supports all three MQTT QoS levels, including the "exactly once" delivery guarantee of QoS=2 with the complete PUBLISH/PUBREC/PUBREL/PUBCOMP four-step handshake.

### Completed Features ✅

@@ -84,15 +84,25 @@ The project now has full security support including TLS encryption, username/pas
- ✅ **Config-based TLS** - Certificate and key file paths in TOML
- ✅ **TLS integration tests** - Self-signed cert testing with rcgen

#### Milestone 5 - QoS=2 ✅ **COMPLETED**
- ✅ **QoS=2 message delivery** - "Exactly once" guarantee fully implemented
- ✅ **PUBREC/PUBREL/PUBCOMP flow** - Complete four-step handshake protocol
- ✅ **QoS=2 state machine** - AwaitingPubRec and AwaitingPubComp states
- ✅ **Inbound QoS=2 tracking** - Broker receives and processes QoS=2 messages
- ✅ **Outbound QoS=2 delivery** - Broker sends QoS=2 to subscribers
- ✅ **QoS=2 retransmission** - PUBLISH and PUBREL retry on timeout
- ✅ **QoS=2 persistence** - State survives server restarts
- ✅ **Comprehensive QoS=2 tests** - 5+ integration tests covering all scenarios
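
To make the inbound tracking item above concrete, here is a minimal sketch of how a broker can avoid routing a QoS=2 message twice. It is an illustration under stated assumptions, not the project's code: `InboundQos2` and `Ack` are hypothetical names, and the message is released on PUBREL, which follows the flow diagram added to `docs/documentation.html` in this change. The real implementation may store more state or release the message at a different step.

```rust
use std::collections::HashMap;

/// Acknowledgment the broker sends back to the publisher.
#[derive(Debug, PartialEq, Eq)]
pub enum Ack {
    PubRec(u16),
    PubComp(u16),
}

/// Hypothetical inbound tracker: packet id -> (topic, payload) held until PUBREL.
#[derive(Default)]
pub struct InboundQos2 {
    pending: HashMap<u16, (String, Vec<u8>)>,
}

impl InboundQos2 {
    /// Handle an incoming PUBLISH (QoS=2).
    /// A retransmitted PUBLISH whose packet id is already pending is
    /// acknowledged again but not stored a second time.
    pub fn on_publish(&mut self, packet_id: u16, topic: &str, payload: &[u8]) -> Ack {
        self.pending
            .entry(packet_id)
            .or_insert_with(|| (topic.to_string(), payload.to_vec()));
        Ack::PubRec(packet_id)
    }

    /// Handle an incoming PUBREL.
    /// Returns the message to route (only the first time for a given packet id)
    /// together with the PUBCOMP to send. A repeated PUBREL yields `None`,
    /// so nothing is routed twice, but it is still answered with PUBCOMP.
    pub fn on_pubrel(&mut self, packet_id: u16) -> (Option<(String, Vec<u8>)>, Ack) {
        (self.pending.remove(&packet_id), Ack::PubComp(packet_id))
    }
}
```

Keying the map by packet id is what makes duplicates harmless here: both a retransmitted PUBLISH and a retransmitted PUBREL are answered, yet the message is handed to routing at most once.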

### Roadmap 📋

#### Near-term (v0.x - v1.0)
- **Milestone 1**: ✅ Basic MQTT Server (Completed)
- **Milestone 2**: ✅ QoS=1 Support (Completed)
- **Milestone 3**: ✅ Persistence Layer (Completed)
- **Milestone 4**: ✅ Security - TLS, Authentication, ACLs (Completed)
- **Milestone 5** (Next): QoS=2 "exactly once" delivery
- **Milestone 6**: Observability (Prometheus, Grafana)
- **Milestone 5**: ✅ QoS=2 "exactly once" delivery (Completed)
- **Milestone 6** (Next): Observability (Prometheus, Grafana)
- **Milestone 7**: Flow control & production features
- **v1.0**: Production-ready single-node broker

24 changes: 22 additions & 2 deletions docs/documentation.html
@@ -96,7 +96,8 @@ <h2>MQTT Protocol Implementation</h2>
<h3>Supported Packet Types</h3>
<ul>
<li><strong>Control</strong>: CONNECT, CONNACK, DISCONNECT, PINGREQ, PINGRESP</li>
<li><strong>Publish</strong>: PUBLISH, PUBACK (QoS=1)</li>
<li><strong>Publish QoS=1</strong>: PUBLISH, PUBACK</li>
<li><strong>Publish QoS=2</strong>: PUBLISH, PUBREC, PUBREL, PUBCOMP</li>
<li><strong>Subscribe</strong>: SUBSCRIBE, SUBACK, UNSUBSCRIBE, UNSUBACK</li>
</ul>

@@ -107,6 +108,24 @@ <h3>QoS=1 Message Flow</h3>
└──── PUBACK ────────┘ └── PUBACK ─┘</pre>
</div>

<h3>QoS=2 Message Flow (Exactly Once)</h3>
<div class="code-block">
<pre>Publisher                  IoTD                       Subscriber
│                          │                          │
├── PUBLISH (QoS=2) ──────►│                          │
│                          │──── store message ────►  │
│◄────── PUBREC ───────────│                          │
│                          │                          │
├────── PUBREL ───────────►│──── PUBLISH (QoS=2) ────►│
│                          │                          │
│                          │◄────── PUBREC ───────────│
│◄────── PUBCOMP ──────────│                          │
│                          │────── PUBREL ───────────►│
│                          │                          │
│                          │◄────── PUBCOMP ──────────│
│                          │                          │</pre>
</div>

<h3>Topic Validation Rules</h3>
<ul>
<li>Topics must not be empty</li>
@@ -129,11 +148,12 @@ <h3>Server Configuration</h3>
# Default: 10000
retained_message_limit = 10000

# Maximum retransmission attempts for QoS=1
# Maximum retransmission attempts for QoS=1 and QoS=2
# Default: 10
max_retransmission_limit = 10

# Interval between retransmission attempts (milliseconds)
# Applies to PUBLISH (QoS=1, QoS=2) and PUBREL (QoS=2)
# Default: 5000 (5 seconds)
retransmission_interval_ms = 5000

17 changes: 15 additions & 2 deletions docs/features.html
@@ -38,7 +38,7 @@ <h2>Core MQTT Features</h2>
<li>✅ <strong>MQTT v3.1.1 Protocol Support</strong> - Full implementation of all packet types</li>
<li>✅ <strong>QoS=0 (At Most Once)</strong> - Fire-and-forget message delivery</li>
<li>✅ <strong>QoS=1 (At Least Once)</strong> - Reliable delivery with acknowledgments</li>
<li>🚧 <strong>QoS=2 (Exactly Once)</strong> - Coming in Milestone 5</li>
<li>✅ <strong>QoS=2 (Exactly Once)</strong> - Guaranteed exactly-once delivery</li>
<li>✅ <strong>Wildcard Subscriptions</strong> - Support for + and # wildcards</li>
<li>✅ <strong>Retained Messages</strong> - Store and deliver last known good values</li>
<li>✅ <strong>Will Messages</strong> - Last Will and Testament support</li>
@@ -57,6 +57,18 @@ <h2>Quality of Service Level 1</h2>
<li><strong>QoS Downgrade</strong> - Proper min(publish, subscribe) handling</li>
</ul>

<h2>Quality of Service Level 2</h2>
<p>Our QoS=2 implementation guarantees exactly-once delivery with:</p>
<ul>
<li><strong>Four-Step Handshake</strong> - PUBLISH → PUBREC → PUBREL → PUBCOMP</li>
<li><strong>State Machine</strong> - AwaitingPubRec and AwaitingPubComp states</li>
<li><strong>Inbound Tracking</strong> - Proper handling of received QoS=2 messages</li>
<li><strong>Outbound Delivery</strong> - QoS=2 delivery to subscribers</li>
<li><strong>Retransmission</strong> - Automatic retry for PUBLISH and PUBREL</li>
<li><strong>State Persistence</strong> - QoS=2 state survives server restarts</li>
<li><strong>Duplicate Detection</strong> - Proper handling of duplicate messages</li>
</ul>

<h2>Security</h2>
<p>Production-grade security with TLS, authentication, and authorization:</p>
<ul>
@@ -73,7 +85,8 @@ <h2>Persistence Layer</h2>
<ul>
<li><strong>Session Persistence</strong> - Restore sessions for clean_session=false clients</li>
<li><strong>Subscription Recovery</strong> - Subscriptions survive reconnects</li>
<li><strong>In-Flight Messages</strong> - QoS=1 messages restored on reconnect</li>
<li><strong>In-Flight Messages</strong> - QoS=1 and QoS=2 messages restored on reconnect</li>
<li><strong>QoS=2 State</strong> - Inbound QoS=2 messages persist across restarts</li>
<li><strong>Retained Messages</strong> - Persist across server restarts</li>
<li><strong>InMemoryStorage</strong> - Fast storage for development/testing</li>
<li><strong>SqliteStorage</strong> - Durable storage for production</li>
20 changes: 10 additions & 10 deletions docs/index.html
@@ -36,7 +36,7 @@ <h1>IoTD - IoT Daemon</h1>
<p class="description">A modern, async MQTT v3.1.1 server implementation designed for scalability, reliability, and extensibility.</p>
<div class="cta-buttons">
<a href="getting-started.html" class="btn btn-primary">Get Started</a>
<a href="https://github.com/lileding/iotd/releases/latest" class="btn btn-secondary">Download v0.4.0</a>
<a href="https://github.com/lileding/iotd/releases/latest" class="btn btn-secondary">Download v0.5.0</a>
</div>
</div>
</header>
@@ -73,7 +73,7 @@ <h3>🛡️ Reliable</h3>
<div class="container">
<h2>Quick Start</h2>
<pre><code># Download and extract
wget https://github.com/lileding/iotd/releases/download/v0.4.0/iotd-linux-x86_64.tar.gz
wget https://github.com/lileding/iotd/releases/download/v0.5.0/iotd-linux-x86_64.tar.gz
tar -xzf iotd-linux-x86_64.tar.gz

# Run the server
@@ -86,17 +86,17 @@ <h2>Quick Start</h2>

<section class="latest-release">
<div class="container">
<h2>Latest Release: v0.4.0</h2>
<h2>Latest Release: v0.5.0</h2>
<div class="release-info">
<h3>Security Complete! 🎉</h3>
<h3>QoS=2 Complete! 🎉</h3>
<ul>
<li>TLS/SSL encryption with configurable certificates</li>
<li>Multiple simultaneous listeners (TCP + TLS)</li>
<li>Username/password authentication (file-based)</li>
<li>Topic-based access control lists (ACLs)</li>
<li>100+ comprehensive tests</li>
<li>QoS=2 "exactly once" delivery guarantee</li>
<li>Complete PUBREC/PUBREL/PUBCOMP four-step handshake</li>
<li>QoS=2 state persistence across restarts</li>
<li>Full MQTT 3.1.1 QoS support (0, 1, and 2)</li>
<li>136+ comprehensive tests</li>
</ul>
<a href="https://github.com/lileding/iotd/releases/tag/v0.4.0" class="btn btn-small">View Release Notes</a>
<a href="https://github.com/lileding/iotd/releases/tag/v0.5.0" class="btn btn-small">View Release Notes</a>
</div>
</div>
</section>
15 changes: 8 additions & 7 deletions docs/roadmap.html
@@ -90,19 +90,20 @@ <h3>v0.4.0 - Security <span class="status completed">Completed</span></h3>
</ul>
</div>

<div class="timeline-item">
<h3>v0.5.0 - QoS=2 <span class="status planned">Next</span></h3>
<div class="timeline-item completed">
<h3>v0.5.0 - QoS=2 <span class="status completed">Completed</span></h3>
<p><strong>Milestone 5: Exactly Once Delivery</strong></p>
<ul>
<li>🎯 QoS=2 implementation</li>
<li>🎯 PUBREC/PUBREL/PUBCOMP flow</li>
<li>🎯 Two-phase commit protocol</li>
<li>🎯 State persistence for QoS=2</li>
<li>✅ QoS=2 implementation with exactly-once guarantee</li>
<li>✅ PUBREC/PUBREL/PUBCOMP four-step handshake</li>
<li>✅ QoS=2 state machine and retransmission</li>
<li>✅ State persistence for QoS=2 across restarts</li>
<li>✅ Comprehensive QoS=2 test suite</li>
</ul>
</div>

<div class="timeline-item">
<h3>v0.6.0 - Observability <span class="status planned">Planned</span></h3>
<h3>v0.6.0 - Observability <span class="status planned">Next</span></h3>
<p><strong>Milestone 6: Monitoring & Metrics</strong></p>
<ul>
<li>📊 Prometheus metrics export</li>
85 changes: 47 additions & 38 deletions docs/roadmap.md
@@ -150,29 +150,34 @@ IoTD (IoT Daemon) development follows a progressive milestone approach, where ea

---

### Milestone 5: QoS=2 Support 🎯 **PLANNED**
### Milestone 5: QoS=2 Support ✅ **COMPLETED**
**Target**: Exactly-once delivery guarantee

**Features to Implement:**
- 🎯 QoS=2 (Exactly once) message delivery
- 🎯 PUBREC/PUBREL/PUBCOMP flow
- 🎯 Two-phase commit protocol
- 🎯 Message state persistence for QoS=2
- 🎯 Duplicate detection across restarts
- 🎯 Proper error handling and recovery

**Technical Challenges:**
- Complex state machine for QoS=2 flow
- Ensuring exactly-once semantics
- Performance impact of two-phase protocol
- Recovery after crashes
**✅ Completed Features:**
- ✅ QoS=2 (Exactly once) message delivery
- ✅ PUBREC/PUBREL/PUBCOMP four-step handshake
- ✅ QoS=2 state machine (AwaitingPubRec, AwaitingPubComp)
- ✅ Inbound QoS=2 tracking for received messages
- ✅ Outbound QoS=2 delivery to subscribers
- ✅ QoS=2 retransmission (PUBLISH and PUBREL retry)
- ✅ QoS=2 state persistence across restarts
- ✅ QoS downgrade handling (min of publish/subscribe QoS)
- ✅ Duplicate detection and proper handling
- ✅ Comprehensive QoS=2 test suite

**Timeline**: 6-8 weeks
**Success Criteria**:
- [ ] QoS=2 messages delivered exactly once
- [ ] Proper handling of all edge cases
- [ ] State survives server restarts
- [ ] Acceptable performance
**Architecture Implemented:**
- Extended InflightMessage with qos2_state field
- Separate inbound_qos2 HashMap for received messages
- State machine: AwaitingPubRec → AwaitingPubComp → Complete
- Retransmission logic handles both PUBLISH and PUBREL
- Persistence types: PersistedQos2State, PersistedInboundQos2Message

**Timeline**: Completed
**Success Criteria**: All achieved ✓
- [✓] QoS=2 messages delivered exactly once
- [✓] Proper handling of all edge cases
- [✓] State survives server restarts
- [✓] Acceptable performance
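
The outbound state machine listed under "Architecture Implemented" (AwaitingPubRec → AwaitingPubComp → Complete) might look roughly like the sketch below. The names `InflightMessage`, `qos2_state`, `AwaitingPubRec`, and `AwaitingPubComp` come from the notes above; everything else (`Outbound`, `Reply`, the method names, the field layout, and the choice to ignore a PUBREC for an unknown packet id) is an assumption made for illustration and may differ from the actual code.

```rust
use std::collections::HashMap;

/// Outbound QoS=2 handshake position for one in-flight message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Qos2State {
    /// PUBLISH sent; waiting for the subscriber's PUBREC.
    AwaitingPubRec,
    /// PUBREL sent; waiting for the subscriber's PUBCOMP.
    AwaitingPubComp,
}

/// Hypothetical, trimmed-down in-flight record.
#[derive(Debug, Clone)]
pub struct InflightMessage {
    pub topic: String,
    pub payload: Vec<u8>,
    /// `None` for QoS=1 messages, `Some(..)` for QoS=2 messages.
    pub qos2_state: Option<Qos2State>,
}

/// Packet the broker should send in response, if any.
#[derive(Debug, PartialEq, Eq)]
pub enum Reply {
    PubRel(u16),
    None,
}

/// Hypothetical per-session table of outbound in-flight messages.
#[derive(Default)]
pub struct Outbound {
    inflight: HashMap<u16, InflightMessage>,
}

impl Outbound {
    /// Record a QoS=2 PUBLISH that was just sent to a subscriber.
    pub fn send_qos2(&mut self, packet_id: u16, topic: &str, payload: &[u8]) {
        self.inflight.insert(
            packet_id,
            InflightMessage {
                topic: topic.to_string(),
                payload: payload.to_vec(),
                qos2_state: Some(Qos2State::AwaitingPubRec),
            },
        );
    }

    /// PUBREC received: move to AwaitingPubComp and answer with PUBREL.
    /// A repeated PUBREC simply yields another PUBREL.
    pub fn on_pubrec(&mut self, packet_id: u16) -> Reply {
        match self.inflight.get_mut(&packet_id) {
            Some(msg) if msg.qos2_state.is_some() => {
                msg.qos2_state = Some(Qos2State::AwaitingPubComp);
                Reply::PubRel(packet_id)
            }
            // Unknown id or a QoS=1 entry: ignored in this sketch.
            _ => Reply::None,
        }
    }

    /// PUBCOMP received: the flow is complete only if PUBREL was already sent.
    /// Returns true when the entry was removed.
    pub fn on_pubcomp(&mut self, packet_id: u16) -> bool {
        let awaiting = matches!(
            self.inflight.get(&packet_id).and_then(|m| m.qos2_state),
            Some(Qos2State::AwaitingPubComp)
        );
        if awaiting {
            self.inflight.remove(&packet_id);
        }
        awaiting
    }

    /// Current handshake state for a packet id, if it is a tracked QoS=2 message.
    pub fn state(&self, packet_id: u16) -> Option<Qos2State> {
        self.inflight.get(&packet_id).and_then(|m| m.qos2_state)
    }
}
```

In this sketch "Complete" is represented by removing the entry rather than by a third enum variant, which keeps the persisted state limited to the two waiting states named above.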

---

@@ -268,11 +273,13 @@ IoTD (IoT Daemon) development follows a progressive milestone approach, where ea
- Topic-based ACLs ✓
- Multiple simultaneous listeners ✓

### v0.5.0 - QoS=2 (Milestone 5) 🎯
- Exactly-once delivery
- PUBREC/PUBREL/PUBCOMP flow
- Two-phase commit protocol
- State persistence for QoS=2
### v0.5.0 - QoS=2 (Milestone 5) ✅ **COMPLETED**
- Exactly-once delivery ✓
- PUBREC/PUBREL/PUBCOMP flow ✓
- QoS=2 state machine ✓
- State persistence for QoS=2 ✓
- Retransmission for PUBLISH and PUBREL ✓
- Comprehensive test coverage ✓

### v0.6.0 - Observability (Milestone 6) 📊
- Prometheus metrics
@@ -438,22 +445,24 @@ IoTD (IoT Daemon) development follows a progressive milestone approach, where ea

---

## What's Next: Milestone 5 - QoS=2 🎯
## What's Next: Milestone 6 - Observability 📊

With security now complete, the next major milestone implements exactly-once delivery:
With QoS=2 now complete, the next major milestone adds production monitoring capabilities:

**Key Features to Implement:**
1. **QoS=2 Delivery**: PUBREC/PUBREL/PUBCOMP four-packet flow
2. **Two-phase commit**: Ensure exactly-once semantics
3. **State Persistence**: QoS=2 state survives server restarts
4. **Duplicate Detection**: Across reconnects and restarts
5. **Proper Error Handling**: Recovery after partial flows

**Preparation Tasks:**
- Extend session state machine for QoS=2 states
- Add QoS=2 in-flight tracking to storage trait
- Design packet ID reuse rules per MQTT 3.1.1 spec
- Plan comprehensive QoS=2 test scenarios
1. **Prometheus Metrics**: Export broker metrics for monitoring
2. **Grafana Dashboards**: Pre-built dashboard templates
3. **Health Check Endpoints**: HTTP endpoints for load balancers
4. **Structured Logging**: Enhanced logging with levels and context
5. **Performance Metrics**: Connection, message, and latency tracking

**Metrics to Export:**
- Connection count and connection rate
- Message throughput by QoS level
- Topic statistics and subscription counts
- Error rates and types
- Resource usage (memory, CPU)
- Latency percentiles

---
