@qj0r9j0vc2

Summary

This PR adds structured task supervision to the CipherBFT node so that background workers shut down in a coordinated way instead of being left running or dropped at exit.

Changes

  • New supervisor.rs module: Implements NodeSupervisor using tokio-util's TaskTracker and CancellationToken for:

    • spawn() - Regular task spawning with cancellation support
    • spawn_cancellable() - Tasks that handle their own cancellation
    • spawn_critical() - Tasks that trigger full shutdown on failure
    • shutdown() - Graceful shutdown with configurable timeout
  • Refactored node.rs:

    • Added run_with_supervisor() method for external supervisor control
    • Peer-connector task now runs under supervisor
    • Event loop respects cancellation token with biased select
  • Updated main.rs:

    • cmd_start() creates supervisor and sets up Ctrl+C signal handler
    • cmd_testnet_start() uses shared supervisor for all validators
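
For readers unfamiliar with the pattern, here is a minimal sketch of the supervision idea behind the changes above. It is not the code in supervisor.rs: the PR uses tokio-util's TaskTracker and CancellationToken, while this sketch swaps in std threads, an AtomicBool as the cancellation flag, and a Vec of JoinHandles as the tracker so that it compiles without external crates. The `Supervisor` type, its method signatures, and `critical_failure_stops_worker` are hypothetical names chosen for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Hypothetical std-only stand-in for the PR's NodeSupervisor.
/// `cancelled` plays the role of CancellationToken,
/// `handles` plays the role of TaskTracker.
pub struct Supervisor {
    cancelled: Arc<AtomicBool>,
    handles: Vec<JoinHandle<()>>,
}

impl Supervisor {
    pub fn new() -> Self {
        Supervisor {
            cancelled: Arc::new(AtomicBool::new(false)),
            handles: Vec::new(),
        }
    }

    /// Spawn a task that receives the shared flag and is expected to poll it.
    pub fn spawn<F>(&mut self, task: F)
    where
        F: FnOnce(Arc<AtomicBool>) + Send + 'static,
    {
        let flag = Arc::clone(&self.cancelled);
        self.handles.push(thread::spawn(move || task(flag)));
    }

    /// Spawn a task whose `Err` result cancels every other task.
    pub fn spawn_critical<F>(&mut self, task: F)
    where
        F: FnOnce(Arc<AtomicBool>) -> Result<(), String> + Send + 'static,
    {
        let flag = Arc::clone(&self.cancelled);
        self.handles.push(thread::spawn(move || {
            if task(Arc::clone(&flag)).is_err() {
                flag.store(true, Ordering::SeqCst);
            }
        }));
    }

    pub fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::SeqCst)
    }

    /// Request cancellation, then wait up to `timeout` for all tasks.
    /// Returns true if every task finished before the deadline.
    pub fn shutdown(self, timeout: Duration) -> bool {
        self.cancelled.store(true, Ordering::SeqCst);
        let deadline = Instant::now() + timeout;
        for handle in self.handles {
            while !handle.is_finished() {
                if Instant::now() >= deadline {
                    return false; // remaining threads are detached
                }
                thread::sleep(Duration::from_millis(1));
            }
            let _ = handle.join();
        }
        true
    }
}

/// Demo: a worker loops until cancelled; a critical task fails at once.
pub fn critical_failure_stops_worker() -> bool {
    let mut sup = Supervisor::new();
    sup.spawn(|cancelled| {
        while !cancelled.load(Ordering::SeqCst) {
            thread::sleep(Duration::from_millis(1));
        }
    });
    sup.spawn_critical(|_| Err("storage failed".to_string()));

    // Wait (bounded) for the critical failure to trip the shared flag.
    let start = Instant::now();
    while !sup.is_cancelled() && start.elapsed() < Duration::from_secs(2) {
        thread::sleep(Duration::from_millis(1));
    }
    let tripped = sup.is_cancelled();
    tripped && sup.shutdown(Duration::from_secs(2))
}

fn main() {
    assert!(critical_failure_stops_worker());
    println!("critical failure propagated and all tasks joined");
}
```

In this sketch a task that never polls the flag causes `shutdown()` to return false once the timeout elapses. A configurable shutdown timeout exists to bound that case, though how the PR itself reports it is not shown here.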

Shutdown Order

  1. Stop accepting new network connections (Primary shutdown)
  2. Drain in-flight consensus rounds (via cancellation token)
  3. Flush pending storage writes (supervisor waits for tasks)
  4. Close database connections
  5. Exit
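
The ordering above can be sketched as a fixed sequence of phases driven by one function. This is an illustration only, using plain std Rust: the `Phase` enum, `run_shutdown`, and the choice to continue past a failed phase are assumptions for the example, not taken from the PR.

```rust
/// Shutdown phases in the order listed in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    StopNetwork,
    DrainConsensus,
    FlushStorage,
    CloseDatabase,
    Exit,
}

impl Phase {
    /// The full sequence, first phase to last.
    pub const ORDER: [Phase; 5] = [
        Phase::StopNetwork,
        Phase::DrainConsensus,
        Phase::FlushStorage,
        Phase::CloseDatabase,
        Phase::Exit,
    ];
}

/// Runs every phase in order through `step` and records each outcome.
/// Assumption: a failed phase is logged and later phases still run, so a
/// flush error does not prevent the database from being closed.
pub fn run_shutdown<F>(mut step: F) -> Vec<(Phase, Result<(), String>)>
where
    F: FnMut(Phase) -> Result<(), String>,
{
    let mut log = Vec::new();
    for phase in Phase::ORDER {
        let outcome = step(phase);
        log.push((phase, outcome));
    }
    log
}

fn main() {
    // Simulate a storage flush failure; the other phases succeed.
    let log = run_shutdown(|phase| {
        if phase == Phase::FlushStorage {
            Err("disk full".to_string())
        } else {
            Ok(())
        }
    });
    for (phase, outcome) in &log {
        println!("{:?}: {:?}", phase, outcome);
    }
}
```

Whether a failed phase should abort the sequence or be logged and skipped is a design choice. The sketch continues so that later cleanup still runs.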

Test Plan

  • All 66 existing tests pass
  • 3 new supervisor unit tests added and passing:
    • test_supervisor_spawn_and_shutdown
    • test_supervisor_cancellation
    • test_critical_task_failure_triggers_shutdown

Closes #56 (Missing structured task supervision for background workers)

Implement NodeSupervisor using tokio-util's TaskTracker and
CancellationToken for coordinated task management across the node.

Changes:
- Add supervisor.rs with NodeSupervisor implementation
- Refactor node.rs to use supervisor for task lifecycle
- Update main.rs cmd_start and cmd_testnet_start with signal handling
- Add tokio-util rt feature for TaskTracker support

Shutdown order follows: network -> consensus -> storage -> exit

Closes #56

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>