-
Notifications
You must be signed in to change notification settings - Fork 1
Add sensor diagnostics demo #14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add lightweight sensor diagnostics demo (no Gazebo required): - CMakeLists.txt and package.xml for ROS 2 package - Simulated sensor base class with fault injection support - LiDAR simulator node with configurable parameters: - scan_rate, range_min/max, noise_stddev - Runtime fault injection via parameters - Diagnostics publisher for sensor status This is the foundation for the sensor diagnostics demo that demonstrates ros2_medkit's data monitoring and configuration management features.
Add remaining simulated sensor nodes with fault injection: - IMU simulator: acceleration, angular velocity, drift simulation - GPS simulator: NavSatFix with position noise and signal loss - Camera simulator: RGB image generation with noise/black frames All sensors support runtime parameter changes for fault injection: - failure_probability for timeouts - inject_nan for invalid values - drift_rate for gradual sensor drift - noise parameters for degraded operation
Add anomaly detector that monitors sensor streams and reports faults: - Subscribes to LiDAR, IMU, GPS topics - Detects NaN values, out-of-range readings - Monitors message rates and timeouts - Publishes DiagnosticArray for detected faults Fault detection thresholds are configurable via ROS parameters: - rate_timeout_sec - max_nan_ratio - sensor-specific rate minimums
Add ROS 2 launch file and configuration: - demo.launch.py: launches all nodes with proper namespacing - /sensors: lidar_sim, imu_sim, gps_sim, camera_sim - /processing: anomaly_detector - /diagnostics: ros2_medkit_gateway - sensor_params.yaml: default sensor parameters - medkit_params.yaml: gateway configuration - sensor_manifest.yaml: SOVD entity hierarchy definition
Add lightweight Dockerfile and docker-compose.yml: - Based on ros:jazzy-ros-base (no Gazebo/GUI packages) - Image size: ~500MB vs ~4GB for TurtleBot3 demo - Startup time: ~5s vs ~60s - Includes CI profile for headless testing - Optional sovd-web-ui service for visualization
Add interactive demo scripts: - run-demo.sh: walkthrough of ros2_medkit API features - inject-noise.sh: increase sensor noise levels - inject-failure.sh: cause sensor timeouts - inject-nan.sh: inject NaN values - inject-drift.sh: enable sensor drift - restore-normal.sh: clear all faults Add comprehensive README.md with: - Quick start guide (Docker and source) - API examples for all sensor operations - Fault scenario documentation - Parameter reference tables - Comparison with TurtleBot3 demo
- Add sensor diagnostics demo to demos table (marked as Ready) - Add Quick Start section with docker compose instructions
Build sensor diagnostics demo image in Docker build job. Runs before TurtleBot3 build (faster, validates basic ros2_medkit).
The gateway requires additional packages: - ros2_medkit_serialization (JSON/ROS message conversion) - ros2_medkit_msgs (custom messages) - dynmsg (dynamic message introspection) Also add ros-jazzy-yaml-cpp-vendor to apt dependencies.
9c090fb to
2c05967
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a lightweight, Gazebo‑free sensor diagnostics demo for ros2_medkit, including simulated sensors, an anomaly detector, Docker/docker‑compose integration, and CI wiring. It provides REST‑driven fault injection and a walkthrough script to showcase monitoring, configuration, and fault detection.
Changes:
- Introduces simulated LiDAR, IMU, GPS, and camera nodes with configurable fault injection and diagnostic publishing.
- Adds an anomaly detector node, parameter/config/manifest files, Dockerfile, and docker‑compose setup for running the demo plus an optional web UI.
- Updates documentation, helper shell scripts, and CI to build the new demo image and advertise it as the primary quick‑start path.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| demos/sensor_diagnostics/src/lidar_sim_node.cpp | LiDAR simulator node publishing LaserScan with noise, NaN, drift, and failure injection plus diagnostic status. |
| demos/sensor_diagnostics/src/imu_sim_node.cpp | IMU simulator node publishing Imu with configurable noise, NaN, drift, and failure injection plus diagnostics. |
| demos/sensor_diagnostics/src/gps_sim_node.cpp | GPS simulator node publishing NavSatFix with noise, drift, NaN, and loss‑of‑fix behavior plus diagnostics. |
| demos/sensor_diagnostics/src/camera_sim_node.cpp | Camera simulator node publishing Image and CameraInfo with noise, black frames, and brightness faults plus diagnostics. |
| demos/sensor_diagnostics/src/anomaly_detector_node.cpp | Anomaly detector node subscribing to sensor topics, detecting anomalies/timeouts/rate degradation, and publishing fault events. |
| demos/sensor_diagnostics/run-demo.sh | Interactive shell script driving the REST API to show gateway health, sensor data, configs, and current faults. |
| demos/sensor_diagnostics/restore-normal.sh | Helper script to reset all sensor fault‑related parameters back to normal values via the REST API. |
| demos/sensor_diagnostics/inject-noise.sh | Helper script to inject high‑noise faults into all sensors via configuration updates. |
| demos/sensor_diagnostics/inject-nan.sh | Helper script to enable NaN injection on LiDAR, IMU, and GPS via configuration updates. |
| demos/sensor_diagnostics/inject-failure.sh | Helper script to force LiDAR sensor timeouts by setting failure probability to 1.0. |
| demos/sensor_diagnostics/inject-drift.sh | Helper script to configure drift faults on LiDAR, IMU, and GPS sensors. |
| demos/sensor_diagnostics/restore-normal.sh | Script to clear all configured faults and restore baseline sensor settings. |
| demos/sensor_diagnostics/package.xml | Defines the sensor_diagnostics_demo ROS 2 package and its dependencies/exec deps. |
| demos/sensor_diagnostics/launch/demo.launch.py | Launch description that starts all simulated sensors, the anomaly detector, and the ros2_medkit gateway with appropriate namespaces and configs. |
| demos/sensor_diagnostics/include/sensor_diagnostics/simulated_sensor_base.hpp | Adds a (currently unused) base class and config struct for reusable fault‑injection behavior in simulated sensors. |
| demos/sensor_diagnostics/docker-compose.yml | docker‑compose stack to run the sensor demo container and optional SOVD web UI, plus a CI profile variant. |
| demos/sensor_diagnostics/config/sensor_params.yaml | Default ROS 2 parameters for each simulated sensor and the anomaly detector. |
| demos/sensor_diagnostics/config/sensor_manifest.yaml | SOVD manifest describing areas, components, and apps for the sensor diagnostics demo. |
| demos/sensor_diagnostics/config/medkit_params.yaml | ros2_medkit gateway configuration (HTTP server, CORS, discovery via manifest) for this demo. |
| demos/sensor_diagnostics/README.md | Dedicated README explaining the demo, architecture, REST API examples, fault scenarios, scripts, and parameter reference. |
| demos/sensor_diagnostics/Dockerfile | Dockerfile building a lightweight Jazzy‑based image containing ros2_medkit_gateway and the sensor diagnostics demo. |
| demos/sensor_diagnostics/CMakeLists.txt | CMake configuration for building and installing the new simulator and anomaly detector nodes plus config/launch assets. |
| README.md (root) | Top‑level README updated to add the Sensor Diagnostics demo and quick‑start instructions. |
| .github/workflows/ci.yml | CI workflow updated to build the sensor diagnostics Docker image alongside the TurtleBot3 demo image. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
demos/sensor_diagnostics/include/sensor_diagnostics/simulated_sensor_base.hpp
Show resolved
Hide resolved
- Add --recurse-submodules to clone dynmsg submodule - Add ros-jazzy-example-interfaces to apt dependencies - Skip test dependencies not in ros-base image - Disable BUILD_TESTING for faster builds - Use ghcr.io/selfpatch/sovd_web_ui:latest image
- Add ros2_medkit_fault_manager to Docker image
- Add fault_manager node to launch file
- Integrate anomaly_detector with ReportFault service
- Fix topic subscriptions (/sensors/scan instead of /sensors/lidar_sim/scan)
- Fix run-demo.sh jq queries for {items: []} response format
- Add libsqlite3-dev for fault_manager storage
Fault injection now properly reports to FaultManager:
- inject-nan.sh -> SENSOR_NAN faults
- inject-failure.sh -> SENSOR_TIMEOUT faults
- inject-noise.sh -> RATE_DEGRADED faults
Demonstrates two fault reporting mechanisms in ros2_medkit: LEGACY PATH (diagnostics → bridge): - LiDAR and Camera publish DiagnosticArray to /diagnostics topic - ros2_medkit_diagnostic_bridge converts and reports to fault_manager - Auto-generates fault codes from diagnostic names (e.g., LIDAR_SIM) MODERN PATH (direct service call): - IMU and GPS monitored by anomaly_detector - Anomaly detector calls /fault_manager/report_fault directly - Uses specific fault codes (SENSOR_NAN, SENSOR_TIMEOUT, etc.) Changes: - Add diagnostic_bridge and fault_reporter to Dockerfile - Update lidar_sim and camera_sim to publish DiagnosticArray - Add diagnostic_bridge node to launch file - Remove LiDAR monitoring from anomaly_detector - Update all inject scripts with path documentation - Add comprehensive dual-path documentation to README
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Add missing <algorithm> include for std::clamp (lidar, camera) - Validate rate parameters to prevent division by zero (all sensors) - Remove unused max_nan_ratio parameter from anomaly_detector - Remove unused lidar_rate_min from config (LiDAR uses legacy path now) - Remove empty clear_passed_faults function and clear_timer_ - Fix DRIFT_DETECTED → DRIFTING in README fault table - Add documentation to simulated_sensor_base.hpp explaining future use
- Add response callback to async_send_request() in anomaly_detector
to ensure service requests are properly processed by the executor
- Fix restore-normal.sh to use correct entity paths for fault clearing:
- Legacy path faults: /apps/diagnostic_bridge/faults/{fault_code}
- Modern path faults: /apps/anomaly_detector/faults/{fault_code}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } else if (param.get_name() == "scan_rate") { | ||
| // Update timer with new rate | ||
| double rate = param.as_double(); | ||
| auto period = std::chrono::duration<double>(1.0 / rate); | ||
| timer_ = this->create_wall_timer( | ||
| std::chrono::duration_cast<std::chrono::nanoseconds>(period), | ||
| std::bind(&LidarSimNode::publish_scan, this)); | ||
| RCLCPP_INFO(this->get_logger(), "Scan rate changed to %.1f Hz", rate); |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When handling a runtime change to scan_rate, the new value is used directly to compute 1.0 / rate without validation, so setting scan_rate to 0 or a negative value via parameters will cause a division-by-zero or extremely large period at runtime. In the constructor you already guard against non-positive rates and reset the parameter; consider applying the same validation and clamping logic here before recomputing the timer period.
| msg.angle_min = angle_min_; | ||
| msg.angle_max = angle_max_; | ||
| msg.angle_increment = (angle_max_ - angle_min_) / static_cast<double>(num_readings_); | ||
| msg.time_increment = 0.0; |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_readings_ is taken directly from the num_readings parameter and then used as the divisor for angle_increment and as the loop bound, but there is no validation that it is positive. If a user (or YAML config) sets num_readings to 0 or a negative value, this will either trigger a division-by-zero when computing angle_increment or undefined behavior in the loop; please validate num_readings (e.g., enforce a minimum of 1) before using it.
| ros2 launch sensor_diagnostics_demo demo.launch.py & | ||
| sleep 10 && | ||
| curl -sf http://localhost:8080/api/v1/health && | ||
| curl -sf http://localhost:8080/api/v1/apps | jq '.[] | .name' && |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the CI-oriented sensor-demo-ci service, the jq filter jq '.[] | .name' assumes the /api/v1/apps response is a top-level array, but elsewhere in the repo (e.g., run-demo.sh and the README) the same endpoint is treated as an object with an items array (.items[]). This mismatch means the CI command will not actually list app names as intended; consider switching this filter to .items[] | .name (or whatever matches the gateway schema) so the CI output accurately validates and displays the discovered apps.
| curl -sf http://localhost:8080/api/v1/apps | jq '.[] | .name' && | |
| curl -sf http://localhost:8080/api/v1/apps | jq '.items[] | .name' && |
|
|
||
| // State tracking | ||
| std::map<std::string, rclcpp::Time> sensor_timestamps_; | ||
| std::set<std::string> active_faults_; |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
active_faults_ is populated in report_fault() but never read or used elsewhere, so it currently just adds state and potential compiler warnings without affecting behavior. If fault clearing is handled exclusively via the REST scripts (as the comment suggests), consider removing this member or wiring it into a future report_passed()/auto-clearing mechanism to avoid dead code.
| std::set<std::string> active_faults_; | |
| [[maybe_unused]] std::set<std::string> active_faults_; |
| std::bind(&AnomalyDetectorNode::imu_callback, this, std::placeholders::_1)); | ||
|
|
||
| gps_sub_ = this->create_subscription<sensor_msgs::msg::NavSatFix>( | ||
| "/sensors/fix", 10, |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GPS subscription uses the absolute topic name /sensors/fix, but the gps_sim_node publishes GPS fixes on the relative topic "fix" under the /sensors/gps_sim node namespace (/sensors/gps_sim/fix). This mismatch means the anomaly detector never receives GPS data and will not raise GPS-related faults (e.g., NO_FIX, SENSOR_NAN, RATE_DEGRADED) despite the PR and README claiming this behavior; please align the subscription topic with the simulator's published topic.
| "/sensors/fix", 10, | |
| "/sensors/gps_sim/fix", 10, |
| <depend>ros2_medkit_msgs</depend> | ||
|
|
||
| <exec_depend>ros2launch</exec_depend> | ||
| <exec_depend>ros2_medkit_gateway</exec_depend> |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The launch file launch/demo.launch.py depends on the external packages ros2_medkit_diagnostic_bridge and ros2_medkit_fault_manager, but only ros2_medkit_gateway is declared as an <exec_depend> here. Without declaring these additional runtime dependencies, a user building this package from source may hit runtime errors when launching the demo on a system where those packages are not already present; please add them as <exec_depend> entries to keep package metadata in sync with the launch-time requirements.
| <exec_depend>ros2_medkit_gateway</exec_depend> | |
| <exec_depend>ros2_medkit_gateway</exec_depend> | |
| <exec_depend>ros2_medkit_diagnostic_bridge</exec_depend> | |
| <exec_depend>ros2_medkit_fault_manager</exec_depend> |
| // Create subscribers for MODERN path sensors (IMU, GPS) | ||
| // Note: LiDAR and Camera use LEGACY path via /diagnostics → diagnostic_bridge | ||
| imu_sub_ = this->create_subscription<sensor_msgs::msg::Imu>( | ||
| "/sensors/imu", 10, |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IMU subscription uses the absolute topic name /sensors/imu, but the imu_sim_node publishes IMU data on the relative topic "imu" under the /sensors/imu_sim node namespace (/sensors/imu_sim/imu). As a result, the anomaly detector never receives IMU messages and cannot detect IMU anomalies or timeouts as described in the demo documentation; please update this subscription (or the publisher) so the topic names align.
| "/sensors/imu", 10, | |
| "/sensors/imu_sim/imu", 10, |
| @@ -0,0 +1,270 @@ | |||
| // Copyright 2025 selfpatch | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2026, pls check also other files
| { | ||
| // Track active faults for later clearing | ||
| std::string fault_key = source + ":" + code; | ||
| active_faults_.insert(fault_key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
active_faults_ is being populated here but never actually read anywhere. Without a check like
if (active_faults_.count(fault_key))
return;
at the top of this function, we'll keep spamming the FaultManager with the same fault on every callback cycle
| } else if (param.get_name() == "scan_rate") { | ||
| // Update timer with new rate | ||
| double rate = param.as_double(); | ||
| auto period = std::chrono::duration<double>(1.0 / rate); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validation is missing in this parameter callback. If someone sets scan_rate to 0 or negative at runtime via ros2 param set, we'll get a division by zero here
probably we need rate validation as in the constructor (line 64-71)
| find_package(ros2_medkit_msgs REQUIRED) | ||
|
|
||
| # LiDAR simulator node | ||
| add_executable(lidar_sim_node src/lidar_sim_node.cpp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lidar_sim_node and camera_sim_node both include <rcl_interfaces/msg/set_parameters_result.hpp> but rcl_interfaces isn't listed in their ament_target_dependencies(). It works because it's pulled transitively, but it's cleaner to list it explicitly - prevents surprises if dependencies change upstream
|
|
||
| // Random number generation | ||
| std::mt19937 rng_; | ||
| std::normal_distribution<double> normal_dist_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
normal_dist_ is declared but never used anywhere in the camera node. Leftover?
| // Create publishers | ||
| fix_pub_ = this->create_publisher<sensor_msgs::msg::NavSatFix>("fix", 10); | ||
| diag_pub_ = this->create_publisher<diagnostic_msgs::msg::DiagnosticStatus>( | ||
| "diagnostics", 10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMU and GPS publish diagnostics to the relative diagnostics topic (so it becomes /sensors/imu_sim/diagnostics), while LiDAR and Camera publish to the absolute /diagnostics. I know this is intentional to demonstrate the "modern path" where anomaly_detector handles, but maybe a quick comment in the code would help future readers understand this wasn't an oversight.
| /// This base class is retained for future refactoring when sensors need to | ||
| /// share common fault injection logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
totally fair! Just wanted to flag that carrying unused code can sometimes cause confusion. If it's a "we'll definitely use this soon" thing, all good. If it's more speculative, maybe a TODO with a ticket number, or just remove it until it's needed (YAGNI and all that)?
Description
This PR adds a lightweight, Gazebo‑free sensor diagnostics demo for ros2_medkit, including simulated sensors, an anomaly detector, Docker/docker‑compose integration, and CI wiring. It provides REST‑driven fault injection and a walkthrough script to showcase monitoring, configuration, and fault detection.
Changes:
Related Issue
closes #13
Checklist