Skip to content

Conversation

@bburda
Copy link
Contributor

@bburda bburda commented Jan 25, 2026

Description

This PR adds a lightweight, Gazebo‑free sensor diagnostics demo for ros2_medkit, including simulated sensors, an anomaly detector, Docker/docker‑compose integration, and CI wiring. It provides REST‑driven fault injection and a walkthrough script to showcase monitoring, configuration, and fault detection.

Changes:

  • Introduces simulated LiDAR, IMU, GPS, and camera nodes with configurable fault injection and diagnostic publishing.
  • Adds an anomaly detector node, parameter/config/manifest files, Dockerfile, and docker‑compose setup for running the demo plus an optional web UI.
  • Updates documentation, helper shell scripts, and CI to build the new demo image and advertise it as the primary quick‑start path.

Related Issue

closes #13

Checklist

  • Tested locally
  • README updated (if needed)

Add lightweight sensor diagnostics demo (no Gazebo required):
- CMakeLists.txt and package.xml for ROS 2 package
- Simulated sensor base class with fault injection support
- LiDAR simulator node with configurable parameters:
  - scan_rate, range_min/max, noise_stddev
  - Runtime fault injection via parameters
  - Diagnostics publisher for sensor status

This is the foundation for the sensor diagnostics demo that demonstrates
ros2_medkit's data monitoring and configuration management features.
Add remaining simulated sensor nodes with fault injection:
- IMU simulator: acceleration, angular velocity, drift simulation
- GPS simulator: NavSatFix with position noise and signal loss
- Camera simulator: RGB image generation with noise/black frames

All sensors support runtime parameter changes for fault injection:
- failure_probability for timeouts
- inject_nan for invalid values
- drift_rate for gradual sensor drift
- noise parameters for degraded operation
Add anomaly detector that monitors sensor streams and reports faults:
- Subscribes to LiDAR, IMU, GPS topics
- Detects NaN values, out-of-range readings
- Monitors message rates and timeouts
- Publishes DiagnosticArray for detected faults

Fault detection thresholds are configurable via ROS parameters:
- rate_timeout_sec
- max_nan_ratio
- sensor-specific rate minimums
Add ROS 2 launch file and configuration:
- demo.launch.py: launches all nodes with proper namespacing
  - /sensors: lidar_sim, imu_sim, gps_sim, camera_sim
  - /processing: anomaly_detector
  - /diagnostics: ros2_medkit_gateway
- sensor_params.yaml: default sensor parameters
- medkit_params.yaml: gateway configuration
- sensor_manifest.yaml: SOVD entity hierarchy definition
Add lightweight Dockerfile and docker-compose.yml:
- Based on ros:jazzy-ros-base (no Gazebo/GUI packages)
- Image size: ~500MB vs ~4GB for TurtleBot3 demo
- Startup time: ~5s vs ~60s
- Includes CI profile for headless testing
- Optional sovd-web-ui service for visualization
Add interactive demo scripts:
- run-demo.sh: walkthrough of ros2_medkit API features
- inject-noise.sh: increase sensor noise levels
- inject-failure.sh: cause sensor timeouts
- inject-nan.sh: inject NaN values
- inject-drift.sh: enable sensor drift
- restore-normal.sh: clear all faults

Add comprehensive README.md with:
- Quick start guide (Docker and source)
- API examples for all sensor operations
- Fault scenario documentation
- Parameter reference tables
- Comparison with TurtleBot3 demo
- Add sensor diagnostics demo to demos table (marked as Ready)
- Add Quick Start section with docker compose instructions
@bburda bburda self-assigned this Jan 25, 2026
Copilot AI review requested due to automatic review settings January 25, 2026 12:45
Build sensor diagnostics demo image in Docker build job.
Runs before TurtleBot3 build (faster, validates basic ros2_medkit).
The gateway requires additional packages:
- ros2_medkit_serialization (JSON/ROS message conversion)
- ros2_medkit_msgs (custom messages)
- dynmsg (dynamic message introspection)

Also add ros-jazzy-yaml-cpp-vendor to apt dependencies.
@bburda bburda force-pushed the feat/sensor-diagnostics-demo branch from 9c090fb to 2c05967 Compare January 25, 2026 12:49
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a lightweight, Gazebo‑free sensor diagnostics demo for ros2_medkit, including simulated sensors, an anomaly detector, Docker/docker‑compose integration, and CI wiring. It provides REST‑driven fault injection and a walkthrough script to showcase monitoring, configuration, and fault detection.

Changes:

  • Introduces simulated LiDAR, IMU, GPS, and camera nodes with configurable fault injection and diagnostic publishing.
  • Adds an anomaly detector node, parameter/config/manifest files, Dockerfile, and docker‑compose setup for running the demo plus an optional web UI.
  • Updates documentation, helper shell scripts, and CI to build the new demo image and advertise it as the primary quick‑start path.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
demos/sensor_diagnostics/src/lidar_sim_node.cpp LiDAR simulator node publishing LaserScan with noise, NaN, drift, and failure injection plus diagnostic status.
demos/sensor_diagnostics/src/imu_sim_node.cpp IMU simulator node publishing Imu with configurable noise, NaN, drift, and failure injection plus diagnostics.
demos/sensor_diagnostics/src/gps_sim_node.cpp GPS simulator node publishing NavSatFix with noise, drift, NaN, and loss‑of‑fix behavior plus diagnostics.
demos/sensor_diagnostics/src/camera_sim_node.cpp Camera simulator node publishing Image and CameraInfo with noise, black frames, and brightness faults plus diagnostics.
demos/sensor_diagnostics/src/anomaly_detector_node.cpp Anomaly detector node subscribing to sensor topics, detecting anomalies/timeouts/rate degradation, and publishing fault events.
demos/sensor_diagnostics/run-demo.sh Interactive shell script driving the REST API to show gateway health, sensor data, configs, and current faults.
demos/sensor_diagnostics/restore-normal.sh Helper script to reset all sensor fault‑related parameters back to normal values via the REST API.
demos/sensor_diagnostics/inject-noise.sh Helper script to inject high‑noise faults into all sensors via configuration updates.
demos/sensor_diagnostics/inject-nan.sh Helper script to enable NaN injection on LiDAR, IMU, and GPS via configuration updates.
demos/sensor_diagnostics/inject-failure.sh Helper script to force LiDAR sensor timeouts by setting failure probability to 1.0.
demos/sensor_diagnostics/inject-drift.sh Helper script to configure drift faults on LiDAR, IMU, and GPS sensors.
demos/sensor_diagnostics/restore-normal.sh Script to clear all configured faults and restore baseline sensor settings.
demos/sensor_diagnostics/package.xml Defines the sensor_diagnostics_demo ROS 2 package and its dependencies/exec deps.
demos/sensor_diagnostics/launch/demo.launch.py Launch description that starts all simulated sensors, the anomaly detector, and the ros2_medkit gateway with appropriate namespaces and configs.
demos/sensor_diagnostics/include/sensor_diagnostics/simulated_sensor_base.hpp Adds a (currently unused) base class and config struct for reusable fault‑injection behavior in simulated sensors.
demos/sensor_diagnostics/docker-compose.yml docker‑compose stack to run the sensor demo container and optional SOVD web UI, plus a CI profile variant.
demos/sensor_diagnostics/config/sensor_params.yaml Default ROS 2 parameters for each simulated sensor and the anomaly detector.
demos/sensor_diagnostics/config/sensor_manifest.yaml SOVD manifest describing areas, components, and apps for the sensor diagnostics demo.
demos/sensor_diagnostics/config/medkit_params.yaml ros2_medkit gateway configuration (HTTP server, CORS, discovery via manifest) for this demo.
demos/sensor_diagnostics/README.md Dedicated README explaining the demo, architecture, REST API examples, fault scenarios, scripts, and parameter reference.
demos/sensor_diagnostics/Dockerfile Dockerfile building a lightweight Jazzy‑based image containing ros2_medkit_gateway and the sensor diagnostics demo.
demos/sensor_diagnostics/CMakeLists.txt CMake configuration for building and installing the new simulator and anomaly detector nodes plus config/launch assets.
README.md (root) Top‑level README updated to add the Sensor Diagnostics demo and quick‑start instructions.
.github/workflows/ci.yml CI workflow updated to build the sensor diagnostics Docker image alongside the TurtleBot3 demo image.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@bburda bburda marked this pull request as draft January 25, 2026 13:01
- Add --recurse-submodules to clone dynmsg submodule
- Add ros-jazzy-example-interfaces to apt dependencies
- Skip test dependencies not in ros-base image
- Disable BUILD_TESTING for faster builds
- Use ghcr.io/selfpatch/sovd_web_ui:latest image
- Add ros2_medkit_fault_manager to Docker image
- Add fault_manager node to launch file
- Integrate anomaly_detector with ReportFault service
- Fix topic subscriptions (/sensors/scan instead of /sensors/lidar_sim/scan)
- Fix run-demo.sh jq queries for {items: []} response format
- Add libsqlite3-dev for fault_manager storage

Fault injection now properly reports to FaultManager:
- inject-nan.sh -> SENSOR_NAN faults
- inject-failure.sh -> SENSOR_TIMEOUT faults
- inject-noise.sh -> RATE_DEGRADED faults
@bburda bburda changed the title Add senosr diagnostic demo Add sensor diagnostics demo Jan 25, 2026
Demonstrates two fault reporting mechanisms in ros2_medkit:

LEGACY PATH (diagnostics → bridge):
- LiDAR and Camera publish DiagnosticArray to /diagnostics topic
- ros2_medkit_diagnostic_bridge converts and reports to fault_manager
- Auto-generates fault codes from diagnostic names (e.g., LIDAR_SIM)

MODERN PATH (direct service call):
- IMU and GPS monitored by anomaly_detector
- Anomaly detector calls /fault_manager/report_fault directly
- Uses specific fault codes (SENSOR_NAN, SENSOR_TIMEOUT, etc.)

Changes:
- Add diagnostic_bridge and fault_reporter to Dockerfile
- Update lidar_sim and camera_sim to publish DiagnosticArray
- Add diagnostic_bridge node to launch file
- Remove LiDAR monitoring from anomaly_detector
- Update all inject scripts with path documentation
- Add comprehensive dual-path documentation to README
@bburda bburda marked this pull request as ready for review January 25, 2026 18:11
@bburda bburda requested a review from Copilot January 25, 2026 18:11
@bburda bburda added the enhancement New feature or request label Jan 25, 2026
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Add missing <algorithm> include for std::clamp (lidar, camera)
- Validate rate parameters to prevent division by zero (all sensors)
- Remove unused max_nan_ratio parameter from anomaly_detector
- Remove unused lidar_rate_min from config (LiDAR uses legacy path now)
- Remove empty clear_passed_faults function and clear_timer_
- Fix DRIFT_DETECTED → DRIFTING in README fault table
- Add documentation to simulated_sensor_base.hpp explaining future use
- Add response callback to async_send_request() in anomaly_detector
  to ensure service requests are properly processed by the executor
- Fix restore-normal.sh to use correct entity paths for fault clearing:
  - Legacy path faults: /apps/diagnostic_bridge/faults/{fault_code}
  - Modern path faults: /apps/anomaly_detector/faults/{fault_code}
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +123 to +130
} else if (param.get_name() == "scan_rate") {
// Update timer with new rate
double rate = param.as_double();
auto period = std::chrono::duration<double>(1.0 / rate);
timer_ = this->create_wall_timer(
std::chrono::duration_cast<std::chrono::nanoseconds>(period),
std::bind(&LidarSimNode::publish_scan, this));
RCLCPP_INFO(this->get_logger(), "Scan rate changed to %.1f Hz", rate);
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When handling a runtime change to scan_rate, the new value is used directly to compute 1.0 / rate without validation, so setting scan_rate to 0 or a negative value via parameters will cause a division-by-zero or extremely large period at runtime. In the constructor you already guard against non-positive rates and reset the parameter; consider applying the same validation and clamping logic here before recomputing the timer period.

Copilot uses AI. Check for mistakes.
Comment on lines +151 to +154
msg.angle_min = angle_min_;
msg.angle_max = angle_max_;
msg.angle_increment = (angle_max_ - angle_min_) / static_cast<double>(num_readings_);
msg.time_increment = 0.0;
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_readings_ is taken directly from the num_readings parameter and then used as the divisor for angle_increment and as the loop bound, but there is no validation that it is positive. If a user (or YAML config) sets num_readings to 0 or a negative value, this will either trigger a division-by-zero when computing angle_increment or undefined behavior in the loop; please validate num_readings (e.g., enforce a minimum of 1) before using it.

Copilot uses AI. Check for mistakes.
ros2 launch sensor_diagnostics_demo demo.launch.py &
sleep 10 &&
curl -sf http://localhost:8080/api/v1/health &&
curl -sf http://localhost:8080/api/v1/apps | jq '.[] | .name' &&
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the CI-oriented sensor-demo-ci service, the jq filter jq '.[] | .name' assumes the /api/v1/apps response is a top-level array, but elsewhere in the repo (e.g., run-demo.sh and the README) the same endpoint is treated as an object with an items array (.items[]). This mismatch means the CI command will not actually list app names as intended; consider switching this filter to .items[] | .name (or whatever matches the gateway schema) so the CI output accurately validates and displays the discovered apps.

Suggested change
curl -sf http://localhost:8080/api/v1/apps | jq '.[] | .name' &&
curl -sf http://localhost:8080/api/v1/apps | jq '.items[] | .name' &&

Copilot uses AI. Check for mistakes.

// State tracking
std::map<std::string, rclcpp::Time> sensor_timestamps_;
std::set<std::string> active_faults_;
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

active_faults_ is populated in report_fault() but never read or used elsewhere, so it currently just adds state and potential compiler warnings without affecting behavior. If fault clearing is handled exclusively via the REST scripts (as the comment suggests), consider removing this member or wiring it into a future report_passed()/auto-clearing mechanism to avoid dead code.

Suggested change
std::set<std::string> active_faults_;
[[maybe_unused]] std::set<std::string> active_faults_;

Copilot uses AI. Check for mistakes.
std::bind(&AnomalyDetectorNode::imu_callback, this, std::placeholders::_1));

gps_sub_ = this->create_subscription<sensor_msgs::msg::NavSatFix>(
"/sensors/fix", 10,
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GPS subscription uses the absolute topic name /sensors/fix, but the gps_sim_node publishes GPS fixes on the relative topic "fix" under the /sensors/gps_sim node namespace (/sensors/gps_sim/fix). This mismatch means the anomaly detector never receives GPS data and will not raise GPS-related faults (e.g., NO_FIX, SENSOR_NAN, RATE_DEGRADED) despite the PR and README claiming this behavior; please align the subscription topic with the simulator's published topic.

Suggested change
"/sensors/fix", 10,
"/sensors/gps_sim/fix", 10,

Copilot uses AI. Check for mistakes.
<depend>ros2_medkit_msgs</depend>

<exec_depend>ros2launch</exec_depend>
<exec_depend>ros2_medkit_gateway</exec_depend>
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The launch file launch/demo.launch.py depends on the external packages ros2_medkit_diagnostic_bridge and ros2_medkit_fault_manager, but only ros2_medkit_gateway is declared as an <exec_depend> here. Without declaring these additional runtime dependencies, a user building this package from source may hit runtime errors when launching the demo on a system where those packages are not already present; please add them as <exec_depend> entries to keep package metadata in sync with the launch-time requirements.

Suggested change
<exec_depend>ros2_medkit_gateway</exec_depend>
<exec_depend>ros2_medkit_gateway</exec_depend>
<exec_depend>ros2_medkit_diagnostic_bridge</exec_depend>
<exec_depend>ros2_medkit_fault_manager</exec_depend>

Copilot uses AI. Check for mistakes.
// Create subscribers for MODERN path sensors (IMU, GPS)
// Note: LiDAR and Camera use LEGACY path via /diagnostics → diagnostic_bridge
imu_sub_ = this->create_subscription<sensor_msgs::msg::Imu>(
"/sensors/imu", 10,
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IMU subscription uses the absolute topic name /sensors/imu, but the imu_sim_node publishes IMU data on the relative topic "imu" under the /sensors/imu_sim node namespace (/sensors/imu_sim/imu). As a result, the anomaly detector never receives IMU messages and cannot detect IMU anomalies or timeouts as described in the demo documentation; please update this subscription (or the publisher) so the topic names align.

Suggested change
"/sensors/imu", 10,
"/sensors/imu_sim/imu", 10,

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,270 @@
// Copyright 2025 selfpatch

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2026, pls check also other files

{
// Track active faults for later clearing
std::string fault_key = source + ":" + code;
active_faults_.insert(fault_key);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

active_faults_ is being populated here but never actually read anywhere. Without a check like

if (active_faults_.count(fault_key)) 
 return; 

at the top of this function, we'll keep spamming the FaultManager with the same fault on every callback cycle

} else if (param.get_name() == "scan_rate") {
// Update timer with new rate
double rate = param.as_double();
auto period = std::chrono::duration<double>(1.0 / rate);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validation is missing in this parameter callback. If someone sets scan_rate to 0 or negative at runtime via ros2 param set, we'll get a division by zero here
probably we need rate validation as in the constructor (line 64-71)

find_package(ros2_medkit_msgs REQUIRED)

# LiDAR simulator node
add_executable(lidar_sim_node src/lidar_sim_node.cpp)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lidar_sim_node and camera_sim_node both include <rcl_interfaces/msg/set_parameters_result.hpp> but rcl_interfaces isn't listed in their ament_target_dependencies(). It works because it's pulled transitively, but it's cleaner to list it explicitly - prevents surprises if dependencies change upstream


// Random number generation
std::mt19937 rng_;
std::normal_distribution<double> normal_dist_;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normal_dist_ is declared but never used anywhere in the camera node. Leftover?

// Create publishers
fix_pub_ = this->create_publisher<sensor_msgs::msg::NavSatFix>("fix", 10);
diag_pub_ = this->create_publisher<diagnostic_msgs::msg::DiagnosticStatus>(
"diagnostics", 10);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMU and GPS publish diagnostics to the relative diagnostics topic (so it becomes /sensors/imu_sim/diagnostics), while LiDAR and Camera publish to the absolute /diagnostics. I know this is intentional to demonstrate the "modern path" where anomaly_detector handles, but maybe a quick comment in the code would help future readers understand this wasn't an oversight.

Comment on lines +9 to +10
/// This base class is retained for future refactoring when sensors need to
/// share common fault injection logic.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally fair! Just wanted to flag that carrying unused code can sometimes cause confusion. If it's a "we'll definitely use this soon" thing, all good. If it's more speculative, maybe a TODO with a ticket number, or just remove it until it's needed (YAGNI and all that)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Lightweight demo focusing on sensor data and configurations

3 participants