Skip to content

improve rcl_action tests failures non-default RMW impls.#1301

Closed
fujitatomoya wants to merge 1 commit intorollingfrom
fujitatomoya/improve-rcl_action-cyclonedds
Closed

improve rcl_action tests failures non-default RMW impls.#1301
fujitatomoya wants to merge 1 commit intorollingfrom
fujitatomoya/improve-rcl_action-cyclonedds

Conversation

@fujitatomoya
Copy link
Copy Markdown
Collaborator

Description

different approach from ros2/rmw_cyclonedds#565

this PR fixes the following failures with rmw_cyclonedds_cpp,

export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
colcon test --event-handlers console_direct+ --packages-select rcl_action
...
     3 FAILED TESTS
    -- run_test.py: return code 1
    -- run_test.py: inject classname prefix into gtest result file '/root/ros2_ws/colcon_ws/build/rcl_action/test_results/rcl_action/test_wait.gtest.xml'
    -- run_test.py: verify result file '/root/ros2_ws/colcon_ws/build/rcl_action/test_results/rcl_action/test_wait.gtest.xml'
  >>>
build/rcl_action/test_results/rcl_action/test_action_client.gtest.xml: 11 tests, 0 errors, 1 failure, 0 skipped
- rcl_action.TestActionClientBaseFixture test_action_client_init_fini (/root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_action_client.cpp:87)
  <<< failure message
    /root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_action_client.cpp:183
    Value of: rcutils_error_is_set()
      Actual: true
    Expected: false
  >>>
build/rcl_action/test_results/rcl_action/test_action_server.gtest.xml: 22 tests, 0 errors, 1 failure, 0 skipped
- rcl_action.TestActionServer test_set_internal_services_introspection_contents (/root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_action_server.cpp:896)
  <<< failure message
    /root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_action_server.cpp:287
    Value of: get_publisher_count(get_result_service_event_topic_name) == expect_publisher_count
      Actual: false
    Expected: true
  >>>
build/rcl_action/test_results/rcl_action/test_wait.gtest.xml: 6 tests, 0 errors, 3 failures, 0 skipped
- rcl_action.TestActionClientWait test_wait_set_add_action_client (/root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_wait.cpp:153)
  <<< failure message
    /root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_wait.cpp:64
    Value of: rcutils_error_is_set()
      Actual: true
    Expected: false
    rmw_subscription_get_content_filter: unimplemented, at /root/ros2_ws/colcon_ws/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/src/rmw_node.cpp:3255
  >>>
- rcl_action.TestActionClientWait test_client_wait_set_get_num_entities (/root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_wait.cpp:362)
  <<< failure message
    /root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_wait.cpp:64
    Value of: rcutils_error_is_set()
      Actual: true
    Expected: false
    rmw_subscription_get_content_filter: unimplemented, at /root/ros2_ws/colcon_ws/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/src/rmw_node.cpp:3255
  >>>
- rcl_action.TestActionClientWait test_client_wait_set_get_entities_ready (/root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_wait.cpp:527)
  <<< failure message
    /root/ros2_ws/colcon_ws/src/ros2/rcl/rcl_action/test/rcl_action/test_wait.cpp:64
    Value of: rcutils_error_is_set()
      Actual: true
    Expected: false
    rmw_subscription_get_content_filter: unimplemented, at /root/ros2_ws/colcon_ws/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/src/rmw_node.cpp:3255
  >>>

this PR just call rcutils_reset_error right after rmw_subscription_get_content_filter during the construction of subscription. this is being done in the ROS 2 system library, not issued by user application by purpose. so we should reset the error strings to avoid leaking a stale error state out of subscription init.

this can fix the many warnings at once. the thing is that, some unit tests are only running with default rmw implementation which is rmw_fastrtps that supports content filtering feature. so those warnings are not observed before unless specific rmw configuration is set to start the test.

part of ros2/rmw_cyclonedds#550

Is this user-facing behavior change?

No

Did you use Generative AI?

No

Additional Information

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
@fujitatomoya fujitatomoya self-assigned this Mar 18, 2026
Comment on lines +148 to +151
// depending on rmw implementation, this call may set an error string (e.g. "unimplemented")
// so we must call rcutils_reset_error() in both branches to avoid leaking a stale error
// state out of subscription init.
rcutils_reset_error();
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this makes sense because this is being called in ROS 2 system, not by user application. and eventually construction of subscription succeeds even if rmw implementation does not support content filtering feature. so we should reset the error strings here.

if the user application calls rmw_subscription_get_content_filter on purpose, the error strings will stay and user application expects the message unimplemented....

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm, just move it out of the if/else entirely ?

Anyway, I am confused about what is going on here, why are we calling this in the constructor in the first place and not in rcl_subscription_is_cft_supported ?

Also I can't find any place, were rcl_subscription_is_cft_supported is actually called in the whole stack.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we need to call cft API to see if that is supported or not at the construction.

Also I can't find any place, were rcl_subscription_is_cft_supported is actually called in the whole stack.

this is planned to use upper layer for improvement, #1287 and ros2/rclcpp#3034

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm, just move it out of the if/else entirely ?

The same suggestion.

@fujitatomoya
Copy link
Copy Markdown
Collaborator Author

@Barry-Xu-2018 can you review this?

CC: @jmachowinski @mjcarroll

@fujitatomoya
Copy link
Copy Markdown
Collaborator Author

fujitatomoya commented Mar 18, 2026

Pulls: #1301
Gist: https://gist.githubusercontent.com/fujitatomoya/6bb54b81b5fdcbbcee37ff602365fefd/raw/d500630426712eaaa52dfa5f7168334d18a734dd/ros2.repos
BUILD args: --packages-above-and-dependencies rcl
TEST args: --packages-above rcl
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/18533

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@fujitatomoya
Copy link
Copy Markdown
Collaborator Author

Pulls: #1301
Gist: https://gist.githubusercontent.com/fujitatomoya/9edae6ef089f4083cdcaf311cc386db9/raw/d500630426712eaaa52dfa5f7168334d18a734dd/ros2.repos
BUILD args: --packages-above-and-dependencies rcl --cmake-args -DRMW_IMPLEMENTATION=rmw_cyclonedds_cpp
TEST args: --packages-above rcl
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/18534

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@fujitatomoya
Copy link
Copy Markdown
Collaborator Author

i started 2 CI with/without --cmake-args -DRMW_IMPLEMENTATION=rmw_cyclonedds_cpp.

Comment on lines +148 to +151
// depending on rmw implementation, this call may set an error string (e.g. "unimplemented")
// so we must call rcutils_reset_error() in both branches to avoid leaking a stale error
// state out of subscription init.
rcutils_reset_error();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm, just move it out of the if/else entirely ?

Anyway, I am confused about what is going on here, why are we calling this in the constructor in the first place and not in rcl_subscription_is_cft_supported ?

Also I can't find any place, were rcl_subscription_is_cft_supported is actually called in the whole stack.

@jmachowinski
Copy link
Copy Markdown
Contributor

I just went a bit over the code and directly found more stuff that is wrong here.

rcl_subscription_is_cft_enabled(const rcl_subscription_t * subscription)

Never returns true, even if content filtering is actually active.

@iuhilnehc-ynos can you have a look at this ?

@fujitatomoya
Copy link
Copy Markdown
Collaborator Author

@jmachowinski

Never returns true, even if content filtering is actually active.

if the content filtering expression and parameters are not set (means using content filtering), this returns false. that is expected behavior.

@iuhilnehc-ynos can you have a look at this ?

unfortunately he is not working with us anymore... i do not think we can get reply from him...

@jmachowinski
Copy link
Copy Markdown
Contributor

if the content filtering expression and parameters are not set (means using content filtering), this returns false. that is expected behavior.

There is no place, were is_cft_enabled is ever set to true, this looks strange. Also there is no test that it ever returns true, only tests for false.

@fujitatomoya
Copy link
Copy Markdown
Collaborator Author

hmmm that is strange, true case should be tested with cft supported rmws... @Barry-Xu-2018 can you check on this?

for this PR, i am okay with either way. keeping the caching is_cft_supported data via construction so that it can return the true/false on next user call immediately, or as @jmachowinski suggests we can do it lazily on the 1st call of rcl_subscription_is_cft_supported and the cache the result. @Barry-Xu-2018 what do you think?

@Barry-Xu-2018
Copy link
Copy Markdown
Contributor

Anyway, I am confused about what is going on here, why are we calling this in the constructor in the first place and not in rcl_subscription_is_cft_supported ?

Whether an RMW implementation supports content filters is fixed, so I think it should be determined during subscription initialization. The upper-level rclcpp/rclpy can directly provide an is_cft_support() function, instead of attempting to configure a content filter and only discovering that the RMW implementation does not support it when it returns "unsupported." This approach makes the code clearer.

The original logic has already been optimized based on this way.
e.g.
https://github.com/ros2/rclcpp/blob/df2ac887ed518e704a3c4150a6041e16542a6d7a/rclcpp/src/rclcpp/parameter_event_handler.cpp#L77-L82

bool
ParameterEventHandler::configure_nodes_filter(const std::vector<std::string> & node_names)
{
  if (!event_subscription_->is_cft_supported()) {
    return false;
  }

If using the lazy approach, two functions need to be modified. When the RMW implementation does not support content filters and their RMW functions return "unsupported", we need to set is_cft_supported accordingly.

  • rcl_subscription_set_content_filter()
  • rcl_subscription_get_content_filter()
    So I prefer to set it up during the initialization of the subscription.

@Barry-Xu-2018
Copy link
Copy Markdown
Contributor

@jmachowinski

I just went a bit over the code and directly found more stuff that is wrong here.

rcl_subscription_is_cft_enabled(const rcl_subscription_t * subscription)

Never returns true, even if content filtering is actually active.

Here is my understanding.

bool
rcl_subscription_is_cft_enabled(const rcl_subscription_t * subscription)
{
  if (!rcl_subscription_is_valid(subscription)) {
    return false;
  }
  return subscription->impl->rmw_handle->is_cft_enabled;
}

subscription->impl->rmw_handle->is_cft_enabled is set in rmw-xxx while call configure content filter function.
In rmw_fastdds_cpp

rmw_ret_t
rmw_subscription_set_content_filter(
  rmw_subscription_t * subscription,
  const rmw_subscription_content_filter_options_t * options)
{
  RMW_CHECK_ARGUMENT_FOR_NULL(subscription, RMW_RET_INVALID_ARGUMENT);
  RMW_CHECK_ARGUMENT_FOR_NULL(options, RMW_RET_INVALID_ARGUMENT);
  RMW_CHECK_TYPE_IDENTIFIERS_MATCH(
    subscription,
    subscription->implementation_identifier,
    eprosima_fastrtps_identifier,
    return RMW_RET_INCORRECT_RMW_IMPLEMENTATION);
  rmw_ret_t ret = rmw_fastrtps_shared_cpp::__rmw_subscription_set_content_filter(
    subscription, options);
  auto info = static_cast<const CustomSubscriberInfo *>(subscription->data);
  subscription->is_cft_enabled = (info && info->filtered_topic_);   <<=== Here
  return ret;
}

Even if rmw_subscription_set_content_filter() returns successfully, it does not mean that the CFT has already taken effect. You still need to call rcl_subscription_is_cft_enabled() to determine whether the internal CFT of the RMW implementation is enabled.

@Barry-Xu-2018
Copy link
Copy Markdown
Contributor

@fujitatomoya

hmmm that is strange, true case should be tested with cft supported rmws... @Barry-Xu-2018 can you check on this?

Actually, it was tested.

Use test_subscription_content_filtered as an example,

ret = rcl_subscription_init(&subscription, this->node_ptr, ts, topic, &subscription_options);
ASSERT_EQ(RCL_RET_OK, ret) << rcl_get_error_string().str;
OSRF_TESTING_TOOLS_CPP_SCOPE_EXIT(
{
rcl_ret_t ret = rcl_subscription_fini(&subscription, this->node_ptr);
EXPECT_EQ(RCL_RET_OK, ret) << rcl_get_error_string().str;
});
bool is_cft_support = rcl_subscription_is_cft_enabled(&subscription);
ASSERT_TRUE(wait_for_established_subscription(&publisher, 10, 1000));
// publish with a non-filtered data
constexpr char test_string[] = "NotFilteredData";
{
test_msgs__msg__Strings msg;
test_msgs__msg__Strings__init(&msg);
ASSERT_TRUE(rosidl_runtime_c__String__assign(&msg.string_value, test_string));
ret = rcl_publish(&publisher, &msg, nullptr);
test_msgs__msg__Strings__fini(&msg);
ASSERT_EQ(RCL_RET_OK, ret) << rcl_get_error_string().str;
}
if (is_cft_support) {
ASSERT_FALSE(wait_for_subscription_to_be_ready(&subscription, context_ptr, 10, 100));
} else {
ASSERT_TRUE(wait_for_subscription_to_be_ready(&subscription, context_ptr, 100, 100));
test_msgs__msg__Strings msg;
test_msgs__msg__Strings__init(&msg);
OSRF_TESTING_TOOLS_CPP_SCOPE_EXIT(
{
test_msgs__msg__Strings__fini(&msg);
});
ret = rcl_take(&subscription, &msg, nullptr, nullptr);
ASSERT_EQ(RCL_RET_OK, ret) << rcl_get_error_string().str;
ASSERT_EQ(
std::string(test_string),
std::string(msg.string_value.data, msg.string_value.size));
}

You’ll notice that the test doesn’t directly check the return value of rcl_subscription_is_cft_enabled(), but instead conducts different tests based on its return value. That is, even if the rmw implementation does not support CFT, there is a corresponding branch to handle it.

@jmachowinski
Copy link
Copy Markdown
Contributor

@Barry-Xu-2018 Thanks for the explanation on where and how is_cft_enabled is set and manipulated.

This now opens new question for me :
If the mechanism is, that the rmw_create_subscription creates and fills in the handle correctly, why are we manipulating it in the first place in the rcl layer ?

E.g.:

  // Check if the subscription supports content filtering
  // If RMW implementation doesn't support content filtering, calling
  // rmw_subscription_get_content_filter() with unavailable parameters will always return
  // RMW_RET_UNSUPPORTED.
  if (rmw_subscription_get_content_filter(
    subscription->impl->rmw_handle, NULL, NULL) == RMW_RET_UNSUPPORTED)
  {
    subscription->impl->rmw_handle->is_cft_supported = false;
  } else {
    rcutils_reset_error();
    subscription->impl->rmw_handle->is_cft_supported = true;
  }

This code block should not be present at all,

subscription->impl->rmw_handle->is_cft_supported

must contain the correct value after rmw_create_subscription. Anything else I would consider a bug in the rmw implementation.

@Barry-Xu-2018
Copy link
Copy Markdown
Contributor

@jmachowinski

This now opens new question for me :
If the mechanism is, that the rmw_create_subscription creates and fills in the handle correctly, why are we manipulating it in the first place in the rcl layer ?

You’re right, this value really should be set in rmw_xxx.

At the time when we designed the RMW CFT interface, we indeed did not consider the existence of the variable subscription->impl->rmw_handle->is_cft_supported (I added this value recently. ros2/rmw#415). Now, it would be quite troublesome to require each rmw_xxx to modify and fill in this value themselves, so I used a workaround: during initialization in rcl, subscription->impl->rmw_handle->is_cft_supported is configured based on whether calling the RMW CFT interface returns “unsupported.” This way, we only need to modify one place.

@jmachowinski
Copy link
Copy Markdown
Contributor

I would argue that the design is now broken.
You have a variable that is in the RMW handle, that is modified in the RCL layer.
If it is supposed to be some RCL internal thing / is basically owned by RCL, it should be
in the rcl subscription handle. Anyway, I also dislike the workaround of calling the rmw_subscription_get_content_filter and guessing the support from the result. It sound
way more straight forward to me, to patch the RWM layer to set the correct value.

And, as only fastDDS supports this anyway, we only need to do one patch. For the rest it should
default to false anyway.

@fujitatomoya
Copy link
Copy Markdown
Collaborator Author

@jmachowinski @Barry-Xu-2018 thanks for the comments.

so after all, i agree that we should put that in rmw implementation, not in rcl. with that, we do not even need this PR i believe.
@Barry-Xu-2018 can you address that fix?

as only fastDDS supports this anyway, we only need to do one patch. For the rest it should default to false anyway.

RTI ConnextDDS also supports this.

@Barry-Xu-2018
Copy link
Copy Markdown
Contributor

@jmachowinski

I would argue that the design is now broken.
You have a variable that is in the RMW handle, that is modified in the RCL layer.
If it is supposed to be some RCL internal thing / is basically owned by RCL, it should be
in the rcl subscription handle. Anyway, I also dislike the workaround of calling the rmw_subscription_get_content_filter and guessing the support from the result. It sound
way more straight forward to me, to patch the RWM layer to set the correct value.

And, as only fastDDS supports this anyway, we only need to do one patch. For the rest it should
default to false anyway.

Thank you for your comments. I will make changes to rmw_fastdds/rmw_connext to remove this workaround at the RCL layer.

@Barry-Xu-2018
Copy link
Copy Markdown
Contributor

so after all, i agree that we should put that in rmw implementation, not in rcl. with that, we do not even need this PR i believe.

I'll handle it.

@Barry-Xu-2018
Copy link
Copy Markdown
Contributor

@jmachowinski
Copy link
Copy Markdown
Contributor

Closing this PR in favor for #1304

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants