Skip to content

Exception thrown while waiting for action result: RCLError "failed to add guard condition to wait set" #2163

@schornakj

Description

@schornakj

Bug report

Required Info:

  • Operating System:
    • Ubuntu 22.04 LTS
  • Installation type:
    • Binaries
  • Version or commit hash:
    • Humble, from the 2023-01-27 sync snapshot
  • DDS implementation:
    • CycloneDDS
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

Calling rclcpp_action::Client::async_get_result() after rclcpp_action::Client::async_send_goal() sometimes throws rclcpp::exceptions::RCLError with a "failed to add guard condition to wait set: guard condition implementation is invalid" message.

While I don't have a totally-atomic code sample that reproduces this issue independent of my project, the code in question is similar to this:

  const auto goal_handle_future = client_->async_send_goal(goal);
  if (goal_handle_future.wait_for(goal_response_timeout_) == std::future_status::timeout)
  {
    // handle failure to get response to goal request
    return false;
  }

  const auto goal_handle = goal_handle_future.get();
  if (!goal_handle)
  {
    // handle rejection of the goal request
    return false;
  }

  auto wrapped_result_future = client_->async_get_result(goal_handle);
  if (wrapped_result_future.wait_for(goal_result_timeout_) == std::future_status::timeout)
  {
    // handle timeout before action result is received
    return false;
  }

  // handle success
  return true;

Expected behavior

If the goal_handle future returned from async_send_goal() is set (indicating that the goal request was accepted), then I should always be able to use the goal_handle to wait for the action result.

Actual behavior

For some small proportion of action goals (on the order of 1 in 100), an exception is thrown:

    [test_node_main-1] terminate called after throwing an instance of 'rclcpp::exceptions::RCLError'
    [test_node_main-1]   what():  failed to add guard condition to wait set: guard condition implementation is invalid, at ./src/rcl/guard_condition.c:172, at ./src/rcl/wait.c:460
    [test_node_main-1] Stack trace (most recent call last) in thread 18342:
    [test_node_main-1] #16   Object "/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2", at 0xffffffffffffffff, in 
    [test_node_main-1] #15   Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f9fe4282bb3, in __clone
    [test_node_main-1] #14   Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f9fe41f1b42, in 
    [test_node_main-1] #13   Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f9fe44812b2, in 
    [test_node_main-1] #12   Object "/opt/ros/humble/lib/librclcpp.so", at 0x7f9fe4b418e1, in rclcpp::executors::MultiThreadedExecutor::run(unsigned long)
    [test_node_main-1] #11   Object "/opt/ros/humble/lib/librclcpp.so", at 0x7f9fe4b3aab2, in rclcpp::Executor::get_next_executable(rclcpp::AnyExecutable&, std::chrono::duration<long, std::ratio<1l, 1000000000l> >)
    [test_node_main-1] #10   Object "/opt/ros/humble/lib/librclcpp.so", at 0x7f9fe4b37d94, in rclcpp::Executor::wait_for_work(std::chrono::duration<long, std::ratio<1l, 1000000000l> >)
    [test_node_main-1] #9    Object "/opt/ros/humble/lib/librclcpp.so", at 0x7f9fe4b4ac86, in 
    [test_node_main-1] #8    Object "/opt/ros/humble/lib/librclcpp.so", at 0x7f9fe4b385ab, in rclcpp::detail::add_guard_condition_to_rcl_wait_set(rcl_wait_set_s&, rclcpp::GuardCondition const&)
    [test_node_main-1] #7    Object "/opt/ros/humble/lib/librclcpp.so", at 0x7f9fe4b34838, in rclcpp::exceptions::throw_from_rcl_error(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rcutils_error_state_s const*, void (*)())
    [test_node_main-1] #6    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f9fe445323d, in std::rethrow_exception(std::__exception_ptr::exception_ptr)
    [test_node_main-1] #5    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f9fe44532b6, in std::terminate()
    [test_node_main-1] #4    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f9fe445324b, in 
    [test_node_main-1] #3    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f9fe4447bbd, in 
    [test_node_main-1] #2    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f9fe41857f2, in abort
    [test_node_main-1] #1    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f9fe419f475, in raise
    [test_node_main-1] #0    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7f9fe41f3a7c, in pthread_kill
    [test_node_main-1] Aborted (Signal sent by tkill() 18291 0)

Additional information

Based on the stack trace, this looks like a problem at the intersection of the rclcpp action client and the MultiThreadedExecutor, which is a place where I've had issues roughly similar to this one in the past.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions