-
Notifications
You must be signed in to change notification settings - Fork 39
Parallel CPU LBVH Implementation #188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ing support - Introduced LBVH class for efficient broad phase collision detection. - Implemented LBVH node structure with AABB intersection checks. - Added methods for detecting vertex-vertex, edge-vertex, edge-edge, face-vertex, edge-face, and face-face candidates. - Integrated profiling functionality with IPC_TOOLKIT_WITH_PROFILER flag. - Updated CMakeLists to include new LBVH test files and source files. - Added tests for LBVH construction and candidate detection. - Included Python bindings for LBVH and example usage scripts. - Enhanced configuration options in config.hpp for profiling support.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #188 +/- ##
==========================================
- Coverage 97.37% 97.36% -0.02%
==========================================
Files 157 160 +3
Lines 24240 24602 +362
Branches 843 879 +36
==========================================
+ Hits 23604 23953 +349
- Misses 636 649 +13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a parallel CPU implementation of Linear Bounding Volume Hierarchy (LBVH) for broad-phase collision detection and adds performance profiling capabilities. The LBVH uses Morton codes for efficient spatial sorting and parallel construction.
Changes:
- Implemented LBVH broad-phase collision detection method with parallel construction using TBB
- Added profiler utility with CSV export and optional compilation via CMake flag
- Integrated profiling blocks into BVH and LBVH implementations
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| src/ipc/broad_phase/lbvh.hpp | Defines LBVH class with Node structure and detection methods |
| src/ipc/broad_phase/lbvh.cpp | Implements LBVH construction using Morton codes and parallel traversal |
| src/ipc/utils/profiler.hpp | Profiler header with timing macros and data structures |
| src/ipc/utils/profiler.cpp | Profiler implementation with CSV export functionality |
| tests/src/tests/broad_phase/test_lbvh.cpp | Test cases for LBVH construction and candidate detection |
| python/examples/lbvh.py | Python example demonstrating LBVH visualization |
| CMakeLists.txt | Adds profiler option and nlohmann/json dependency |
| .github/workflows/python.yml | Updates Python version to 3.14 |
| .clang-format | Removes Language and RemoveEmptyLinesInUnwrappedLines options |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Updated the LBVH Node structure to use Eigen::Array3f for AABB representation, improving compactness of Nodes. - Introduced a union in the Node structure to efficiently manage leaf and internal node data, reducing memory usage and cache misses. - Added functions for calculating 2D and 3D Morton codes in a new morton.hpp file, enhancing spatial indexing capabilities. - Modified the test suite for LBVH to include checks for candidate presence rather than exact size matches, improving robustness of collision detection tests. - Updated CMakeLists.txt to include the new morton.hpp file. - Adjusted utility functions to maintain consistency with the new data types and structures. - Updated test data with new simulation scenes
Updated bit-shift constants in morton_2D and morton_3D to use 1ULL instead of 1ul or 1, ensuring explicit 64-bit unsigned long long type. This improves type safety and portability across platforms with differing type sizes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 28 out of 28 changed files in this pull request and generated 16 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .def_readonly("left", &LBVH::Node::left) | ||
| .def_readonly("right", &LBVH::Node::right) |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Python binding exposes both left and right members of the union, but accessing these on a leaf node (where they alias primitive_id and is_inner_marker) will return incorrect values. Consider only exposing these members conditionally or adding a check in Python to ensure they're only accessed on inner nodes.
src/ipc/broad_phase/lbvh.hpp
Outdated
| // This compresses the Node size to 64 bytes (1 cache line), | ||
| // reducing cache misses during traversal. | ||
| union { | ||
| struct { | ||
| /// @brief Pointer to the left child or INVALID_POINTER in case of leaf | ||
| int left; | ||
| /// @brief Pointer to the right child or INVALID_POINTER in case of leaf | ||
| int right; | ||
| }; | ||
|
|
||
| struct { | ||
| /// @brief The primitive id (INVALID_ID <=> inner node) | ||
| /// @note We use this to distinguish leaves from internal nodes to safely access the union. | ||
| int primitive_id; | ||
|
|
||
| /// @brief Marker to indicate this is an inner node | ||
| /// If is_inner == 0 then right = 0 which is INVALID_POINTER | ||
| /// If is_inner != 0 then right = actual right pointer | ||
| int is_inner_marker; | ||
| }; | ||
| }; | ||
|
|
||
| /// @brief Default constructor | ||
| Node(); | ||
|
|
||
| /// @brief Check if this node is an inner node. | ||
| bool is_inner() const { return is_inner_marker; } |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assert message indicates the Node should be 32 bytes, but the comment on line 23 states it should be 64 bytes (1 cache line). These are contradictory. Based on the structure (2x Array3f = 2x12 bytes + 2x int = 8 bytes = 32 bytes total), the assert is correct but the comment is misleading.
src/ipc/broad_phase/lbvh.cpp
Outdated
| uint64_t code_i, | ||
| int j) | ||
| { | ||
| if (j < 0 || j > sorted_morton_codes.size() - 1) { |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition checks if i is strictly less than sorted_morton_codes.size() - 1, but this doesn't protect against an empty vector. If sorted_morton_codes is empty, size() - 1 will underflow (since size_t is unsigned), resulting in a very large number, and the check will incorrectly pass. Add a check for empty vector first.
| if (j < 0 || j > sorted_morton_codes.size() - 1) { | |
| if (j < 0 || j >= static_cast<int>(sorted_morton_codes.size())) { |
| bvh->detect_edge_edge_candidates(expected_ee_candidates); | ||
|
|
||
| CHECK(ee_candidates.size() >= expected_ee_candidates.size()); | ||
| contains_all_candidates(ee_candidates, expected_ee_candidates); |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CHECK macro without a condition will always fail. This should be CHECK(contains_all_candidates(ee_candidates, expected_ee_candidates)) to actually verify the result of the function call.
| bvh->detect_edge_face_candidates(expected_ef_candidates); | ||
|
|
||
| CHECK(ef_candidates.size() >= expected_ef_candidates.size()); | ||
| contains_all_candidates(ef_candidates, expected_ef_candidates); |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CHECK macro without a condition will always fail. This should be CHECK(contains_all_candidates(ef_candidates, expected_ef_candidates)) to actually verify the result of the function call.
| bvh->detect_face_vertex_candidates(expected_fv_candidates); | ||
|
|
||
| CHECK(fv_candidates.size() >= expected_fv_candidates.size()); | ||
| contains_all_candidates(fv_candidates, expected_fv_candidates); |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CHECK macro without a condition will always fail. This should be CHECK(contains_all_candidates(fv_candidates, expected_fv_candidates)) to actually verify the result of the function call.
| void Profiler::start(const std::string& name) | ||
| { | ||
| current_scope.push_back(name); | ||
| if (!m_data.contains(current_scope)) { | ||
| m_data[current_scope] = { | ||
| { "time_ms", 0 }, | ||
| { "count", 0 }, | ||
| }; | ||
| } | ||
| } | ||
|
|
||
| void Profiler::stop(const double time_ms) | ||
| { | ||
| const static std::string log_fmt_text = fmt::format( | ||
| "[{}] {{}} {{:.6f}} ms", | ||
| fmt::format(fmt::fg(fmt::terminal_color::magenta), "timing")); | ||
|
|
||
| logger().trace( | ||
| fmt::runtime(log_fmt_text), current_scope.to_string(), time_ms); | ||
|
|
||
| assert(m_data.contains(current_scope)); | ||
| assert(m_data.at(current_scope).contains("time_ms")); | ||
| assert(m_data.at(current_scope).contains("count")); | ||
| m_data[current_scope]["time_ms"] = | ||
| m_data[current_scope]["time_ms"].get<double>() + time_ms; | ||
| m_data[current_scope]["count"] = | ||
| m_data[current_scope]["count"].get<size_t>() + 1; | ||
| current_scope.pop_back(); | ||
| } |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The m_data JSON object is being modified concurrently from multiple threads without synchronization (lines 23-28, 43-46). Multiple threads can call start() and stop() simultaneously, leading to data races when reading/writing the JSON object. This needs proper synchronization with mutexes or atomic operations.
| // Use a fixed-size array as a stack to avoid dynamic allocations | ||
| constexpr int MAX_STACK_SIZE = 64; | ||
| int stack[MAX_STACK_SIZE]; | ||
| int stack_ptr = 0; | ||
| stack[stack_ptr++] = LBVH::Node::INVALID_POINTER; | ||
|
|
||
| int node_idx = 0; // root | ||
| do { | ||
| const LBVH::Node& node = lbvh[node_idx]; | ||
|
|
||
| // Check left and right are valid pointers | ||
| assert(node.is_inner()); | ||
|
|
||
| #if defined(__GNUC__) || defined(__clang__) | ||
| // Prefetch child nodes to reduce cache misses | ||
| __builtin_prefetch(&lbvh[node.left], 0, 1); | ||
| __builtin_prefetch(&lbvh[node.right], 0, 1); | ||
| #endif | ||
|
|
||
| const LBVH::Node& child_l = lbvh[node.left]; | ||
| const LBVH::Node& child_r = lbvh[node.right]; | ||
| const bool intersects_l = child_l.intersects(query); | ||
| const bool intersects_r = child_r.intersects(query); | ||
|
|
||
| // Query overlaps a leaf node => report collision. | ||
| if (intersects_l && child_l.is_leaf()) { | ||
| attempt_add_candidate<Candidate, swap_order, triangular>( | ||
| query, child_l, can_collide, candidates); | ||
| } | ||
| if (intersects_r && child_r.is_leaf()) { | ||
| attempt_add_candidate<Candidate, swap_order, triangular>( | ||
| query, child_r, can_collide, candidates); | ||
| } | ||
|
|
||
| // Query overlaps an internal node => traverse. | ||
| bool traverse_l = (intersects_l && !child_l.is_leaf()); | ||
| bool traverse_r = (intersects_r && !child_r.is_leaf()); | ||
|
|
||
| if (!traverse_l && !traverse_r) { | ||
| assert(stack_ptr > 0); | ||
| node_idx = stack[--stack_ptr]; | ||
| } else { | ||
| node_idx = traverse_l ? node.left : node.right; | ||
| if (traverse_l && traverse_r) { | ||
| // Postpone traversal of the right child | ||
| assert(stack_ptr < MAX_STACK_SIZE); | ||
| stack[stack_ptr++] = node.right; |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the BVH tree depth exceeds 64, the stack will overflow, causing undefined behavior. While 64 is likely sufficient for most cases, there's no runtime check to prevent this. Consider adding a runtime check and handling the overflow case gracefully, or use a dynamic stack as a fallback.
| // Use a fixed-size array as a stack to avoid dynamic allocations | ||
| constexpr int MAX_STACK_SIZE = 64; | ||
| int stack[MAX_STACK_SIZE]; | ||
| int stack_ptr = 0; | ||
| stack[stack_ptr++] = LBVH::Node::INVALID_POINTER; | ||
|
|
||
| int node_idx = 0; // root | ||
| do { | ||
| const LBVH::Node& node = lbvh[node_idx]; | ||
|
|
||
| // Check left and right are valid pointers | ||
| assert(node.is_inner()); | ||
|
|
||
| #if defined(__GNUC__) || defined(__clang__) | ||
| // Prefetch child nodes to reduce cache misses | ||
| __builtin_prefetch(&lbvh[node.left], 0, 1); | ||
| __builtin_prefetch(&lbvh[node.right], 0, 1); | ||
| #endif | ||
|
|
||
| const LBVH::Node& child_l = lbvh[node.left]; | ||
| const LBVH::Node& child_r = lbvh[node.right]; | ||
|
|
||
| // 1. Intersect 4 queries at once | ||
| // (child_l.min <= query.max) && (query.min <= child_l.max) | ||
| const simd_int4 intersects_l = (child_l.aabb_min.x() <= q_max_x) | ||
| & (child_l.aabb_min.y() <= q_max_y) | ||
| & (child_l.aabb_min.z() <= q_max_z) | ||
| & (q_min_x <= child_l.aabb_max.x()) | ||
| & (q_min_y <= child_l.aabb_max.y()) | ||
| & (q_min_z <= child_l.aabb_max.z()); | ||
|
|
||
| // 2. Intersect 4 queries at once | ||
| // (child_r.min <= query.max) && (query.min <= child_r.max) | ||
| const simd_int4 intersects_r = (child_r.aabb_min.x() <= q_max_x) | ||
| & (child_r.aabb_min.y() <= q_max_y) | ||
| & (child_r.aabb_min.z() <= q_max_z) | ||
| & (q_min_x <= child_r.aabb_max.x()) | ||
| & (q_min_y <= child_r.aabb_max.y()) | ||
| & (q_min_z <= child_r.aabb_max.z()); | ||
|
|
||
| const bool any_intersects_l = simd_any(intersects_l); | ||
| const bool any_intersects_r = simd_any(intersects_r); | ||
|
|
||
| // Query overlaps a leaf node => report collision | ||
| if (any_intersects_l && child_l.is_leaf()) { | ||
| for (int k = 0; k < n_queries; ++k) { | ||
| if (intersects_l[k]) { | ||
| attempt_add_candidate< | ||
| Candidate, swap_order, triangular>( | ||
| queries[k], child_l, can_collide, candidates); | ||
| } | ||
| } | ||
| } | ||
| if (any_intersects_r && child_r.is_leaf()) { | ||
| for (int k = 0; k < n_queries; ++k) { | ||
| if (intersects_r[k]) { | ||
| attempt_add_candidate< | ||
| Candidate, swap_order, triangular>( | ||
| queries[k], child_r, can_collide, candidates); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Query overlaps an internal node => traverse. | ||
| bool traverse_l = (any_intersects_l && !child_l.is_leaf()); | ||
| bool traverse_r = (any_intersects_r && !child_r.is_leaf()); | ||
|
|
||
| if (!traverse_l && !traverse_r) { | ||
| assert(stack_ptr > 0); | ||
| node_idx = stack[--stack_ptr]; | ||
| } else { | ||
| node_idx = traverse_l ? node.left : node.right; | ||
| if (traverse_l && traverse_r) { | ||
| // Postpone traversal of the right child | ||
| assert(stack_ptr < MAX_STACK_SIZE); | ||
| stack[stack_ptr++] = node.right; | ||
| } | ||
| } | ||
| } while (node_idx != LBVH::Node::INVALID_POINTER); // Same as root |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same stack overflow issue exists in the SIMD version. If the BVH tree depth exceeds 64, the stack will overflow. Consider adding a runtime check or using a dynamic stack as a fallback.
Description
This pull request introduces a new broad phase collision detection method, LBVH (Linear Bounding Volume Hierarchy), and adds performance profiling capabilities to the project. It also includes updates to the build system and Python bindings to support these features. The most important changes are grouped below.
LBVH Broad Phase Collision Detection
LBVHclass insrc/ipc/broad_phase/lbvh.hppand corresponding source file, providing a new broad phase method for collision detection. This includes node structure, build routines, and candidate detection logic.python/examples/lbvh.pydemonstrating LBVH usage and visualization.Performance Profiling
src/ipc/utils/profiler.cpp,src/ipc/utils/profiler.hpp) to measure and record performance metrics, with CSV output support.CMakeLists.txt,src/ipc/config.hpp.in) to allow enabling/disabling the profiler via theIPC_TOOLKIT_WITH_PROFILERoption.src/ipc/broad_phase/bvh.cpp) with profiler blocks to collect timing data for key routines.Type of change