Conversation

@Czaki (Contributor) commented Aug 20, 2024

This PR is a proof of concept for speeding up triangulation of the shapes layer, using an existing project that already has a wheel-compilation pipeline set up.

If the triangulation speedup turns out to be sufficient, it will be a motivator to create a backend package for napari containing compiled code.

In comparison to the existing approach

What could be improved:

  • Rewrite part of the code directly in C++ to avoid Cython code generation and simplify debugging.

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced triangulation functionalities for polygons, including support for monotone polygons and various geometric operations.
    • Added a new library for triangulation with essential geometric data structures and algorithms.
    • Implemented new geometric utilities for intersection detection and point-segment relationships.
    • Enhanced C++ capabilities by updating the standard and adding new libraries.
    • Expanded project dependencies to include additional libraries for improved functionality.
  • Bug Fixes

    • Enhanced debugging capabilities with new debug flags and improved error handling in geometric computations.
  • Tests

    • Added comprehensive unit tests for triangulation and intersection functionalities to ensure correctness and robustness.
  • Chores

    • Updated testing configuration for improved output during test runs.

coderabbitai bot (Contributor) commented Aug 20, 2024

Walkthrough

The pull request introduces updates to the CMake configuration, Python project settings, and several source files related to geometric operations. The C++ standard is upgraded to C++17, and new libraries are added, including a triangulation library. The pyproject.toml file sees updates in dependencies and linting rules. New Cython files implement geometric algorithms, while existing files are modified to incorporate debugging utilities and intersection functionalities. Additionally, a suite of unit tests is created to validate the new features and ensure robustness.

Changes

| File | Change Summary |
| --- | --- |
| CMakeLists.txt | Updated C++ standard to C++17, added debug flags, modified library definitions to include new headers, added triangulate library, and updated installation commands. |
| pyproject.toml | Updated dependencies for numpy, added new dependencies, a new keyword, and modified linting rules. |
| src/PartSegCore_compiled_backend/triangulate.pyx | Introduced Cython implementations for triangulation, including multiple geometric functions and classes. |
| src/PartSegCore_compiled_backend/triangulation/debug_util.hpp | Added utility functions for printing data structures and an overload for the output stream operator. |
| src/PartSegCore_compiled_backend/triangulation/intersection.hpp | Added structures and functions for detecting intersections between line segments. |
| src/PartSegCore_compiled_backend/triangulation/point.hpp | Introduced geometric representations for points and segments, including arithmetic operations and hash functions. |
| src/PartSegCore_compiled_backend/triangulation/triangulate.hpp | Added functionalities for polygon triangulation, including data structures and algorithms for monotone polygons. |
| src/tests/test_triangulate.py | Created unit tests for triangulation and intersection functions, covering various geometric scenarios. |
| tox.ini | Updated test command to pytest -v for verbose output during test runs. |

Possibly related PRs

  • Add missed headers to CMakeList.txt config #54: directly related, as both PRs modify CMakeLists.txt, updating the python_add_library commands to include additional header files (my_queue.h and global_consts.h) for several libraries, which aligns with this PR's updates to library definitions.

🐰 In the land of code where rabbits hop,
New triangulation tools make our hearts stop!
With C++17, we leap and we bound,
Debugging made easy, new features abound.
From points to segments, we dance in delight,
In our geometric world, everything feels right! 🐇✨


jni added a commit to napari/napari that referenced this pull request Aug 29, 2024
# References and relevant issues

This PR is part of the work done in
4DNucleome/PartSegCore-compiled-backend#36, from
which I extracted the idea of checking whether a polygon is convex.

# Description

For a convex polygon, triangulation is a trivial problem that can be
solved in linear time. However, the current approach performs heavy
computations even for such cases.

This PR adds an additional check that slightly increases the time for
non-convex polygons, but drastically reduces it for convex ones.
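
A minimal sketch of such a convexity test (not necessarily the exact
napari implementation): a polygon is convex when the cross products of
all consecutive edge pairs share one sign.

```python
import numpy as np

def is_convex(points) -> bool:
    """Return True if the polygon given as an (N, 2) sequence is convex.

    Collinear vertices produce zero cross products and are tolerated.
    """
    pts = np.asarray(points, dtype=float)
    edges = np.diff(np.vstack([pts, pts[:1]]), axis=0)   # closed-loop edge vectors
    cross = np.cross(edges, np.roll(edges, -1, axis=0))  # z-component per consecutive edge pair
    return bool(np.all(cross >= 0) or np.all(cross <= 0))
```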

On sample data (thousands of quadrangles) it reduces the time to
triangulate faces from 20456 ms (56% of running time) to 1461 ms (8.8% of
running time).

Example:

```python
from napari.layers import Shapes
import numpy as np

DATA_SIZE = 18000
PER_LINE = int(DATA_SIZE ** (1/2))

array = np.array([(0.2, 0.1), (0.1, 0.9), (0.9, 0.9), (0.9, 0.2)]) * 10

data = []
for i in range(DATA_SIZE // PER_LINE):
    for j in range(PER_LINE):
        data.append(array + (i*10, j*10))

s = Shapes(data, shape_type='polygon')
```

I may prepare more complex examples to validate this approach if you
think that it is a good idea.

---------

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
src/PartSegCore_compiled_backend/triangulate.pyx (3)

1-4: Consider the performance impact of CYTHON_TRACE_NOGIL.

The CYTHON_TRACE_NOGIL=1 directive enables debugging but can impact performance. Consider making this conditional based on debug/release builds.
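
A sketch of one way to make it conditional; the environment-variable name and
the setuptools-style `define_macros` wiring are assumptions (this project
builds through CMake, where the equivalent guard would live in CMakeLists.txt):

```python
import os

# Hypothetical build-time switch: only define the tracing macro for debug builds.
define_macros = []
if os.environ.get("PARTSEG_DEBUG_BUILD") == "1":  # assumed flag, not an existing one
    define_macros.append(("CYTHON_TRACE_NOGIL", "1"))
```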


239-256: Consider extracting common polygon vector conversion logic.

The functions triangulate_polygon_py, triangulate_polygon_numpy, and triangulate_polygon_numpy_li share similar logic for converting input points to polygon vectors and filtering duplicate points. Consider extracting this into a helper function to reduce code duplication.

+cdef vector[Point] _convert_points_to_vector(points, reserve_size):
+    cdef vector[Point] polygon_vector
+    cdef Point p1, p2
+    
+    polygon_vector.reserve(reserve_size)
+    polygon_vector.push_back(Point(points[0][0], points[0][1]))
+    for point in points[1:]:
+        p1 = polygon_vector[polygon_vector.size() - 1]
+        p2 = Point(point[0], point[1])
+        if p1 != p2:
+            polygon_vector.push_back(p2)
+    return polygon_vector

Also applies to: 258-277, 280-308


300-302: Add documentation for polygon closure handling.

The code removes the last point if it's identical to the first point (polygon closure), but this behavior isn't documented. Consider adding a docstring note about this special case handling.

 def triangulate_polygon_numpy_li(list[cnp.ndarray[float_types, ndim=2]] polygon_li: list[np.ndarray]) -> tuple[np.ndarray, np.ndarray]:
-    """ Triangulate polygon"""
+    """ Triangulate polygon
+    
+    Note: If a polygon's last point is identical to its first point (closed polygon),
+    the last point will be removed before triangulation.
+    """

Also applies to: 392-394

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 1a3baf0 and 4a8b806.

📒 Files selected for processing (1)
  • src/PartSegCore_compiled_backend/triangulate.pyx (1 hunks)
🔇 Additional comments (2)
src/PartSegCore_compiled_backend/triangulate.pyx (2)

258-277: Enhance type safety in numpy array handling.

The function uses a fused type for float arrays but doesn't explicitly check the array's dtype. Consider adding runtime dtype validation to prevent potential memory issues.
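
A minimal sketch of such a runtime check (helper name hypothetical):

```python
import numpy as np

def _check_polygon_dtype(polygon: np.ndarray) -> None:
    # Reject dtypes the fused float type is not compiled for.
    if polygon.dtype not in (np.float32, np.float64):
        raise TypeError(f"Expected a float32 or float64 array, got {polygon.dtype}")
```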


371-408: Consider memory optimization in triangulate_polygon_with_edge_numpy_li.

The function creates multiple intermediate numpy arrays which could be memory-intensive for large polygons. Consider using pre-allocated arrays or streaming the results.
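
A sketch of the pre-allocation idea in NumPy terms (names hypothetical):

```python
import numpy as np

def _concatenate_points(points_per_polygon):
    # Allocate the final output once instead of building many intermediate arrays.
    total = sum(len(p) for p in points_per_polygon)
    out = np.empty((total, 2), dtype=np.float32)
    offset = 0
    for pts in points_per_polygon:
        out[offset:offset + len(pts)] = pts
        offset += len(pts)
    return out
```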

Comment on lines 353 to 368
```cython
def triangulate_path_edge_py(path: Sequence[Sequence[float]], closed: bool=False, limit: float=3.0, bevel: bool=False) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """ Triangulate path"""
    cdef vector[Point] path_vector
    cdef PathTriangulation result
    cdef Point p1, p2

    path_vector.reserve(len(path))
    for point in path:
        path_vector.push_back(Point(point[0], point[1]))

    result = triangulate_path_edge(path_vector, closed, limit, bevel)
    return (
        np.array([(point.x, point.y) for point in result.centers], dtype=np.float32),
        np.array([(offset.x, offset.y) for offset in result.offsets], dtype=np.float32),
        np.array([(triangle.x, triangle.y, triangle.z) for triangle in result.triangles], dtype=np.uintp),
    )
```

🛠️ Refactor suggestion

Add parameter validation for the limit parameter.

The triangulate_path_edge_py function accepts a limit parameter but doesn't validate its value. Consider adding validation to ensure it's positive.

 def triangulate_path_edge_py(path: Sequence[Sequence[float]], closed: bool=False, limit: float=3.0, bevel: bool=False) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
     """ Triangulate path"""
+    if limit <= 0:
+        raise ValueError("limit must be positive")
     cdef vector[Point] path_vector
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change:

```cython
def triangulate_path_edge_py(path: Sequence[Sequence[float]], closed: bool=False, limit: float=3.0, bevel: bool=False) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """ Triangulate path"""
    if limit <= 0:
        raise ValueError("limit must be positive")
    cdef vector[Point] path_vector
    cdef PathTriangulation result
    cdef Point p1, p2

    path_vector.reserve(len(path))
    for point in path:
        path_vector.push_back(Point(point[0], point[1]))

    result = triangulate_path_edge(path_vector, closed, limit, bevel)
    return (
        np.array([(point.x, point.y) for point in result.centers], dtype=np.float32),
        np.array([(offset.x, offset.y) for offset in result.offsets], dtype=np.float32),
        np.array([(triangle.x, triangle.y, triangle.z) for triangle in result.triangles], dtype=np.uintp),
    )
```

coderabbitai bot commented Dec 10, 2024

✅ Actions performed

Full review triggered.

coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (6)
src/PartSegCore_compiled_backend/triangulation/point.hpp (2)

64-81: Enhance the Vector struct with additional geometric operations

Consider adding common geometric operations to the Vector struct, such as dot product, cross product, magnitude, and normalization. This will enhance its utility and may simplify computations in other parts of the code.

Apply this diff to extend the Vector struct:

 struct Vector {
   // ... existing code ...
+  [[nodiscard]] coordinate_t dot(const Vector& other) const {
+    return x * other.x + y * other.y;
+  }
+
+  [[nodiscard]] coordinate_t cross(const Vector& other) const {
+    return x * other.y - y * other.x;
+  }
+
+  [[nodiscard]] coordinate_t magnitude() const {
+    return std::sqrt(x * x + y * y);
+  }
+
+  Vector normalized() const {
+    coordinate_t mag = magnitude();
+    return mag > 0 ? Vector{x / mag, y / mag} : Vector{0, 0};
+  }
 };

56-61: Consider improving the hash function for Point to handle floating-point precision

Hashing floating-point numbers can lead to collisions due to precision issues. Consider using a more robust method to combine the hashes of x and y, reducing the potential for collisions.

Apply this diff to improve the hash function:

 struct Point {
   struct PointHash {
     std::size_t operator()(const Point &p) const {
-      return std::hash<Point::coordinate_t>()(p.x) ^
-             (std::hash<Point::coordinate_t>()(p.y) << 1);
+      std::size_t hx = std::hash<Point::coordinate_t>()(p.x);
+      std::size_t hy = std::hash<Point::coordinate_t>()(p.y);
+      return hx ^ (hy + 0x9e3779b9 + (hx << 6) + (hx >> 2));
     }
   };
 };
src/PartSegCore_compiled_backend/triangulation/intersection.hpp (2)

235-238: Offer to optimize the _find_intersections function and address edge cases

The TODO comment indicates that the current implementation of _find_intersections is not optimal and needs adjustments for edge cases. Optimizing this function can enhance performance and reliability.

Would you like assistance in proposing an optimized implementation that handles edge cases more efficiently?


130-145: Clean up commented-out debugging code in _orientation function

Consider removing the large block of commented-out code within the _orientation function to improve code readability and maintainability.

src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1)

93-93: Consider optimizing the left_to_right function

There is a TODO comment indicating that the left_to_right function could be implemented more efficiently. Optimizing this function may improve performance.

Would you like assistance in optimizing this function or should we open a new GitHub issue to track this task?

pyproject.toml (1)

Line range hint 1-6: Ensure consistency between build and runtime NumPy version requirements

The build-system requires numpy>=2, while the project dependencies specify numpy>=1.25.0. This may lead to inconsistencies between the NumPy versions used during build and runtime. Consider synchronizing these version requirements to avoid potential compatibility issues.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 42c1b46 and 4a8b806.

📒 Files selected for processing (9)
  • CMakeLists.txt (3 hunks)
  • pyproject.toml (2 hunks)
  • src/PartSegCore_compiled_backend/triangulate.pyx (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/debug_util.hpp (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/intersection.hpp (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/point.hpp (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1 hunks)
  • src/tests/test_triangulate.py (1 hunks)
  • tox.ini (1 hunks)
🧰 Additional context used
📓 Learnings (1)
src/PartSegCore_compiled_backend/triangulation/point.hpp (1)
Learnt from: Czaki
PR: 4DNucleome/PartSegCore-compiled-backend#36
File: src/PartSegCore_compiled_backend/triangulation/point.hpp:0-0
Timestamp: 2024-12-05T23:09:52.930Z
Learning: In this project, `float` is preferred over `double` in geometric computations to prioritize performance.
🔇 Additional comments (14)
src/PartSegCore_compiled_backend/triangulation/point.hpp (1)

166-175: ⚠️ Potential issue

Fix the logic in point_on_line method for non-axis-aligned segments

The point_on_line method may not correctly determine if a point p lies on a non-horizontal and non-vertical segment. Specifically, it should compare p.x to x_cord and ensure p lies within the segment bounds, accounting for floating-point precision errors.

Apply this diff to fix the issue:

-    return (this->bottom.x <= x_cord && x_cord <= this->top.x);
+    return (this->bottom.y <= p.y && p.y <= this->top.y) &&
+           std::abs(p.x - x_cord) <= std::numeric_limits<coordinate_t>::epsilon();
src/PartSegCore_compiled_backend/triangulate.pyx (6)

105-127: 🛠️ Refactor suggestion

Add input validation to on_segment function

Consider adding input validation to ensure that p, q, and r are sequences of length 2. This will prevent index errors and improve robustness.


128-150: 🛠️ Refactor suggestion

Add input validation to orientation function

Add checks to ensure that p, q, and r are sequences of length 2 to prevent potential index errors.


152-171: 🛠️ Refactor suggestion

Add input validation for segments in do_intersect function

Validate that s1 and s2 are sequences of two points, each containing two floats. This will enhance error handling and prevent unexpected exceptions due to malformed input.


173-185: 🛠️ Refactor suggestion

Validate input in find_intersections function

Before processing, ensure that the segments list is not empty and that each segment contains valid point data. Adding input validation will prevent runtime errors from invalid data.


209-219: 🛠️ Refactor suggestion

Add input validation in is_convex function

Ensure that the polygon sequence contains at least three points, each with two coordinates. Input validation will prevent logical errors and unnecessary computations.
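
A sketch of the guard these validation comments ask for (helper name hypothetical):

```python
def _validate_polygon(polygon) -> None:
    # A polygon needs at least three vertices, each with exactly two coordinates.
    if len(polygon) < 3:
        raise ValueError("Polygon must contain at least three points.")
    for point in polygon:
        if len(point) != 2:
            raise ValueError("Each point must have exactly two coordinates.")
```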


353-369: 🛠️ Refactor suggestion

Add parameter validation for the limit parameter in triangulate_path_edge_py function

Ensure that the limit parameter is positive by adding input validation. This will prevent unintended behavior or runtime errors.

Apply this diff:

 def triangulate_path_edge_py(path: Sequence[Sequence[float]], closed: bool=False, limit: float=3.0, bevel: bool=False) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
     """ Triangulate path"""
+    if limit <= 0:
+        raise ValueError("limit must be positive")
     cdef vector[Point] path_vector
src/tests/test_triangulate.py (1)

1-658: Unit tests are comprehensive and well-structured

The test suite effectively covers various scenarios for the triangulation and intersection functionalities, ensuring robustness and correctness.

tox.ini (1)

31-31: LGTM

Changing the command to pytest -v for verbose output is appropriate.

pyproject.toml (1)

143-143: Verify the impact of changing skip-magic-trailing-comma to false

Changing skip-magic-trailing-comma to false may affect code formatting, particularly regarding trailing commas. Ensure that this change aligns with your project's coding standards and does not introduce formatting inconsistencies.

CMakeLists.txt (4)

131-131: LGTM! Installation target is properly configured.

The installation path for the triangulate module is consistent with other core functionality modules.


Line range hint 1-131: Address existing technical debt: Duplicate headers.

As identified in previous reviews, there are duplicate my_queue.h files in the codebase:

  • src/PartSegCore_compiled_backend/multiscale_opening/my_queue.h
  • src/PartSegCore_compiled_backend/sprawl_utils/my_queue.h

While not directly related to the triangulation changes, consider addressing this technical debt by consolidating these headers into a common include directory.


109-120: Consider reorganizing triangulation headers to prevent duplication.

The triangulate library setup looks good, but based on previous findings of duplicate headers in the project, consider:

  1. Moving triangulation headers to a common include directory
  2. Ensuring header guards are properly implemented

4-6: Verify C++17 compatibility across the codebase.

The upgrade to C++17 and addition of debug flags look good. However, ensure all existing code is compatible with C++17 features.

✅ Verification successful

C++17 upgrade is safe to proceed

The codebase appears to be well-suited for C++17:

  • No deprecated features like std::auto_ptr or register keyword were found
  • No old-style exception specifications detected
  • The codebase already uses modern C++ features and STL containers (std::array, std::vector, std::unordered_map, etc.)
  • Template usage is modern and compatible with C++17
  • No compatibility-breaking patterns were identified
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for potential C++17 compatibility issues
# Look for C++11/14 specific patterns that might need updates

# Search for auto_ptr (deprecated in C++17)
rg "std::auto_ptr"

# Search for old-style dynamic exception specifications
rg "throw\s*\([^)]*\)\s*{"

# Search for register keyword (deprecated in C++17)
rg "\bregister\b"

# Search for potential C++17 features that might be used
rg "std::(optional|variant|any|string_view|filesystem)"

Length of output: 231


Script:

#!/bin/bash
# Let's try a different approach to verify C++17 compatibility

# Check for throw specifications with fixed regex
rg "throw\s*\(" -A 1

# Look for C++17 specific headers that might be used
fd -e hpp -e cpp -e h --exec grep -l "include.*\(optional\|variant\|any\|string_view\|filesystem\)"

# Check source files to understand the codebase's C++ usage
fd -e cpp -e hpp -e h --type f --exec head -n 20 {} \;

# Look for template-related features that might need C++17
rg "template\s+<" -A 2

Length of output: 11906

Comment on lines 1144 to 1172
```cpp
if (sin_angle == 0) {
  mitter = {p1_p2_diff_norm.y / 2, -p1_p2_diff_norm.x / 2};
} else {
  point::Point::coordinate_t scale_factor = 1 / sin_angle;
  if (bevel || cos_angle < cos_limit) {
    /* Bevel join
     * There is a need to check if the inner vector is not too long
     * See https://github.com/napari/napari/pull/7268#user-content-bevel-cut
     */
    point::Point::coordinate_t estimated_len = scale_factor;
    if (prev_length < length) {
      if (estimated_len > prev_length) {
        scale_factor =
            prev_length * static_cast<point::Point::coordinate_t>(0.5);
      } else if (estimated_len < -prev_length) {
        scale_factor =
            -prev_length * static_cast<point::Point::coordinate_t>(0.5);
      }
    } else {
      if (estimated_len > length) {
        scale_factor = length * static_cast<point::Point::coordinate_t>(0.5);
      } else if (estimated_len < -length) {
        scale_factor = -length * static_cast<point::Point::coordinate_t>(0.5);
      }
    }
  }
  mitter = (p1_p2_diff_norm - p2_p3_diff_norm) * scale_factor * 0.5;
};
if (bevel || cos_angle < cos_limit) {
```

⚠️ Potential issue

Handle potential numerical instability when sin_angle is close to zero

In the calculation of scale_factor, division by sin_angle can cause numerical instability if sin_angle is very close to zero. Consider adding a threshold to handle cases when sin_angle is near zero to avoid division by very small numbers.

Comment on lines +949 to +955
```cpp
std::unordered_set<point::Point> visited;
for (const auto &polygon : polygon_list) {
  for (const auto &point : polygon) {
    if (visited.count(point) == 0) {
      result.push_back(point);
      visited.insert(point);
    }
```

⚠️ Potential issue

Ensure point::Point has appropriate hash and equality operators for unordered containers

The code uses std::unordered_set<point::Point> and std::unordered_map<point::Point, ...>, which require point::Point to have std::hash specialization and operator== correctly implemented. Please verify that these are defined to ensure correct behavior.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
src/PartSegCore_compiled_backend/triangulate.pyx (1)

280-322: Refactor duplicated code in triangulate_polygon_numpy_li

The function triangulate_polygon_numpy_li has duplicated loops for handling float32 and float64 data types. Consider refactoring to reduce code duplication and improve maintainability.

You can handle both data types in a single loop by casting the array elements to <float>:

 def triangulate_polygon_numpy_li(list polygon_li: list[np.ndarray]) -> tuple[np.ndarray, np.ndarray]:
     """ Triangulate polygon"""
     cdef vector[Point] polygon_vector
     cdef vector[vector[Point]] polygon_vector_list
     cdef Point p1, p2
     cdef pair[vector[Triangle], vector[Point]] result

-    cdef size_t i;
-    cdef cnp.ndarray[cnp.float32_t, ndim=2] polygon32
-    cdef cnp.ndarray[cnp.float64_t, ndim=2] polygon64

     polygon_vector_list.reserve(len(polygon_li))
     for polygon in polygon_li:
         polygon_vector.clear()
         polygon_vector.reserve(polygon.shape[0])
         polygon_vector.push_back(Point(polygon[0, 0], polygon[0, 1]))

-        if polygon.dtype == np.float32:
-            polygon32 = polygon
-            for i in range(1, polygon.shape[0]):
-                p1 = polygon_vector.back()
-                p2 = Point(polygon32[i, 0], polygon32[i, 1])
-                if p1 != p2:
-                    polygon_vector.push_back(p2)
-        else:
-            polygon64 = polygon
-            for i in range(1, polygon.shape[0]):
-                p1 = polygon_vector.back()
-                p2 = Point(polygon64[i, 0], polygon64[i, 1])
-                if p1 != p2:
-                    polygon_vector.push_back(p2)
+        for i in range(1, polygon.shape[0]):
+            p1 = polygon_vector.back()
+            p2 = Point(<float>polygon[i, 0], <float>polygon[i, 1])
+            if p1 != p2:
+                polygon_vector.push_back(p2)

         if polygon_vector.size() > 1 and polygon_vector.front() == polygon_vector.back():
             polygon_vector.pop_back()
         polygon_vector_list.push_back(polygon_vector)

This approach simplifies the code and handles both data types uniformly.

src/tests/test_triangulate.py (2)

259-408: Add assertions to verify triangulation in segfault test cases

The tests test_triangulate_polygon_segfault1 to test_triangulate_polygon_segfault5 currently call triangulate_polygon_py but do not assert any outcomes. Consider adding assertions to verify that the function returns valid results or handles errors gracefully.


562-649: Expand test coverage for triangulate_path_edge_py

In the test test_triangulate_path_edge_py, the expected output is compared with hardcoded values. Ensure that the test cases cover various edge cases such as empty paths, paths with duplicate points, and non-numeric inputs to fully validate the function's robustness.
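
A sketch of such additional cases; the import path and the expectation that
duplicate points are tolerated are assumptions, and only the return contract
visible in the wrapper (two (N, 2) arrays and one (M, 3) array) is asserted:

```python
import pytest

from PartSegCore_compiled_backend.triangulate import triangulate_path_edge_py

@pytest.mark.parametrize(
    "path",
    [
        [(0, 0), (1, 0), (1, 1)],          # simple open path
        [(0, 0), (0, 0), (1, 0), (1, 1)],  # duplicate consecutive points
    ],
)
def test_triangulate_path_edge_py_shapes(path):
    centers, offsets, triangles = triangulate_path_edge_py(path)
    assert centers.shape[1] == 2
    assert offsets.shape[1] == 2
    assert triangles.shape[1] == 3
```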

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 4a8b806 and 089b5e4.

📒 Files selected for processing (2)
  • src/PartSegCore_compiled_backend/triangulate.pyx (1 hunks)
  • src/tests/test_triangulate.py (1 hunks)
🔇 Additional comments (8)
src/PartSegCore_compiled_backend/triangulate.pyx (8)

105-126: Add input validation for function arguments in on_segment

The function on_segment does not validate that the sequences p, q, and r each have exactly two elements. This can lead to index errors if incorrect inputs are provided.


128-149: Ensure consistent input validation in orientation function

Similarly to on_segment, the orientation function should validate that p, q, and r are sequences of length two to prevent potential index errors.


152-170: Add input validation for segment inputs in do_intersect

In the do_intersect function, consider adding checks to ensure that s1 and s2 are sequences containing exactly two points, and each point has exactly two coordinates.


173-185: Check for empty or invalid segment lists in find_intersections

Before processing, verify that the input segments is not empty and that each segment contains valid point data. Adding input validation will prevent runtime errors from invalid data and enhance the function's robustness.


209-219: Validate polygon input in is_convex function

Ensure that the polygon sequence contains at least three points, each with two coordinates. A polygon with fewer than three points cannot be convex, and input validation will prevent logical errors and unnecessary computations.


366-381: Validate the limit parameter in triangulate_path_edge_py

The limit parameter in triangulate_path_edge_py should be validated to ensure it is positive. Negative or zero values may cause unexpected behavior.


239-256: ⚠️ Potential issue

Add input validation for polygon in triangulate_polygon_py

The function triangulate_polygon_py does not check whether the input polygon has at least three points required for a valid polygon. Adding input validation will prevent runtime errors and improve robustness.

Apply this diff to add input validation:

 def triangulate_polygon_py(polygon: Sequence[Sequence[float]]) -> tuple[list[tuple[int, int, int]], list[tuple[float, float]]]:
     """ Triangulate polygon"""
+    if len(polygon) < 3:
+        raise ValueError("Polygon must contain at least three points.")
     cdef vector[Point] polygon_vector
     cdef Point p1, p2
     cdef pair[vector[Triangle], vector[Point]] result

Likely invalid or redundant comment.


258-278: ⚠️ Potential issue

Add input validation for polygon in triangulate_polygon_numpy

The function triangulate_polygon_numpy does not validate that the input array polygon has at least three points. This may lead to errors during processing.

Apply this diff to add input validation:

 def triangulate_polygon_numpy(cnp.ndarray[float_types, ndim=2] polygon: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
     """ Triangulate polygon"""
+    if polygon.shape[0] < 3:
+        raise ValueError("Polygon must contain at least three points.")
     cdef vector[Point] polygon_vector
     cdef Point p1, p2
     cdef pair[vector[Triangle], vector[Point]] result

Likely invalid or redundant comment.

@Czaki (Contributor, Author) commented Dec 16, 2024

@coderabbitai full review

coderabbitai bot commented Dec 16, 2024

✅ Actions performed

Full review triggered.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
src/PartSegCore_compiled_backend/triangulate.pyx (4)

1-4: Add documentation for compiler directives

Consider adding comments explaining the performance implications of each compiler directive:

  • boundscheck=False: Disables bounds checking for improved performance
  • wraparound=False: Disables negative indexing
  • nonecheck=False: Disables None checking
  • cdivision=True: Disables division by zero checking
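
A sketch of how those directives could be documented at the top of the .pyx
file, using Cython's standard directive-comment syntax:

```python
# cython: boundscheck=False  # skip index bounds checks in hot loops (unsafe on bad input)
# cython: wraparound=False   # disable Python-style negative indexing
# cython: nonecheck=False    # assume object arguments are never None
# cython: cdivision=True     # C division semantics; no ZeroDivisionError checks
```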

20-99: Add docstrings for C++ classes

Consider adding Python docstrings for the C++ classes to improve API documentation. This will help users understand the purpose and usage of each class when viewing the Python API documentation.


297-313: Refactor duplicate type handling code

Consider extracting the duplicate code for float32/float64 handling into a separate helper function to improve maintainability.

+cdef void process_polygon_points(vector[Point]& polygon_vector, cnp.ndarray polygon, size_t start_idx):
+    cdef Point p1, p2
+    cdef cnp.ndarray[cnp.float32_t, ndim=2] polygon32
+    cdef cnp.ndarray[cnp.float64_t, ndim=2] polygon64
+    
+    if polygon.dtype == np.float32:
+        polygon32 = polygon
+        for i in range(start_idx, polygon.shape[0]):
+            p1 = polygon_vector.back()
+            p2 = Point(polygon32[i, 0], polygon32[i, 1])
+            if p1 != p2:
+                polygon_vector.push_back(p2)
+    else:
+        polygon64 = polygon
+        for i in range(start_idx, polygon.shape[0]):
+            p1 = polygon_vector.back()
+            p2 = Point(polygon64[i, 0], polygon64[i, 1])
+            if p1 != p2:
+                polygon_vector.push_back(p2)

-        if polygon.dtype == np.float32:
-            polygon32 = polygon
-            for i in range(1, polygon.shape[0]):
-                p1 = polygon_vector.back()
-                p2 = Point(polygon32[i, 0], polygon32[i, 1])
-                if p1 != p2:
-                    # prevent from adding polygon edge of width 0
-                    polygon_vector.push_back(p2)
-        else:
-            polygon64 = polygon
-            for i in range(1, polygon.shape[0]):
-                p1 = polygon_vector.back()
-                p2 = Point(polygon64[i, 0], polygon64[i, 1])
-                if p1 != p2:
-                    # prevent from adding polygon edge of width 0
-                    polygon_vector.push_back(p2)
+        process_polygon_points(polygon_vector, polygon, 1)

1-422: Consider splitting functionality into smaller modules

The file implements a wide range of geometric operations. Consider splitting it into smaller, focused modules:

  • Basic geometric operations (point, segment)
  • Intersection detection
  • Triangulation algorithms
  • Path processing

This would improve maintainability and make the codebase more modular.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fff4293 and 2749fa4.

📒 Files selected for processing (2)
  • src/PartSegCore_compiled_backend/triangulate.pyx (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/debug_util.hpp (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/PartSegCore_compiled_backend/triangulation/debug_util.hpp
🔇 Additional comments (3)
src/PartSegCore_compiled_backend/triangulate.pyx (3)

367-382: Input validation was previously suggested

A past review already suggested adding validation for the limit parameter.


259-278: Performance optimization was previously suggested

A past review already suggested using typed memoryviews for improved performance.


385-422: 🛠️ Refactor suggestion

Add input validation for empty polygon list

Consider adding validation to handle empty input lists gracefully:

 def triangulate_polygon_with_edge_numpy_li(polygon_li: list[np.ndarray]) -> tuple[tuple[np.ndarray, np.ndarray], tuple[np.ndarray, np.ndarray, np.ndarray]]:
     """ Triangulate polygon"""
+    if not polygon_li:
+        raise ValueError("Empty polygon list")
+
     cdef vector[Point] polygon_vector
     cdef vector[vector[Point]] polygon_vector_list

Likely invalid or redundant comment.

coderabbitai bot left a comment

Actionable comments posted: 2

🔭 Outside diff range comments (1)
pyproject.toml (1)

Line range hint 3-3: Ensure consistent numpy version requirements

The [build-system] requires numpy>=2, while [project.dependencies] specifies numpy>=1.25.0. This inconsistency may lead to build or runtime issues due to incompatible numpy versions. Please align the version requirements to ensure consistency across the build and runtime environments.

Also applies to: 48-48

♻️ Duplicate comments (3)
src/PartSegCore_compiled_backend/triangulation/point.hpp (1)

166-175: ⚠️ Potential issue

Review the logic in the point_on_line method.

The point_on_line method may not accurately determine if a point lies on the segment for all cases:

  • For non-axis-aligned segments, the comparison this->bottom.x <= x_cord && x_cord <= this->top.x might not correctly validate the point's position.
  • The method compares x_cord (calculated from p.y) to bottom.x and top.x, which could lead to incorrect results.

This issue was previously raised. Please consider revisiting the logic and applying the suggested fixes to handle all edge cases correctly.

src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1)

1152-1168: ⚠️ Potential issue

Handle potential division by zero when sin_angle is close to zero

In the calculation of scale_factor, dividing by sin_angle can cause a division by zero or numerical instability if sin_angle is zero or very close to zero. Consider adding a threshold to handle cases when sin_angle is near zero to avoid such issues.

Apply this diff to handle small sin_angle values:

 if (sin_angle == 0) {
   mitter = {p1_p2_diff_norm.y / 2, -p1_p2_diff_norm.x / 2};
 } else {
+  constexpr double epsilon = 1e-8;
+  if (std::abs(sin_angle) < epsilon) {
+    sin_angle = (sin_angle < 0) ? -epsilon : epsilon;
+  }
   point::Point::coordinate_t scale_factor = 1 / sin_angle;
   if (bevel || cos_angle < cos_limit) {
src/PartSegCore_compiled_backend/triangulation/debug_util.hpp (1)

53-57: ⚠️ Potential issue

Avoid overloading operator<< for std::pair in the global namespace

Defining new functions in the std namespace is undefined behavior. While you've defined the operator<< overload for std::pair<T1, T2> within your own namespace partsegcore, overloading operators for types in the std namespace can lead to unpredictable results.

Consider creating a wrapper function or a custom struct to handle printing pairs without modifying the behavior of standard library types.

🧹 Nitpick comments (3)
src/PartSegCore_compiled_backend/triangulation/point.hpp (1)

22-23: Be cautious with floating-point equality comparisons.

Using == and != operators for floating-point comparisons can lead to unexpected results due to precision issues. Consider implementing an approximate equality check using an epsilon value to account for floating-point inaccuracies.
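
The same idea sketched in Python terms (a C++ version would compare against a
chosen epsilon inside operator==):

```python
import math

def points_equal(p, q, rel_tol=1e-6, abs_tol=1e-9):
    # Approximate comparison instead of exact floating-point equality.
    return (math.isclose(p[0], q[0], rel_tol=rel_tol, abs_tol=abs_tol)
            and math.isclose(p[1], q[1], rel_tol=rel_tol, abs_tol=abs_tol))
```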

src/tests/test_triangulate.py (1)

259-408: Enhance test coverage by adding assertions in test_triangulate_polygon_segfault* functions.

The test_triangulate_polygon_segfault* functions currently execute triangulate_polygon_py with specific polygons but do not include assertions to verify the results. To ensure these tests effectively validate functionality, consider adding assertions to check the correctness of the output or to confirm that no exceptions are raised.

CMakeLists.txt (1)

109-120: Well-structured triangulation module with proper separation of concerns

The triangulation module is well-organized with separate headers for core functionality (triangulate.hpp), intersection detection (intersection.hpp), geometric primitives (point.hpp), and debugging (debug_util.hpp). This aligns well with the PR's objective of implementing performant triangulation.

A few suggestions for consideration:

  1. Consider adding header documentation explaining the sweeping line algorithm's implementation
  2. The debug utilities could be conditionally included based on build type
 python_add_library(
         triangulate MODULE
         "${triangulate_cython_cxx}"
         src/PartSegCore_compiled_backend/triangulation/triangulate.hpp
         src/PartSegCore_compiled_backend/triangulation/intersection.hpp
         src/PartSegCore_compiled_backend/triangulation/point.hpp
-        src/PartSegCore_compiled_backend/triangulation/debug_util.hpp
+        $<$<CONFIG:Debug>:src/PartSegCore_compiled_backend/triangulation/debug_util.hpp>
         WITH_SOABI)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ceddeac and 2749fa4.

📒 Files selected for processing (9)
  • CMakeLists.txt (3 hunks)
  • pyproject.toml (2 hunks)
  • src/PartSegCore_compiled_backend/triangulate.pyx (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/debug_util.hpp (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/intersection.hpp (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/point.hpp (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1 hunks)
  • src/tests/test_triangulate.py (1 hunks)
  • tox.ini (1 hunks)
🧰 Additional context used
📓 Learnings (1)
src/PartSegCore_compiled_backend/triangulation/point.hpp (1)
Learnt from: Czaki
PR: 4DNucleome/PartSegCore-compiled-backend#36
File: src/PartSegCore_compiled_backend/triangulation/point.hpp:0-0
Timestamp: 2024-12-05T23:09:52.930Z
Learning: In this project, `float` is preferred over `double` in geometric computations to prioritize performance.
🔇 Additional comments (9)
src/PartSegCore_compiled_backend/triangulate.pyx (5)

106-127: Add input validation for on_segment function arguments.

The on_segment function should validate that p, q, and r each contain exactly two elements. This ensures robustness and prevents potential index errors with malformed inputs.


129-150: Add input validation for orientation function arguments.

Similarly, ensure that p, q, and r in the orientation function have exactly two elements to prevent index errors and enhance reliability.


153-171: Validate segment inputs in do_intersect function.

The do_intersect function should check that s1 and s2 are sequences of two points, each with two coordinates. Input validation will prevent runtime exceptions due to improper input formats.


210-220: Ensure valid input in is_convex function.

Add a check to confirm that the polygon sequence contains at least three points, each with two coordinates. A polygon with fewer than three points cannot be convex.


367-383: Validate the limit parameter in triangulate_path_edge_py.

The limit parameter should be positive. Adding validation ensures that the function behaves as expected and prevents potential mathematical errors.

src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1)

949-955: Ensure point::Point has appropriate hash and equality operators for unordered containers

The use of std::unordered_set<point::Point> and std::unordered_map<point::Point, ...> requires that point::Point has a std::hash specialization and an appropriate operator== implementation. Please verify that these are defined to ensure correct behavior.

tox.ini (1)

31-31: Change is acceptable

Adding the -v flag to pytest increases verbosity, providing more detailed test output which can aid in debugging.

CMakeLists.txt (2)

Line range hint 32-120: Consider consolidating duplicate header files

The codebase has duplicate copies of my_queue.h in:

  • src/PartSegCore_compiled_backend/multiscale_opening/my_queue.h
  • src/PartSegCore_compiled_backend/sprawl_utils/my_queue.h

Multiple modules depend on these headers. Consider consolidating them into a common include directory for better maintainability.


4-6: Verify C++17 compatibility across the codebase

The upgrade to C++17 is good for modern features, but we should ensure compatibility with existing code and build environments.

✅ Verification successful

C++17 upgrade is compatible with the codebase

The codebase analysis shows:

  • Modern C++ features used are compatible with C++17 (templates, auto, noexcept)
  • No C++17-specific features (std::optional, std::filesystem, etc.) are currently used
  • CMake minimum version (3.21) supports C++17
  • Code is primarily template-based numerical/geometric algorithms that will benefit from C++17 optimizations
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for C++17 specific features or potential compatibility issues
rg -l '\b(std::optional|std::variant|std::any|std::string_view|std::filesystem)\b|if\s+constexpr|structured\s+bindings|namespace\s+\w+\s*=|template\s*<\s*auto'

Length of output: 161


Script:

#!/bin/bash
# Let's try a different approach to analyze C++ code and compiler requirements

# 1. Find all C++ source files
echo "=== C++ Source Files ==="
fd -e cpp -e hpp -e h -e cxx -e cc

# 2. Check for any explicit compiler requirements or version checks
echo -e "\n=== Compiler Version Checks ==="
rg -i "requires.*compiler|compiler.*version|c\+\+\s*1[1-7]"

# 3. Look for modern C++ features that might need C++17
echo -e "\n=== Modern C++ Features ==="
rg -i "constexpr|noexcept|nullptr|auto|decltype|static_assert|using\s+=|template|lambda"

# 4. Check for any build system version requirements
echo -e "\n=== Build System Requirements ==="
rg -i "cmake_minimum_required|project\s*\(|set\s*\(\s*CMAKE_"

Length of output: 15245

Comment on lines +235 to +238
```cpp
// Current implementation is not optimal, but it is was
// faster to use this to have initial working version.
// TODO based on commented code fix edge cases.
for (const auto &active_el : active) {
```

🛠️ Refactor suggestion

Optimize _find_intersections and handle edge cases.

The current implementation includes a TODO comment indicating that the _find_intersections function is not optimal and has edge cases to address. Optimizing this function will improve performance and reliability, especially with large datasets.

Would you like assistance in optimizing this function and resolving the edge cases? I can help propose an improved implementation if needed.

Comment on lines 290 to 296
```cpp
for (auto &polygon : interval->polygons_list) {
  polygon->bottom = p;
  monotone_polygons.push_back(*polygon);
  delete polygon;
}

segment_to_line.erase(edge_left);
```

🛠️ Refactor suggestion

Use smart pointers to manage dynamic memory and prevent leaks

The code manually allocates and deallocates MonotonePolygon objects using new and delete, which can lead to memory leaks and undefined behavior if exceptions occur. Consider using std::unique_ptr or std::shared_ptr to automatically manage the lifetime of these objects and ensure exception safety.

Apply this diff to replace raw pointers with std::unique_ptr:

 struct Interval {
   point::Point last_seen{};
   point::Segment left_segment;
   point::Segment right_segment;
-  std::vector<MonotonePolygon *> polygons_list{};
+  std::vector<std::unique_ptr<MonotonePolygon>> polygons_list{};

   Interval() = default;

And update the usage in the process_end_point method:

 for (auto &polygon : interval->polygons_list) {
   polygon->bottom = p;
-  monotone_polygons.push_back(*polygon);
-  delete polygon;
+  monotone_polygons.push_back(std::move(*polygon));
 }

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1)

162-170: Simplify point ordering in PointTriangle constructor

The logic for sorting points in clockwise order can be simplified for better readability and efficiency. Consider using a utility function or a standard library algorithm to achieve the desired ordering.
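
A sketch of the simplification in Python terms: one signed-area test decides
whether two vertices need to be swapped (counter-clockwise-positive convention
assumed):

```python
def order_clockwise(p0, p1, p2):
    # Signed area of the triangle: positive means counter-clockwise.
    signed_area = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])
    if signed_area > 0:
        p1, p2 = p2, p1  # one swap flips orientation to clockwise
    return p0, p1, p2
```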

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2749fa4 and c3dd09f.

📒 Files selected for processing (1)
  • src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1 hunks)
🔇 Additional comments (3)
src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (3)

1159-1159: Handle potential division by zero when sin_angle is zero

In the calculation of scale_factor, dividing by sin_angle can lead to a division by zero when sin_angle is zero. Consider adding a check to prevent division by zero or handling cases where sin_angle approaches zero to ensure numerical stability.


950-956: Ensure point::Point has appropriate hash and equality functions for unordered containers

The code uses std::unordered_set<point::Point> and std::unordered_map<point::Point, ...>, which require point::Point to have std::hash specialization and operator== properly implemented. Verify that these are defined to guarantee correct behavior when storing point::Point in unordered containers.


1156-1160: ⚠️ Potential issue

Guard against division by zero when calculating scale_factor

The computation of scale_factor involves division by sin_angle, which can be zero. To prevent a division by zero error, add a check for sin_angle == 0 before performing the division.

Apply this diff to add the safeguard:

 if (sin_angle == 0) {
   mitter = {p1_p2_diff_norm.y / 2, -p1_p2_diff_norm.x / 2};
 } else {
+  if (sin_angle == 0) {
+    // Handle the zero case appropriately
+    sin_angle = std::numeric_limits<point::Point::coordinate_t>::epsilon();
+  }
   point::Point::coordinate_t scale_factor = 1 / sin_angle;

Likely invalid or redundant comment.

Comment on lines 293 to 297
```cpp
auto interval = segment_to_line.at(edge_left);
for (auto &polygon : interval->polygons_list) {
  polygon->bottom = p;
  monotone_polygons.push_back(*polygon);
}
```
coderabbitai bot commented Dec 16, 2024

⚠️ Potential issue

Avoid potential dangling pointers by using smart pointers consistently

While std::unique_ptr is used in Interval::polygons_list, there is a dereference and copy to monotone_polygons without transferring ownership:

```cpp
for (auto &polygon : interval->polygons_list) {
  polygon->bottom = p;
  monotone_polygons.push_back(*polygon);
}
```

This can lead to potential memory issues if not handled carefully. Consider moving the unique_ptr or copying the object appropriately to ensure proper memory management.

Apply this diff to transfer ownership:

-  monotone_polygons.push_back(*polygon);
+  monotone_polygons.push_back(std::move(*polygon));

Czaki (Contributor, Author) replied:

monotone_polygons is a list of MonotonePolygon objects, not a list of pointers, so this method creates a copy of polygon to store as the result.

coderabbitai bot replied:

@Czaki, thank you for the clarification. Since monotone_polygons is a list of MonotonePolygon objects, copying the object as you've implemented is appropriate. The code correctly creates a copy of polygon to store in the results.


✏️ Learnings added
Learnt from: Czaki
PR: 4DNucleome/PartSegCore-compiled-backend#36
File: src/PartSegCore_compiled_backend/triangulation/triangulate.hpp:293-297
Timestamp: 2024-12-16T14:31:31.364Z
Learning: In `src/PartSegCore_compiled_backend/triangulation/triangulate.hpp`, the `monotone_polygons` variable is a `std::vector` of `MonotonePolygon` objects (not pointers). Copying objects from a vector of `std::unique_ptr<MonotonePolygon>` into `monotone_polygons` by dereferencing the pointers (using `*polygon`) is acceptable and ensures proper copying without the need to transfer ownership.


Comment on lines 363 to 365
```cpp
if (segment_to_line.count(edge_top) == 0) {
  throw std::runtime_error("Segment not found in the map1");
}
```

🛠️ Refactor suggestion

Handle missing segments gracefully to prevent exceptions

In the _process_normal_point method, an exception is thrown if edge_top is not found in segment_to_line. This could terminate the program unexpectedly. Consider handling this scenario gracefully or ensuring that edge_top always exists in the map before calling this function.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (2)

469-475: Use std::make_unique instead of raw new for exception safety

The code manually allocates MonotonePolygon objects using new, which can lead to memory leaks if exceptions occur before the pointer is stored. Consider using std::make_unique to manage the object's lifetime automatically and improve exception safety.

Apply this diff to replace new with std::make_unique:

-MonotonePolygon *new_polygon = nullptr;
-if (interval->polygons_list[0]->right.empty()) {
-  new_polygon = new MonotonePolygon(interval->polygons_list[0]->top);
-} else {
-  new_polygon = new MonotonePolygon(interval->polygons_list[0]->right.back());
+std::unique_ptr<MonotonePolygon> new_polygon;
+if (interval->polygons_list[0]->right.empty()) {
+  new_polygon = std::make_unique<MonotonePolygon>(interval->polygons_list[0]->top);
+} else {
+  new_polygon = std::make_unique<MonotonePolygon>(interval->polygons_list[0]->right.back());
 }
 new_polygon->left.push_back(p);
-new_interval->polygons_list.emplace_back(new_polygon);
+new_interval->polygons_list.emplace_back(std::move(new_polygon));
 interval->polygons_list[0]->right.push_back(p);

104-129: Consider adding unit tests for left_to_right function

The left_to_right function determines the relative positioning of segments and is critical for the algorithm's correctness. Consider adding unit tests to ensure it behaves correctly in various scenarios, including overlapping segments and shared endpoints.
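
A sketch of what such a test could look like; the Segment type and the left_to_right comparator below are simplified stand-ins so the example compiles on its own, and the real signature in triangulate.hpp may differ:

#include <cassert>

struct Point { float x, y; };           // hypothetical stand-in
struct Segment { Point bottom, top; };  // hypothetical stand-in

// Toy comparator so this sketch is runnable: orders two segments sharing a
// bottom endpoint by the x coordinate of their top endpoints.
bool left_to_right(const Segment &a, const Segment &b) {
  return a.top.x < b.top.x;
}

int main() {
  Segment left{{0, 0}, {0, 1}};
  Segment right{{0, 0}, {1, 1}};
  assert(left_to_right(left, right));   // shared bottom endpoint
  assert(!left_to_right(right, left));  // ordering must be antisymmetric
  Segment same = left;
  assert(!left_to_right(left, same));   // a segment is not left of itself
  return 0;
}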

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c3dd09f and 8d62d63.

📒 Files selected for processing (1)
  • src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1 hunks)
🔇 Additional comments (3)
src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (3)

363-365: Handle missing segments gracefully to prevent exceptions

In the _process_normal_point method, an exception is thrown if edge_top is not found in segment_to_line. This could terminate the program unexpectedly. Consider handling this scenario gracefully or ensuring that edge_top always exists in the map before calling this function.


950-956: Ensure point::Point has appropriate hash and equality operators

The code uses std::unordered_set<point::Point> and std::unordered_map<point::Point, ...>, which require point::Point to have a std::hash specialization and operator== correctly implemented. Please verify that these are defined to ensure correct behavior.
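
A minimal sketch of the two required pieces, assuming float coordinates as in point.hpp; the project may already provide both, in which case nothing needs to change:

#include <cstddef>
#include <functional>
#include <unordered_set>

namespace point {
struct Point {
  float x, y;
  bool operator==(const Point &other) const {
    return x == other.x && y == other.y;
  }
};
}  // namespace point

namespace std {
template <>
struct hash<point::Point> {
  size_t operator()(const point::Point &p) const noexcept {
    size_t h1 = hash<float>{}(p.x);
    size_t h2 = hash<float>{}(p.y);
    return h1 ^ (h2 << 1);  // simple combiner; enough for a sketch
  }
};
}  // namespace std

int main() {
  std::unordered_set<point::Point> seen;
  seen.insert({1.0f, 2.0f});
  return seen.count({1.0f, 2.0f}) == 1 ? 0 : 1;  // hash and == both exercised
}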


1159-1160: Handle potential division by zero when sin_angle is close to zero

In the calculation of scale_factor, division by sin_angle can cause numerical instability if sin_angle is very close to zero. Consider adding a threshold to handle cases when sin_angle is near zero to avoid division by very small numbers.

Contributor
@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (3)

329-339: Potential performance concerns for frequent segment map lookups
Accessing segment_to_line via segment_to_line.at(edge) in performance-critical loops (e.g., process_end_point) could cause overhead if the data set is large. Consider whether a direct reference or pointer usage might reduce repeated lookups.
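
A small sketch of the hoisting idea: bind the mapped value to a reference once per iteration instead of repeating the lookup; Segment and Line are illustrative stand-ins for the real types:

#include <unordered_map>
#include <vector>

using Segment = int;                          // hypothetical stand-in
struct Line { std::vector<int> point_ids; };  // hypothetical stand-in

void process_edges(std::unordered_map<Segment, Line> &segment_to_line,
                   const std::vector<Segment> &edges) {
  for (Segment edge : edges) {
    // One at() call; every access below reuses the reference.
    Line &line = segment_to_line.at(edge);
    line.point_ids.push_back(edge);
    line.point_ids.push_back(edge + 1);
  }
}

int main() {
  std::unordered_map<Segment, Line> segment_to_line{{1, {}}, {2, {}}};
  process_edges(segment_to_line, {1, 2});
  return segment_to_line.at(1).point_ids.size() == 2 ? 0 : 1;
}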


540-603: Validate consistency of intersection handling in process_intersection_point
This method deeply manipulates intervals and merges intervals for intersection. If an intersection triggers additional merges or splits, ensuring a stable final state may require additional checks or logging.

I can provide a logging or debug mechanism to better track intervals if desired.


1241-1289: Evaluate concurrency or parallelization possibilities in triangulate_path_edge
If the path data is extensive, partial parallelization (e.g., concurrency at the join computations) may boost performance while still relying on the final fix_triangle_orientation step.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c1eff7 and 1a96342.

📒 Files selected for processing (1)
  • src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1 hunks)
🔇 Additional comments (4)
src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (4)

42-42: Confirm the behavior of unique_ptr in polygons_list
Storing pointers to polygons in a vector of std::unique_ptr is a solid approach for memory safety. However, ensure that you don’t inadvertently move or clear them in ways that cause incomplete polygons or unexpected deletion.

Would you like a helper script to grep all usage of polygons_list across the codebase to confirm consistent usage?


187-188: Verify the hash and equality specialization of point::Point
std::unordered_map<point::Point, …> requires a valid std::hash<point::Point> and operator==. This concern was raised previously. Confirm that these specializations are properly implemented to avoid undefined behavior.


475-536: Consider edge-case coverage for self-intersecting polygons in process_split_point
While process_split_point attempts to handle polygons with a top-based approach, edge cases with highly complex geometry or multiple intersections around the same y-level might not be fully covered. Additional internal verification or robust fallback logic could prevent misalignment.


1198-1209: Handle potential numerical instability with near-zero sin_angle
Division by sin_angle (line 1198) can lead to numerical instability if sin_angle is very small but not exactly zero. You might consider a threshold-based approach to avoid large scale_factor values.

Contributor
@coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1)

1198-1199: ⚠️ Potential issue

Handle potential numerical instability when sin_angle is very small

This segment performs division by sin_angle and then applies a scale factor. If sin_angle is close to zero (but not exactly 0), numerical instability could arise. Consider adding a threshold to handle extremely small sin_angle values more gracefully.
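
One possible threshold-based guard, sketched with illustrative names (safe_scale_factor and the epsilon value are assumptions, not the project's API); the miter limit already passed to triangulate_path_edge is a natural clamp value:

#include <algorithm>
#include <cmath>
#include <iostream>

float safe_scale_factor(float sin_angle, float limit) {
  const float eps = 1e-6f;  // assumed tolerance; tune to the data
  if (std::fabs(sin_angle) < eps) {
    return limit;           // clamp instead of dividing by ~0
  }
  return std::min(1.0f / std::fabs(sin_angle), limit);
}

int main() {
  std::cout << safe_scale_factor(1e-9f, 3.0f) << '\n';  // clamped to 3
  std::cout << safe_scale_factor(0.5f, 3.0f) << '\n';   // 1 / 0.5 = 2
}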

🧹 Nitpick comments (2)
src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1)

538-603: Consider splitting process_intersection_point into smaller methods

The process_intersection_point function (lines 538–603) is complex and covers multiple scenarios. Refactoring it into smaller utility methods (e.g., handling bottom_segments vs. top_segments, intersection merges, etc.) would improve readability and maintainability.

src/PartSegCore_compiled_backend/triangulate.pyx (1)

259-278: Check for empty polygons in triangulate_polygon_numpy

If the polygon array is empty or has fewer than 3 points, it may produce unintended results. Consider early returns or explicit exceptions if insufficient data is provided for triangulation.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a96342 and 519b1cd.

📒 Files selected for processing (2)
  • src/PartSegCore_compiled_backend/triangulate.pyx (1 hunks)
  • src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1 hunks)
🔇 Additional comments (3)
src/PartSegCore_compiled_backend/triangulation/triangulate.hpp (1)

618-639: Validate polygon size in _is_convex

The _is_convex function iterates over polygon elements in triplets (lines 619–639). If the polygon has fewer than 3 points, the loops and indexing may be invalid or yield no meaningful result. Consider an early return if polygon.size() < 3.
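
A self-contained sketch of the guarded predicate; Point and the orientation helper are simplified stand-ins, not the code from triangulate.hpp:

#include <cassert>
#include <cstddef>
#include <vector>

struct Point { float x, y; };  // hypothetical stand-in for point::Point

int orientation(Point a, Point b, Point c) {
  float cross = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
  return (cross > 0) - (cross < 0);
}

// Early return for degenerate input, then all consecutive triplets must
// share one turning direction (collinear triplets are skipped).
bool is_convex_guarded(const std::vector<Point> &polygon) {
  if (polygon.size() < 3) return false;  // no triplets to test
  int sign = 0;
  const std::size_t n = polygon.size();
  for (std::size_t i = 0; i < n; ++i) {
    int o = orientation(polygon[i], polygon[(i + 1) % n], polygon[(i + 2) % n]);
    if (o == 0) continue;
    if (sign == 0) sign = o;
    else if (o != sign) return false;
  }
  return sign != 0;
}

int main() {
  assert(is_convex_guarded({{0, 0}, {1, 0}, {1, 1}, {0, 1}}));
  assert(!is_convex_guarded({{0, 0}, {1, 0}}));  // guarded degenerate input
}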

src/PartSegCore_compiled_backend/triangulate.pyx (2)

106-127: Add length checks on input sequences for on_segment

Similar to prior suggestions, on_segment (lines 106–127) uses indices [0], [1]. If p, q, or r have length != 2, it might cause index errors. Validate input lengths to avoid potential runtime exceptions.


153-167: Validate segment size in do_intersect

In do_intersect (lines 153–167), confirm that each segment has exactly two points of length 2. This prevents out-of-bounds indexing and ensures robust error handling.

Comment on lines +380 to +392
*
* @param p The point to be processed.
* @param edge_top The top edge segment associated with the point.
* @param edge_bottom The bottom edge segment associated with the point.
*
* This function updates the monotone polygon interval associated with
* `edge_top` to include the new point `p`. If the interval contains
* multiple polygons, it closes all but one of them by setting their
* bottom to `p` and moving them to the list of completed monotone polygons.
*
* Depending on whether `edge_top` represents the right segment of the
* interval, the function then adds the point `p` to the appropriate (left
* or right) list of points in the remaining polygon and replaces `edge_top`
Contributor

⚠️ Potential issue

Guard against out-of-bound access in _build_triangles_opposite_edge

Inside _build_triangles_opposite_edge (lines 682–692), the code reassigns stack[0] and stack[1] after the loop. If stack.size() is less than 2, this leads to an out-of-range write. Ensure that stack has at least two elements before accessing indices 0 and 1.

Comment on lines +712 to +727
inline void _build_triangles_current_edge(std::vector<point::Point> &stack,
                                          std::vector<PointTriangle> &result,
                                          point::Point current_point,
                                          int expected_orientation) {
  auto it1 = stack.rbegin();
  auto it2 = stack.rbegin() + 1;
  while (it2 != stack.rend() &&
         intersection::_orientation(*it2, *it1, current_point) ==
             expected_orientation) {
    result.emplace_back(current_point, *it1, *it2);
    ++it1;
    ++it2;
  }
  stack.erase(it1.base(), stack.end());
  stack.push_back(current_point);
}
Contributor

⚠️ Potential issue

Check stack size before arithmetic in _build_triangles_current_edge

In _build_triangles_current_edge (lines 712–727), the code accesses stack via it1 and it2 (reverse iterators) before erasing and pushing back elements. If stack has fewer than two points, the while loop condition and subsequent erase operations could lead to invalid iterator usage or out-of-range errors.
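
A guarded standalone variant for illustration; Point, PointTriangle, and the orientation helper are simplified stand-ins so the sketch compiles on its own:

#include <vector>

struct Point { float x, y; };             // hypothetical stand-in
struct PointTriangle { Point a, b, c; };  // hypothetical stand-in

int orientation(Point a, Point b, Point c) {
  float cross = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
  return (cross > 0) - (cross < 0);
}

void build_triangles_current_edge(std::vector<Point> &stack,
                                  std::vector<PointTriangle> &result,
                                  Point current_point,
                                  int expected_orientation) {
  if (stack.size() < 2) {  // guard before any reverse-iterator arithmetic
    stack.push_back(current_point);
    return;
  }
  auto it1 = stack.rbegin();
  auto it2 = stack.rbegin() + 1;
  while (it2 != stack.rend() &&
         orientation(*it2, *it1, current_point) == expected_orientation) {
    result.push_back({current_point, *it1, *it2});
    ++it1;
    ++it2;
  }
  stack.erase(it1.base(), stack.end());
  stack.push_back(current_point);
}

int main() {
  std::vector<Point> stack;  // empty stack is now safe
  std::vector<PointTriangle> result;
  build_triangles_current_edge(stack, result, {1, 1}, 1);
  return result.empty() && stack.size() == 1 ? 0 : 1;
}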

Comment on lines +367 to +390
def triangulate_path_edge_py(path: Sequence[Sequence[float]], closed: bool=False, limit: float=3.0, bevel: bool=False) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Triangulate path"""
    cdef vector[Point] path_vector
    cdef PathTriangulation result
    cdef Point p1, p2
    cdef cnp.ndarray[cnp.uint32_t, ndim=2] triangles

    path_vector.reserve(len(path))
    for point in path:
        path_vector.push_back(Point(point[0], point[1]))
    with cython.nogil:
        result = triangulate_path_edge(path_vector, closed, limit, bevel)

    if result.triangles.size() == 0:
        triangles = np.zeros((0, 3), dtype=np.uint32)
    else:
        triangles = np.array([(triangle.x, triangle.y, triangle.z) for triangle in result.triangles], dtype=np.uint32)

    return (
        np.array([(point.x, point.y) for point in result.centers], dtype=np.float32),
        np.array([(offset.x, offset.y) for offset in result.offsets], dtype=np.float32),
        triangles,
    )

Contributor

🛠️ Refactor suggestion

Validate limit and handle path edge corner cases

In triangulate_path_edge_py (lines 367–390), the limit parameter is used for angle calculations without explicit checks to ensure it is positive. Also, if len(path) < 2, the returned geometry is a placeholder. Provide meaningful error messages or raise exceptions for invalid inputs.
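
A sketch of the kind of pre-checks the wrapper (or the C++ entry point) could perform; names and messages are illustrative, not the project's API:

#include <stdexcept>
#include <vector>

struct Point { float x, y; };  // hypothetical stand-in

void validate_path_inputs(const std::vector<Point> &path, float limit) {
  if (limit <= 0.0f) {
    throw std::invalid_argument("limit must be positive");
  }
  if (path.size() < 2) {
    throw std::invalid_argument("path needs at least two points");
  }
}

int main() {
  try {
    validate_path_inputs({{0, 0}}, 3.0f);  // too few points
  } catch (const std::invalid_argument &) {
    return 0;  // expected: caller gets a clear message, not a placeholder
  }
  return 1;
}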

@Czaki Czaki merged commit bb71959 into master Dec 20, 2024
21 checks passed
@Czaki Czaki deleted the triangulate branch December 20, 2024 23:17
jni added a commit to napari/napari that referenced this pull request Jan 9, 2025
# References and relevant issues

It may be a replacement for #6654 

# Description

This PR is using
4DNucleome/PartSegCore-compiled-backend#36 to
perform triangulation.

The sweep-line algorithm from the referenced PR supports "polygons with
holes" and "non-coherent polygons", so after this PR we could implement an
alternative to #6654 by adding a "Multi polygon" shape. It will also allow
mapping multi-component labels into shapes.


Current main:


![obraz](https://github.com/user-attachments/assets/a3698054-5739-4ce0-9b0d-f8625a153c55)

This PR 


![obraz](https://github.com/user-attachments/assets/c6c4a175-51bc-476b-8689-3d94f4f70be3)

The most heavily optimized part, `_updated_displayed_data`, on main:


![obraz](https://github.com/user-attachments/assets/c3ca27ca-ee31-44ae-95dd-05263924b21e)


this PR:


![obraz](https://github.com/user-attachments/assets/7fb680ad-d75d-46c4-be44-57b7b949d322)

And triangulation itself:

main:


![obraz](https://github.com/user-attachments/assets/b52c53f7-c5ba-4677-9c78-85eda3db1efc)

this PR


![obraz](https://github.com/user-attachments/assets/3a4ebde6-cc8e-431d-b2f4-64921f59e9df)


## Additional changes

1. Change `float` to `np.float32` for `Rectangle`, `Line` and `Ellipse`, and update the tests accordingly. We internally send float32 to vispy and already cast data to float in Polygons.
2. Add cached properties to the `Shape` class and its subclasses.
3. Use a string, not an enum instance, for dispatch in the `Shape` class.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Juan Nunez-Iglesias <juan.nunez-iglesias@monash.edu>