Fix: refactor ConvGpu for improved GPU convolution #121

Open
Pakhi-7831 wants to merge 2 commits into sumit3203:master from Pakhi-7831:patch_1

@Pakhi-7831

Description

This PR improves and fixes the GPU separable convolution path used by the ImageJ active segmentation filters.

Changes

  • Updated convolveSemiSep(FloatProcessor, ...) to correctly route separable filtering through axis-specific Ox and Oy convolutions, ensuring 1D kernels are applied along the intended image axes.
  • Optimized convolveFloat1D(FloatProcessor, ...) to reduce GPU launch overhead by batching work into a single CUDA execution per 2D image (instead of per line), while preserving clamp-to-edge boundary semantics.
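
For reviewers who want a reference to compare GPU output against, the two behaviours above can be sketched on the CPU. This is a minimal sketch and not the code in this PR: the class `SeparableConvSketch` and its methods are hypothetical names, the image is assumed to be a row-major `float[]`, and the kernel is applied without flipping (which matches convolution only for symmetric kernels). The single pass over every `(x, y)` loosely mirrors the one-launch-per-image idea, with one output pixel per iteration instead of one launch per line.

```java
public final class SeparableConvSketch {

    /** Clamp an index into [0, n-1]; this is the clamp-to-edge boundary rule. */
    public static int clamp(int i, int n) {
        return Math.max(0, Math.min(n - 1, i));
    }

    /**
     * Apply a 1D kernel to a row-major width*height image along one axis.
     * alongX = true filters along Ox (within each row);
     * alongX = false filters along Oy (within each column).
     * The kernel is not flipped (assumption, see lead-in).
     */
    public static float[] convolve1D(float[] src, int width, int height,
                                     float[] kernel, boolean alongX) {
        int radius = kernel.length / 2;
        float[] dst = new float[src.length];
        // One pass over the whole image: each (x, y) is independent,
        // which is what makes a single batched GPU launch possible.
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                float sum = 0f;
                for (int k = 0; k < kernel.length; k++) {
                    int offset = k - radius;
                    // Only the coordinate on the filtered axis moves; it is clamped at the border.
                    int sx = alongX ? clamp(x + offset, width) : x;
                    int sy = alongX ? y : clamp(y + offset, height);
                    sum += kernel[k] * src[sy * width + sx];
                }
                dst[y * width + x] = sum;
            }
        }
        return dst;
    }

    /** Separable filtering: kernelX along Ox first, then kernelY along Oy. */
    public static float[] convolveSeparable(float[] src, int width, int height,
                                            float[] kernelX, float[] kernelY) {
        float[] tmp = convolve1D(src, width, height, kernelX, true);
        return convolve1D(tmp, width, height, kernelY, false);
    }

    public static void main(String[] args) {
        // 3x3 ramp that varies only along x.
        float[] ramp = {0f, 1f, 2f, 0f, 1f, 2f, 0f, 1f, 2f};
        float[] derivative = {-1f, 0f, 1f};
        float[] dx = convolve1D(ramp, 3, 3, derivative, true);
        float[] dy = convolve1D(ramp, 3, 3, derivative, false);
        System.out.println("centre response along Ox: " + dx[4]);
        System.out.println("centre response along Oy: " + dy[4]);
    }
}
```

A ramp that varies along only one axis is a quick check for the routing fix: a derivative kernel should respond along that axis and give zero along the other. If the two responses are swapped, the kernels are being applied along the wrong axes.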

Significance

The platform relies heavily on separable Gaussian and derivative-style filters. The previous GPU approach ran one CUDA execution per image line, which carried high launch overhead, and it could route separable computations along the wrong axis, producing inconsistent output. These changes make the separable routing correct and increase the throughput of GPU filtering.

Refactor ConvGpu class to improve GPU convolution methods, including convolve2DGpu and updates to convolveFloat1D. Added error handling and memory management enhancements.
@Pakhi-7831 changed the title from "Refactor ConvGpu for improved GPU convolution" to "Fix: refactor ConvGpu for improved GPU convolution" on Mar 23, 2026