I have noticed that when the SIMD versions (both SSE2 and ARM Neon) of the YCoCg-to-RGB conversion are used, the returned data is not exactly the same as from the "slow" version. Commenting out the SIMD YCoCg conversions results in the same data. Both SIMD versions seem to produce the same data (at least for this tile).
Comparing .\test_tile_arm.png .\test\expected_output\testslide_example_tile_3_5_10.png
.\test_tile_arm.png is not equal to .\test\expected_output\testslide_example_tile_3_5_10.png
Arrays are not equal
Mismatched elements: 119446 / 262144 (45.6%)
Max absolute difference: 2
Max relative difference: 0.03333333
x: array([[[227, 224, 230, 255],
[225, 227, 229, 255],
[228, 229, 234, 255],...
y: array([[[226, 224, 229, 255],
[225, 227, 229, 255],
[228, 229, 234, 255],...
----------
Comparing .\test_tile_sse2.png .\test\expected_output\testslide_example_tile_3_5_10.png
.\test_tile_sse2.png is not equal to .\test\expected_output\testslide_example_tile_3_5_10.png
Arrays are not equal
Mismatched elements: 119446 / 262144 (45.6%)
Max absolute difference: 2
Max relative difference: 0.03333333
x: array([[[227, 224, 230, 255],
[225, 227, 229, 255],
[228, 229, 234, 255],...
y: array([[[226, 224, 229, 255],
[225, 227, 229, 255],
[228, 229, 234, 255],...
----------
Comparing .\test_tile_sse2.png .\test_tile_arm.png
.\test_tile_sse2.png is equal to .\test_tile_arm.png
----------
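For reference, a comparison like the one above can be sketched roughly as follows. This is a minimal sketch, not the exact script used: `compare_arrays` is a hypothetical helper, the report is simplified, and the small arrays stand in for the decoded PNG tiles (which could be loaded with, for example, Pillow's `np.asarray(Image.open(path))`).

```python
import numpy as np

def compare_arrays(name_a, a, name_b, b):
    """Compare two uint8 image arrays and return a short text report.

    Hypothetical helper: it wraps np.testing.assert_array_equal and adds
    a simplified mismatch summary.
    """
    lines = [f"Comparing {name_a} {name_b}"]
    try:
        np.testing.assert_array_equal(a, b)
        lines.append(f"{name_a} is equal to {name_b}")
    except AssertionError:
        lines.append(f"{name_a} is not equal to {name_b}")
        # Widen to a signed type first so the subtraction cannot wrap around.
        diff = np.abs(a.astype(np.int16) - b.astype(np.int16))
        lines.append(f"Mismatched elements: {int(np.count_nonzero(diff))} / {diff.size}")
        lines.append(f"Max absolute difference: {int(diff.max())}")
    lines.append("----------")
    return "\n".join(lines)

# Stand-in data: the first pixel of each array from the output above.
x = np.array([[[227, 224, 230, 255]]], dtype=np.uint8)
y = np.array([[[226, 224, 229, 255]]], dtype=np.uint8)
report = compare_arrays("x", x, "y", y)
print(report)
```

Casting to `int16` before subtracting matters here: subtracting two `uint8` arrays directly would wrap around and report misleading differences.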
I'm not an expert on SIMD acceleration, but perhaps there is an overflow or underflow in the operations?
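To make the over/underflow question concrete, here is a minimal Python sketch. It is not the library's code: the YCoCg-R style formulas, the function names, and the sample values are all assumptions for illustration only. It contrasts integer halving with float halving followed by truncation (a rounding difference), and 8-bit wraparound with saturation (an overflow).

```python
import math

def ycocg_r_to_rgb_int(y, co, cg):
    """Inverse YCoCg-R with integer halving (>> floors, like an arithmetic shift)."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b

def ycocg_r_to_rgb_float(y, co, cg):
    """Same formulas with float halving, truncated to integers at the end."""
    t = y - cg / 2.0
    g = cg + t
    b = t - co / 2.0
    r = b + co
    return tuple(math.trunc(v) for v in (r, g, b))

def wrap_u8(v):
    """Keep only the low 8 bits, as a non-saturating narrowing would."""
    return v & 0xFF

def clamp_u8(v):
    """Saturate to the 0..255 range."""
    return max(0, min(255, v))

# Hypothetical sample with odd chroma values, where the two halvings disagree.
rgb_int = ycocg_r_to_rgb_int(100, 3, 5)      # (100, 103, 97)
rgb_float = ycocg_r_to_rgb_float(100, 3, 5)  # (99, 102, 96)
rounding_diff = max(abs(a - b) for a, b in zip(rgb_int, rgb_float))

# A value just above the 8-bit range: wraparound versus saturation.
overflow_diff = abs(wrap_u8(260) - clamp_u8(260))  # |4 - 255|

print(rounding_diff, overflow_diff)
```

If this sketch is representative, a wraparound would shift a channel by a large amount, whereas the reported maximum absolute difference is 2, which looks more like a rounding or truncation difference between the two code paths. That is a guess, not a diagnosis.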
Comparison using np.testing.assert_array_equal.
Tile produced using SSE2 SIMD: (image not included)
Tile produced using ARM Neon SIMD: (image not included)