Skip to content

[rlsw][SEGFAULT] Fix triangle and quad spans applying pixels out of bounds#5849

Open
NighthowlerStudios wants to merge 5 commits intoraysan5:masterfrom
NighthowlerStudios:fix-arm32-raster-rlsw
Open

[rlsw][SEGFAULT] Fix triangle and quad spans applying pixels out of bounds#5849
NighthowlerStudios wants to merge 5 commits intoraysan5:masterfrom
NighthowlerStudios:fix-arm32-raster-rlsw

Conversation

@NighthowlerStudios
Copy link
Copy Markdown
Contributor

@NighthowlerStudios NighthowlerStudios commented May 8, 2026

Fixes #5838

The code in this commit and text in this PR are entirely human written.

It has been tested on a Raspberry Pi 2, the same board the original issue was found on.

Then it was tested on a Raspberry Pi 5 to verify texture and triangle warping wasn't present, using models_first_person_maze

Fixes RLSW Rastering buffer boundary out of bounds

In the raster routine SW_RASTER_QUAD, there was the following problem spotted by @ZX41R

In the quad path, SW_RASTER_QUAD derives xMin/yMin/xMax/yMax directly from the projected corners and then computes:

baseOffset = y * stride + xMin;
cPtr = cPixels + baseOffset * SW_FRAMEBUFFER_COLOR_SIZE;
without the clamp that the clear paths use. A negative xMin/yMin, or a projected quad extending outside the target, can therefore move cPtr before the color buffer before the inner loop even starts. On 32-bit ARM that underflow can plausibly land on a mapped ELF page, which matches dst=... "\177ELF" in the trace.

Additionally, in TRIANGLE_SPAN, the issue is much simpler. It only has an xMin that calculates into baseOffset, there is no yMin involved on that method. However it is the same boundary crossing bug.

This commit applies some fixes they suggested by ensuring to clamp the loop min and max to inside of RLSW.colorBuffer.

Then it moves the start of each interpolation for spans, textures, depths and vertex colours, to compensate for the new clamps so that only the correct colours are grabbed.

This fixes the segfault discovered on the issue and avoids the triangle warping that was previously discussed in the deleted fork on #5843

It should be much safer than the previous PR I self-rejected.

@ZX41R
Copy link
Copy Markdown

ZX41R commented May 9, 2026

Thanks for retesting on the RPi 2 + 5. Clipping interpolants by dxMin/dyMin is the right approach — but two issues in the bound math, both off-by-one on the upper clamp:

int yLoopMax = sw_clamp_int(yMax, 0, RLSW.colorBuffer->height - 1);
int xLoopMax = sw_clamp_int(xMax, 0, RLSW.colorBuffer->width  - 1);
// ...
for (int y = yLoopMin; y < yLoopMax; y++) { ... }
for (int x = xLoopMin; x < xLoopMax; x++) { ... }

yMax/xMax are exclusive bounds in these loops (the original code is y < yMax). Clamping the exclusive upper to height - 1 means a fully-on-screen quad ending at the bottom row never writes that row. Same in TRIANGLE_SPAN for xLoopEnd. Correct shape:

int yLoopMax = sw_clamp_int(yMax, 0, RLSW.colorBuffer->height); // not height-1
int xLoopMax = sw_clamp_int(xMax, 0, RLSW.colorBuffer->width);  // not width-1

Inclusive lower (xLoopMin/yLoopMin) is correctly clamped to width - 1/height - 1, since those do index a pixel.

Second concern — the per-row interpolant compensation in SW_RASTER_QUAD:

for (int y = yLoopMin; y < yLoopMax; y++) {
    ...
    float z = zRow + dZdy*dyMin;
    float u = uRow + dUdy*dyMin;
    float v = vRow + dVdy*dyMin;

If zRow/uRow/vRow are advanced once per row in the surrounding code (the typical pattern), then dZdy*dyMin is a constant offset that should be applied once before the y loop, not on every iteration. Otherwise every row reads at (y, dyMin) rather than (y, 0). If on the other hand zRow is held constant and the outer caller doesn't advance it, then this is correct but z = zRow + dZdy*(y - yMin) would be the more honest expression. Worth a quick read of the surrounding code to pick which one applies — I'd bet on the former from the variable naming (zRow suggests "row's starting z").

Once those two are sorted, the structure is good. RPi 2 retest after fixing the upper clamp will tell you immediately whether the bottom row was missing.

@raysan5
Copy link
Copy Markdown
Owner

raysan5 commented May 9, 2026

@Bigfoot71 please, could you take a look to this PR when possible? No hurries.

@NighthowlerStudios
Copy link
Copy Markdown
Contributor Author

NighthowlerStudios commented May 9, 2026

yMax/xMax are exclusive bounds in these loops (the original code is y < yMax). Clamping the exclusive upper to height - 1 means a fully-on-screen quad ending at the bottom row never writes that row. Same in TRIANGLE_SPAN for xLoopEnd. Correct shape:

Great catch! My bad. That makes sense because of the < check.

After some research I can agree that a -1 on LoopStart is still safe because these raster routines are reached after culling of triangles entirely outside anyway. Our issue is only with triangles overlapping the border.

Worth a quick read of the surrounding code to pick which one applies — I'd bet on the former from the variable naming (zRow suggests "row's starting z").

Yes, and I believe that I've discovered that those row vars don't just denote the start of the row, but actually behave as 2-dimensional "cursors" (or turtles) for the interpolation. At the end of the for loop, they get incremented by d(*)dy, which means their positions go down a new row.

So we need bilinear interpolation offsets. Instead of just moving them across the X, we need to move them across both X and Y for the correct start position. And yes, only do it once, outside of the loop. The incrementor for X gets put into a separate variable and the for loop controls how far that will sweep to the right and bottom anyway, and the incrementor at the end of the for loop adding a Y step (just adding row) would keep the x offset, so it's safe to do this only once.

This has to be in, because the for loop is just iterating, so yStart wouldn't have affected where interpolation started. cRow, zRow, uRow and vRow still had their values set at the top left corner of the quad. My changes moved that top left to inside the colorBuffer.

Tested our changes on both RPI2 and RPI5.

@NighthowlerStudios
Copy link
Copy Markdown
Contributor Author

NighthowlerStudios commented May 9, 2026

Yes, by fixing the off by one, a single pixel wide bar of pixels on the right hand and bottom end of the screen is now being filled by colour information correctly. The first commit in the PR had them white only, which was definitely something I should have spotted.

Bunnymark shows no issues with colour information after the movement of interpolation starts on raster_quad. Maybe there's a better example I could look toward that could reliably produce artifacts if our code is wrong here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[rlsw] Segfaulting some example and resolution combos on RPI with PLATFORM_DRM and OPENGL_SOFTWARE

3 participants