Fix inline assembly constraints in math3d for LTO compatibility #797
+2
−2
Summary
Fix incorrect inline assembly register constraints in the `calculate_vertices` and `calculate_normals` functions that cause crashes when Link-Time Optimization (LTO) is enabled.
Problem
The `calculate_vertices` and `calculate_normals` functions in `ee/math3d/src/math3d.c` use inline assembly with loops that modify the `output`, `count`, and `vertices`/`normals` pointer operands. However, these operands were declared as input-only (`"r"`). This tells GCC that the register values are not modified by the assembly block, which is incorrect.
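
To make the mismatch concrete, here is a minimal sketch of the same kind of loop, with the loop-modified operands declared so that the declaration matches what the assembly does. It is not the math3d code: `sum_words` is a hypothetical helper, and the assembly is written for x86-64 (with a plain C fallback) rather than for the EE core, purely so it can be built and run on a desktop machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of math3d: sums `count` 32-bit words. */
uint32_t sum_words(const uint32_t *src, size_t count)
{
    uint32_t total = 0;

    if (count == 0)
        return total;   /* the asm loop below always runs at least once */

#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ volatile(
        "1:\n\t"
        "addl (%[src]), %[total]\n\t"  /* total += *src          */
        "addq $4, %[src]\n\t"          /* src++ (4-byte words)   */
        "decq %[count]\n\t"            /* count--                */
        "jnz 1b\n\t"                   /* loop until count == 0  */
        /* All three operands are changed by the loop, so each is "+r".
           Declaring src or count as a plain "r" input would promise GCC
           that their registers still hold the original values afterwards. */
        : [total] "+r"(total), [src] "+r"(src), [count] "+r"(count)
        : /* no input-only operands */
        : "cc", "memory");
#else
    /* Portable fallback so the sketch builds on other targets. */
    while (count--)
        total += *src++;
#endif
    return total;
}
```

Note that in GCC's extended asm syntax, `"+r"` operands go in the output operand list (after the first colon), not in the input list.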
Impact
Without LTO, this bug is usually hidden because functions are compiled separately and the corrupted register values aren't reused.
With LTO enabled, GCC optimizes across function boundaries and may reuse register values that the assembly block has silently changed.
This causes TLB Miss errors and crashes when programs using `math3d` (like gsKit's `cube` and `hires` examples) are compiled with LTO.
Solution
Change the constraints from input-only (`"r"`) to read-write (`"+r"`) for operands that are modified. The `"+r"` constraint correctly tells GCC that these operands are both read and written.
Changes
`ee/math3d/src/math3d.c`
- `calculate_normals` (line 519)
- `calculate_vertices` (line 649)
Testing
Tested with:
- gsKit examples (`cube`, `hires`) that use `math3d` functions
- `-O3 -flto -ftree-vectorize -ftree-slp-vectorize`
Before fix: TLB Miss crashes
After fix: All examples run correctly
References
- `"r"` - Input operand (read-only)
- `"+r"` - Input/output operand (read-write)
- `"=r"` - Output operand (write-only)
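
As a rough illustration of the three constraint kinds listed above, the sketch below uses each one in a single-instruction asm statement. The function names `copy_value` and `add_in_place` are hypothetical, and the instructions are x86-64 (with a C fallback) so the example can run on a desktop machine; the constraint syntax itself is the same GCC extended asm used in math3d.

```c
#include <stdint.h>

/* "=r" output, "r" input: the asm only reads `in` and only writes `out`. */
uint32_t copy_value(uint32_t in)
{
    uint32_t out;
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__("movl %[in], %[out]"
            : [out] "=r"(out)   /* write-only: previous value is ignored */
            : [in] "r"(in));    /* read-only: must not be modified       */
#else
    out = in;                   /* portable fallback */
#endif
    return out;
}

/* "+r" read-write: the asm reads the old value of `acc` and overwrites it. */
uint32_t add_in_place(uint32_t acc, uint32_t in)
{
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__("addl %[in], %[acc]"
            : [acc] "+r"(acc)   /* both read and written */
            : [in] "r"(in)      /* read-only             */
            : "cc");            /* addl updates the flags */
#else
    acc += in;                  /* portable fallback */
#endif
    return acc;
}
```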