Panic safe component drop#24393
Conversation
| last_element_index: usize, | ||
| row: TableRow, | ||
| ) { | ||
| self.added_ticks |
There was a problem hiding this comment.
If we have unchecked/unsafe code that we are touching, we should be trying to gradually wrapping these with unsafe blocks, so we can eventually enable the unsafe_op_in_unsafe_fn lint.
There was a problem hiding this comment.
This PR doesn't really affect existing safety requirements, so adding extra unsafe blocks around existing unsafe code would just make this PR harder to review. I'll do a separate PR for more unsafe_op_in_unsafe_fn though :)
chescock
left a comment
There was a problem hiding this comment.
Oh, nice! This will good to fix!
This is tricky enough code that I want to review it a little more carefully before approving than I have time for right now, but I wanted to leave a few thoughts.
| insert_mode: InsertMode, | ||
| caller: MaybeLocation, | ||
| ) { | ||
| ) -> Option<Box<dyn Any + Send>> { |
There was a problem hiding this comment.
I'd vote for using Result<(), Box<dyn Any + Send>> here to make it more clear that the payload represents an error case.
There was a problem hiding this comment.
I don't have a strong opinion here; changed (in all places).
| new_location, | ||
| sparse_sets, | ||
| move_result.new_table, | ||
| move_result.panic, |
There was a problem hiding this comment.
Making sure I understand: A panic here isn't actually possible, right? move_row can only return a panic if it drops a value, which it only does if the new table is missing a column, which will only be the case during a remove and never during an insert. But we have to handle that case anyway, and an assert! would just introduce a new way to panic.
There was a problem hiding this comment.
Yep, good catch. Got lost in the weeds and when I came back here I just saw the DROP=true. I've removed the check and added a comment.
| self.changed_by.as_mut().map(|changed_by| { | ||
| changed_by.swap_remove_unchecked_nonoverlapping(row.index(), last_element_index); | ||
| }); | ||
| self.data |
There was a problem hiding this comment.
The reordering here is subtle! Would you add a comment noting that this needs to be last in case drop panics (and similarly in the other methods)? I can imagine a future developer adding more things after this line without thinking about panic safety.
Hmm, even then, this seems really hard to get right. Maybe we could try to restructure this so that it doesn't depend on the order?
One approach might be to always put the catch_unwind directly around the call to drop, and propagate the panic payload explicitly as a Result or Option. So, BlobArray::swap_remove_and_drop_unchecked_nonoverlapping would return the panic payload, and the order of these lines wouldn't matter. That should also make it easier to verify that every call to drop has a catch_unwind, since they'd always be together.
Another approach might be to move the drop calls to happen later. If we do all the moves and swaps first, and then do all the drop calls together at the end, then a panic that escapes from a drop won't break any of our invariants. One downside of that approach is that we'd need to keep the values around longer, so we wouldn't be able to replace swaps with moves in more places like you did in #24312.
Objective
Fixes #2860
bevy_ecs is not panic-safe in regards to dropping components. Note that #12929 has since made some cases safe by permanently disabling the drop function in some cases, but others remained.
See also #20368 for lack of panic safety in required component constructors. The fix (#20933) to that differs a bit in that a panic in a required component constructor means we have to move the entity to a different archetype than originally planned, which is a bit more involved.
Solution
When a panic occurs while making changes that must be atomic (such as editing entity metadata or moving around component data), catch any panics originating from component drop functions and resume unwinding at the end.
While I've added a panics section to the doc comments of relevant internal functions, I have not done so for any of the safe wrappers.
Testing
Added the
panic_in_component_droptest based on the linked issue.