Use change lists instead of ticks for detecting when meshes need to be re-specialized and/or re-queued. #22966
pcwalton wants to merge 4 commits into bevyengine:main
Right now, every frame, all specialization and queuing systems iterate over all entities visible from a view and check to see whether they need to be updated by consulting a set of change ticks and comparing them to the current change ticks. To handle cases in which a mesh needs to be removed from the bins, a separate final *sweep* pass then finds entities that no longer exist and removes them manually from the bins. This process is complex, error-prone, and slow, as it involves visiting all visible entities multiple times every frame.

This PR changes the setup so that, instead of examining change ticks, the visibility logic pushes the set of added and removed entities to each view explicitly. The visibility system determines which meshes need to be added and removed by first sorting the list of visible entities, then performing an O(n) diff process on the last frame's visible entities and this frame's visible entity list. The end result is that the specialization and queuing systems only process the entities that they need to every frame. If a mesh was visible last frame, remained visible this frame, and didn't change its mesh or material, then it's generally not examined at all. Not only is this significantly faster for virtually all realistic scenes, but it's also much simpler.

In order to achieve the benefits of not examining every visible mesh every frame, I made sorted render passes retained via an `IndexMap`. This allows entities to be removed and added via random access while still allowing the list to be sorted by distance. Note that I had to remove the radix sort because `IndexMap` doesn't currently support that; I believe the enormous speed benefits of this patch outweigh any minor sorting regressions from this.

I tested this PR by running `scene_viewer` on a test scene with many meshes and materials and implementing a material shuffler that randomly switches the materials around. I tested the following cases:

* Moving the camera so that meshes become visible and invisible.
* Switching opaque materials on meshes.
* Moving meshes from opaque to alpha masked and vice versa.
* Moving meshes from binned render passes to sorted render passes (i.e. transparent).
* All of the above while the meshes were off screen, then moving them on screen to ensure that the changes took effect.

This PR brings the `specialize_shadows` time on the `bevy_city` demo from 12.87 ms per frame to 0.1261 ms per frame, a 102x speedup. It brings the `queue_shadows` time on the same demo from 12.34 ms per frame to 0.1102 ms, a 111x speedup. Mean frame time goes from 50.16 ms to 23.26 ms, a 2.16x speedup.
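The sort-then-diff step described above can be illustrated with a minimal sketch. This is not the code from the PR: `EntityId`, `VisibilityDiff`, `diff_sorted`, and `update_visibility` are hypothetical names chosen for illustration, and entities are modeled as plain `u32` values rather than Bevy's `Entity` type. It only shows how sorting both lists allows a single merge-style pass to produce the added and removed sets.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for an entity ID (not Bevy's real `Entity` type).
pub type EntityId = u32;

/// Hypothetical per-view change list: entities that entered or left the visible set.
#[derive(Debug, Default, PartialEq)]
pub struct VisibilityDiff {
    pub added: Vec<EntityId>,
    pub removed: Vec<EntityId>,
}

/// Diffs two visible-entity lists in one merge-style pass, O(n) in their combined length.
/// Both slices must be sorted ascending and contain no duplicates.
pub fn diff_sorted(previous: &[EntityId], current: &[EntityId]) -> VisibilityDiff {
    let mut diff = VisibilityDiff::default();
    let (mut i, mut j) = (0, 0);
    while i < previous.len() && j < current.len() {
        match previous[i].cmp(&current[j]) {
            // Only in last frame's list: the entity is no longer visible.
            Ordering::Less => {
                diff.removed.push(previous[i]);
                i += 1;
            }
            // Only in this frame's list: the entity became visible.
            Ordering::Greater => {
                diff.added.push(current[j]);
                j += 1;
            }
            // In both lists: visibility is unchanged, so the entity is skipped.
            Ordering::Equal => {
                i += 1;
                j += 1;
            }
        }
    }
    // Whatever is left in either list has no counterpart in the other.
    diff.removed.extend_from_slice(&previous[i..]);
    diff.added.extend_from_slice(&current[j..]);
    diff
}

/// Sorts this frame's list, diffs it against last frame's, and stores it for the next frame.
pub fn update_visibility(previous: &mut Vec<EntityId>, mut current: Vec<EntityId>) -> VisibilityDiff {
    current.sort_unstable();
    current.dedup();
    let diff = diff_sorted(previous, &current);
    *previous = current;
    diff
}
```

In this sketch, an entity that stays visible across frames appears in neither list, which matches the goal of not examining unchanged meshes. Entities whose mesh or material changed would need to be reported through a separate path that the sketch does not model.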
It looks like your PR is a breaking change, but you didn't provide a migration guide. Please review the instructions for writing migration guides, then expand or revise the content in the migration guides directory to reflect your changes.

Your PR caused a change in the graphical output of an example or rendering test. This might be intentional, but it could also mean that something broke! If it's expected, please add the M-Deliberate-Rendering-Change label. If this change seems unrelated to your PR, you can consider updating your PR to target the latest main branch, either by rebasing or merging main into it.
Figures (images not preserved): `specialize_shadows` in `bevy_city` before and after; `queue_shadows` in `bevy_city` before and after; frame graph of `bevy_city` before; frame graph of `bevy_city` after.