Multiview camera rendering: proof-of-concept continuation of #16059 (#24422)
Open
bigmark222 wants to merge 46 commits into
Mirrors the underlying wgpu field. Previously the pipeline cache hard-coded multiview_mask: None on the raw wgpu descriptor, so multiview pipelines could not be built through Bevy's normal pipeline machinery. All existing construction sites default to None via Default. Foundational for multiview (single-pass stereo) rendering; relates to bevyengine#15864 and the prior effort in bevyengine#16059. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A new storage type that wraps DynamicUniformBuffer to pack many runtime-sized arrays of T into one uniform buffer, padded out to the length of the largest array so the WGSL side can read them as array<T, N> where N is the max length. This is the host-side companion to multiview-style view bindings, where each camera contributes a small array of per-view uniforms (one element per eye / cubemap face / shadow cascade) into a single bound uniform. Also makes batched_uniform_buffer::MaxCapacityArray pub(crate) so the new module can reuse the same encase shim. Based on the design in bevyengine#16059 by @awtterpip. Relates to bevyengine#15864. Co-Authored-By: Piper <awtterpip@gmail.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
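The padding scheme can be illustrated with a minimal sketch. `PaddedArrayPacker` and its methods are hypothetical stand-ins, not the actual `DynamicArrayUniformBuffer` API, and plain `u32` elements replace the encase-serialized `T`:

```rust
/// Hypothetical stand-in for the packed buffer described above
/// (not Bevy's actual type). Elements are plain u32 for simplicity.
#[derive(Default)]
struct PaddedArrayPacker {
    /// Arrays staged this frame, in push order.
    staged: Vec<Vec<u32>>,
}

impl PaddedArrayPacker {
    /// Stage one array and return its slot index.
    fn push(&mut self, values: &[u32]) -> usize {
        self.staged.push(values.to_vec());
        self.staged.len() - 1
    }

    /// Pad every staged array with zeros to the longest staged length.
    /// Returns the flattened data and the per-slot stride in elements.
    fn finish(&self) -> (Vec<u32>, usize) {
        let stride = self.staged.iter().map(Vec::len).max().unwrap_or(0);
        let mut flat = Vec::with_capacity(stride * self.staged.len());
        for array in &self.staged {
            flat.extend_from_slice(array);
            // Zero-fill the tail so every slot occupies `stride` elements.
            flat.resize(flat.len() + (stride - array.len()), 0);
        }
        (flat, stride)
    }
}
```

With a one-element array and a two-element array staged, every slot ends up two elements wide, which is what lets the shader side declare a single fixed `N`.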
Multiview lets a camera render to multiple layers of its render target texture array in a single draw pass, the foundation for VR / XR stereo rendering. Each MultiviewSubview specifies a per-eye view_from_camera offset and clip_from_view projection; the camera's own GlobalTransform remains the head pose used for sort distance and frustum culling. Mirrors the per-eye data into a new ExtractedMultiview component on the render-world entity. Subsequent layers will read this to pack the view uniform array, allocate N-layer render targets, and emit multiview pipelines. Keeps a single render-world entity per camera because multiview rendering is by definition single-pass: per-eye phase items don't fit the model, since one multiview pipeline draw emits to all layers via @builtin(view_index). This departs from the reverted ExtractedViews { views: Vec<_> } shape in bevyengine#16059, which fought the existing single-view ExtractedView API. Relates to bevyengine#15864.
Co-Authored-By: Piper <awtterpip@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switches the view uniform buffer from DynamicUniformBuffer to DynamicArrayUniformBuffer so each render-world view entity contributes an array of per-subview uniforms rather than a single ViewUniform. Non-multiview cameras produce single-element arrays (no behavioral change for existing shaders, which still read the first element); multiview cameras produce one element per layer. Adds set_label, add_usages, and IntoBinding to DynamicArrayUniformBuffer so the existing ViewUniforms wiring (storage usage when supported, bind-group entries via IntoBinding) keeps working. prepare_view_uniforms now runs in two passes because the dynamic offset stride isn't known until all arrays are queued: the first pass stages per-view arrays, then finish_queuing + write_buffer run, then the second pass attaches the resolved ViewUniformOffset. Shared per-camera state (viewport, frustum, lod_view_world_position) is hoisted out of the per-subview ViewUniform construction. Sets up L6, which switches the WGSL view binding to `array<View, MAX_VIEW_COUNT>` so shaders can read per-layer data indexed by @builtin(view_index). Based on the design in bevyengine#16059 by @awtterpip. Relates to bevyengine#15864.
Co-Authored-By: Piper <awtterpip@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
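Why the offsets cannot be handed out in the first pass can be seen in a small sketch. The function names, the element size, and the 256-byte alignment below are illustrative assumptions, not values taken from the PR:

```rust
/// Round `value` up to the next multiple of `alignment` (alignment > 0).
fn align_up(value: u64, alignment: u64) -> u64 {
    (value + alignment - 1) / alignment * alignment
}

/// Hypothetical second pass: one dynamic offset per staged array.
/// The stride depends on the longest array, so it is only known
/// once every array for the frame has been staged.
fn resolve_offsets(lengths: &[usize], element_size: u64, alignment: u64) -> Vec<u64> {
    let max_len = lengths.iter().copied().max().unwrap_or(0) as u64;
    let stride = align_up(max_len * element_size, alignment);
    (0..lengths.len() as u64).map(|slot| slot * stride).collect()
}
```

With 100-byte elements and 256-byte alignment, staging arrays of lengths 1 and 2 puts the second slot at offset 256, but lengths 1 and 3 move it to 512. An offset assigned before the last array was queued could therefore be stale.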
The `finish_queuing_assigns_offsets` test contained a tautological `is_none() || is_some()` assertion that always passed; replace it with an honest one and add a separate test for the pre-finish_queuing case. The `binding()` doc claimed `None` is returned only when `finish_queuing` hasn't been called, but it also returns `None` until `write_buffer` has allocated the underlying GPU buffer. Document both conditions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Multiview component previously had no bounds check: extraction inserted ExtractedMultiview for any non-None component, including the empty-views case (which the doc claimed should be ignored) and the >32-views case (which wgpu's u32 multiview_mask can't even represent). view_mask() also had a stale comment claiming the 32 cap was enforced at extraction time.
- Add MAX_VIEW_COUNT (= 32) constant alongside Multiview.
- view_mask() now returns None for views.len() > MAX_VIEW_COUNT.
- extract_cameras treats empty views as "no multiview"; for >MAX_VIEW_COUNT it warns once and falls back to non-multiview.
- Document the contract on Multiview itself.
- Add unit tests for view_mask boundaries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
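The boundary behavior can be sketched as below. This is a free function taking a count rather than the real `view_mask()` method, and returning `None` for an empty list is an assumption of the sketch (the commit only states the `> MAX_VIEW_COUNT` case for `view_mask()` itself):

```rust
use std::num::NonZeroU32;

const MAX_VIEW_COUNT: usize = 32;

/// Hypothetical stand-in for `view_mask()`: one bit per view, lowest bits first.
fn view_mask(view_count: usize) -> Option<NonZeroU32> {
    // Assumption: empty input yields no mask. Counts above the cap cannot
    // be represented in a u32 mask at all.
    if view_count == 0 || view_count > MAX_VIEW_COUNT {
        return None;
    }
    // Shift in u64 so that view_count == 32 does not overflow the shift.
    let mask = ((1u64 << view_count) - 1) as u32;
    NonZeroU32::new(mask)
}
```

Two views give `0b11`, exactly 32 views give `u32::MAX`, and 33 views give `None`.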
The previous comment claimed multiview subviews ignored `extracted_view.clip_from_world`, but the code below it still falls back to that field via `unwrap_or_else`. Rather than change the behavior (which only affects the unusual combination of multiview with a manually-set override), describe what the code actually does and call out the override-plus-multiview combo as undefined. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
For cameras with an `ExtractedMultiview` component, `prepare_view_targets` now sizes the main texture (both ping-pong attachments and the MSAA sampled attachment) with one array layer per subview, instead of always 1. Cameras with no `Multiview` are unchanged — `view_count` falls back to 1 and the default texture view still spans the single layer. The texture array count is also added to `MainTextureKey` so cameras targeting the same window/format but with different layer counts don't clobber each other in the per-frame texture cache. `ViewTarget` carries the layer count as `main_texture_array_layers` and exposes it as `multiview_count() -> Option<NonZeroU32>`, returning `None` for single-layer cameras. This gives downstream systems a frame-stable, render-side source for the multiview view count (useful for the upcoming `MAX_VIEW_COUNT` shader def, which can't be derived from the view-uniform buffer's capacity on frame 0). `TextureDimension::D2` is kept for the texture itself — wgpu allows multi-layer D2 textures, and the `D2Array` view binding is a shader-side concern handled in a later layer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
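The `multiview_count()` contract (no count for single-layer cameras) can be sketched as a free function; the real accessor is a method on `ViewTarget`, so the signature here is a simplification:

```rust
use std::num::NonZeroU32;

/// Hypothetical mirror of the accessor described above: `None` for
/// single-layer targets, `Some(layers)` once there is more than one layer.
fn multiview_count(main_texture_array_layers: u32) -> Option<NonZeroU32> {
    // NonZeroU32::new rejects 0; the filter drops the single-layer case.
    NonZeroU32::new(main_texture_array_layers).filter(|layers| layers.get() > 1)
}
```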
`mesh_view_bindings::view` (and the parallel prepass view layout) now bind an `array<View, MAX_VIEW_COUNT>` with a 1-element fallback when the shader def is undefined. A `view()` helper returns the current view via `view_array[current_view_index]`, where `current_view_index` is a `var<private>` defaulting to 0; multiview entry points will overwrite it from `@builtin(view_index)` in a follow-up. All current readers are mechanical rewrites from `view.field` to `view().field`. With `MAX_VIEW_COUNT` still unemitted by any pipeline, the fallback `array<View, 1>` path matches the single ViewUniform packed by `DynamicArrayUniformBuffer`, so non-multiview rendering is unchanged. Verified via 3d_scene screenshot smoke test (blue cube + circular plane + shadow, matches session-3 baseline) plus the standard unit suites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`MeshPipelineKey` gains a 6-bit `MAX_VIEW_COUNT` field that encodes the camera's multiview layer count (1..=32). `check_views_need_specialization` and `check_prepass_views_need_specialization` both OR in the count from the camera's `ExtractedMultiview` component (1 when absent). When the encoded count is >1, `MeshPipeline::specialize` and `PrepassPipeline::specialize` push both the `MULTIVIEW` flag and the `MAX_VIEW_COUNT` UInt def, switching the WGSL view binding to the `array<View, N>` shape and enabling the `@builtin(view_index)` paths in the entry points. Non-multiview cameras emit neither def and continue to hit the `array<View, 1>` fallback, preserving the existing render path. The mesh + prepass vertex/fragment entry points now accept `@builtin(view_index)` under `#ifdef MULTIVIEW` and assign it to `bevy_pbr::mesh_view_bindings::current_view_index` at the top of the function body, so all downstream helpers automatically read the correct per-eye view via `view()`. Verified via 3d_scene screenshot smoke test (non-multiview path unchanged) plus the standard unit suites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
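A 6-bit field is the minimum that fits the range: 5 bits hold only 0..=31, while the count can be 32. The sketch below packs the count into a `u64` key and derives the shader defs from it. The bit position, the function names, and the string form of the defs are hypothetical, not `MeshPipelineKey`'s actual layout:

```rust
const VIEW_COUNT_BITS: u64 = 6;
// Hypothetical bit position; the real key layout is not shown in the commit.
const VIEW_COUNT_SHIFT: u64 = 40;
const VIEW_COUNT_MASK: u64 = ((1u64 << VIEW_COUNT_BITS) - 1) << VIEW_COUNT_SHIFT;

/// Store `count` (1..=32) in the key, leaving all other bits untouched.
fn with_view_count(key: u64, count: u32) -> u64 {
    debug_assert!((1..=32).contains(&count));
    (key & !VIEW_COUNT_MASK) | ((count as u64) << VIEW_COUNT_SHIFT)
}

/// Read the encoded view count back out of the key.
fn view_count(key: u64) -> u32 {
    ((key & VIEW_COUNT_MASK) >> VIEW_COUNT_SHIFT) as u32
}

/// Defs are pushed only when the encoded count exceeds 1, so
/// single-view cameras keep hitting the `array<View, 1>` fallback.
fn shader_defs(key: u64) -> Vec<String> {
    let n = view_count(key);
    if n > 1 {
        vec!["MULTIVIEW".to_string(), format!("MAX_VIEW_COUNT={n}")]
    } else {
        Vec::new()
    }
}
```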
L6's view binding rename (`view` → `view_array`) introduced a `view()` helper at the old name, so files that imported `view` as a single symbol and accessed `view.field` now resolve `view` to the function and fail to parse. Three readers were missed by C1's mechanical rewrite:
- `bevy_core_pipeline::oit::oit_draw` (`view.viewport.z`)
- `bevy_dev_tools::debug_overlay` (`view.viewport.zw`)
- `bevy_pbr::meshlet::visibility_buffer_resolve` (`view.viewport.zw`, `view.world_position` ×3)
Reproduced with the `order_independent_transparency` example before this change: shader fails with `expected variable access, found "bevy_pbr::mesh_view_bindings::view"`. After this change the same example renders the expected three transparent spheres. Out-of-scope status is unchanged: these crates still don't get `@builtin(view_index)` plumbing (deferred to L7+) and continue to hit the fallback `array<View, 1>` path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L6 wired `@builtin(view_index)` into four entry points (`mesh.wgsl::vertex`, `pbr.wgsl::fragment`, `prepass.wgsl::vertex`, `prepass.wgsl::fragment`) but missed two more in the same pipeline trees that also call `view()`:
- `pbr_prepass.wgsl::fragment` (both branches). This is `StandardMaterial`'s `PrepassFragmentShader` override, so every PBR mesh in the prepass path runs it. The `#ifdef PREPASS_FRAGMENT` branch reads `view().mip_bias` and reaches `view().unjittered_clip_from_world` via `pbr_prepass_functions::calculate_motion_vector`; the `#else` branch reads `view().mip_bias` via `prepass_sample_color_and_alpha_discard`. Without plumbing, a multiview prepass would compute motion vectors against view[0] for both eyes, causing visible motion-blur / TAA artifacts.
- `wireframe.wgsl::vertex` (WIREFRAME_WIDE path). Reads `view().viewport.zw` to compute screen-space line widths; without plumbing the second eye would use view[0]'s viewport.
`mesh.wgsl::fragment` and `wireframe.wgsl::fragment` don't read `view`, so they don't need plumbing even when compiled with the multiview key. Verified via 3d_scene + wireframe screenshot smoke tests (both render unchanged) and the standard unit suites.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Foundation for L7a. `bevy_core_pipeline::input_texture` declares a binding-switched `input_texture` (`texture_2d<f32>` / `texture_2d_array<f32>` under `#ifdef MULTIVIEW`) at `@group(0) @binding(0)`, plus four sampling helpers covering the texture API surface used by the fullscreen post-fx pipelines L7a will convert (blit, bloom, FXAA):
- `sample_input(s, uv)`: basic `textureSample`.
- `sample_input_offset(s, uv, offset)`: `textureSample` with a constant pixel offset (bloom's 13-tap downsample kernel).
- `sample_input_level(s, uv, level)`: `textureSampleLevel` (FXAA reads at LOD 0).
- `sample_input_level_offset(s, uv, level, offset)`: `textureSampleLevel` with a pixel offset (FXAA's neighborhood luma samples).
Each helper hides the `texture_2d_array` `array_index` argument under MULTIVIEW, sourced from a shared `var<private> current_view_index: i32 = 0;` that consumers overwrite from `@builtin(view_index)` at the top of their fragment entry points (the same convention `bevy_pbr::mesh_view_bindings` uses for `view()`). The sampler is passed as a parameter so each pipeline can keep its local sampler binding (`s` in bloom, `samp` in FXAA, etc.) without renaming. Only the texture binding name is standardized to `input_texture` across consumers. This commit only registers the shader library via `load_shader_library!`; no pipeline imports it yet. Subsequent L7a commits convert blit, bloom, and FXAA to use it. Verified: `cargo check --workspace` clean; render/camera/pbr unit suites pass (12 + 43 + 2); 3d_scene screenshot smoke test matches the session-2/4/5 baseline (non-multiview path unchanged, as nothing imports the new library yet).
Co-Authored-By: awtterpip <awtterpip@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WGSL requires the `offset` operand of `textureSample` and
`textureSampleLevel` to be a const-expression. C0's
`sample_input_offset` and `sample_input_level_offset` helpers took
`offset: vec2<i32>` as a runtime function parameter and forwarded it to
the underlying `textureSample*`, which fails naga validation:
    error: this operation is not supported in a const context
    ┌─ embedded://bevy_core_pipeline/input_texture.wgsl:33:48
The validation error tanks the whole shared module, so as soon as any
pipeline imports `bevy_core_pipeline::input_texture` (initially the blit
pipeline in C1), every consumer fails to load — visible as a blank
swapchain on macOS because the upscaling node falls through to the
empty render-pass branch when its pipeline isn't ready.
Reproduced by running 3d_scene against C1 with the offset variants
still present: black screen + the naga error in the log. After this
commit C1 verifies cleanly.
Const-offset sampling can't be helpered in WGSL; the offset must be a
literal at the callsite. The two pipelines that use it (bloom's 13-tap
downsampling kernel; FXAA's neighborhood luma samples) will instead
`#ifdef MULTIVIEW` at the callsite — awtterpip's original convention.
Documented in the helper file's comment block.
C0's claim that the file covers "the texture API surface used by the
fullscreen post-fx pipelines L7a will convert" is now narrower than
written: it covers the non-offset subset.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`BlitPipelineKey` gains a `multiview_view_count: u32` field encoding
the source texture's layer count (1 for single-view, N for multiview
cameras after L5). `BlitPipeline` now stores two bind-group layouts —
the existing single-layer one plus `layout_multiview` whose texture
binding is `texture_2d_array` — and `create_bind_group` takes the
multiview count to pick between them.
`BlitPipeline::specialize` chooses the layout from
`key.multiview_view_count > 1` and emits `MULTIVIEW` + `MAX_VIEW_COUNT`
shader-defs into the fragment stage when so (mirroring L6's mesh-key
threading). The pipeline descriptor's own `multiview_mask` stays
`None`, matching L6: the shader machinery is in place but the
render-pass-level multiview enablement is deferred.
The two systems that currently build `BlitPipelineKey`,
`prepare_view_upscaling_pipelines` in `bevy_core_pipeline::upscaling`
and `prepare_msaa_writeback_pipelines` in
`bevy_post_process::msaa_writeback`, now query `Option<&ExtractedMultiview>`
on the camera and feed `subviews.len()` (or 1 when absent) into the key.
Their render-pass nodes pass `target.multiview_count()` from the
already-L5-aware `ViewTarget` to `create_bind_group`, picking the
matching layout.
`blit.wgsl` switches from a locally-declared `in_texture` binding to
`#import bevy_core_pipeline::input_texture::{sample_input,
current_view_index}`. The `fs_main` entry point takes
`@builtin(view_index)` under `#ifdef MULTIVIEW` and assigns it to
`current_view_index` at the top of the body — same shape as L6's mesh
and prepass entry points.
This is the first consumer of the `bevy_core_pipeline::input_texture`
helper module added in C0 (`d12900fc2`).
Verified: `cargo check --workspace` clean; render/camera/pbr unit
suites pass (12 + 43 + 2); 3d_scene screenshot matches the session-2/
4/5 baseline (blue cube + circular plane + shadow). On macOS Metal the
multiview branch can't actually be exercised at runtime (no Vulkan
multiview), so the multiview render path is static-reasoning until a
Vulkan host runs it; the non-multiview path is the verified one.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Of bloom's four pipeline specializations (first downsample, regular
downsample, regular upsample, final upsample), only the first
downsample reads from the camera's main texture — every other pass
samples bloom's own mip pyramid, which `prepare_bloom_textures` builds
as a single-layer 2D texture irrespective of the camera. Multiview
specialization is therefore scoped to that one pass.
`BloomDownsamplingPipelineKeys` gains `multiview_view_count: u32`,
which `prepare_downsampling_pipeline` reads from the camera's optional
`ExtractedMultiview` (1 when absent) and threads only into the
`first_downsample = true` specialization; the `first_downsample =
false` specialization is locked to count = 1 so the regular downsample
pipeline continues to bind the bloom mip texture as `texture_2d`.
`BloomDownsamplingPipeline` now carries two layouts: the existing
`bind_group_layout` plus a `bind_group_layout_multiview` whose texture
binding is `texture_2d_array`. `specialize` picks the array layout +
emits `MULTIVIEW` and `MAX_VIEW_COUNT` shader-defs only when both
conditions hold. The bloom node's inline-built first-downsample bind
group picks the matching layout via `view_target.multiview_count()`.
`bloom.wgsl` switches from a locally-declared `input_texture` to
`#import bevy_core_pipeline::input_texture::{input_texture,
sample_input, current_view_index}`. The non-uniform-scale 13-tap path
and the 3x3 tent kernel use the `sample_input` helper (their offsets
are runtime uv arithmetic, not const operands of `textureSample`). The
`UNIFORM_SCALE` 13-tap path duplicates its 13 const-offset samples
under `#ifdef MULTIVIEW` because WGSL requires the `offset` operand of
`textureSample` to be a const-expression — it can't be threaded
through a helper. The duplication is contained to ~13 lines, gated on
the multiview path only.
`@builtin(view_index)` plumbing is added to the `downsample_first`
fragment entry point only (the `downsample` and `upsample` entry
points never see `MULTIVIEW`, so their `texture_2d` binding and
helper-less sampling shapes are preserved).
Verified: `cargo check --workspace` clean; render/camera/pbr unit
suites pass (12 + 43 + 2); both `bloom_3d` (multi-tap blur on
colored spheres) and `3d_scene` (no bloom, regression check) match
the session baseline. Multiview path is static-reasoning until a
Vulkan host runs it; on macOS Metal the multiview branch can't be
exercised at runtime.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
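The scoping rule for bloom (only the first downsample sees the camera's layer count) can be sketched as follows. `InputBinding`, `downsample_view_count`, and `input_binding` are hypothetical names for illustration, not items from the bloom module:

```rust
/// Which WGSL texture type the downsample input binding uses.
#[derive(Debug, PartialEq, Clone, Copy)]
enum InputBinding {
    Texture2d,
    Texture2dArray,
}

/// Only the first downsample reads the camera's main texture; every later
/// pass reads bloom's own single-layer mip pyramid, so its count is locked to 1.
fn downsample_view_count(first_downsample: bool, camera_view_count: u32) -> u32 {
    if first_downsample {
        camera_view_count.max(1)
    } else {
        1
    }
}

/// The array layout is picked only when the effective count exceeds 1.
fn input_binding(view_count: u32) -> InputBinding {
    if view_count > 1 {
        InputBinding::Texture2dArray
    } else {
        InputBinding::Texture2d
    }
}
```

For a two-eye camera, the first downsample resolves to the array binding while the regular downsample stays on `texture_2d`, matching the "both conditions hold" rule in the commit.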
`FxaaPipelineKey` gains `multiview_view_count: u32`, read by
`prepare_fxaa_pipelines` from the optional `ExtractedMultiview` (1 when
absent). `FxaaPipeline` carries a second `texture_bind_group_multiview`
layout whose texture binding is `texture_2d_array`. `specialize` picks
the array layout + emits `MULTIVIEW` + `MAX_VIEW_COUNT` shader-defs
when count > 1; the descriptor's `multiview_mask` stays `None`, matching
L6 and L7a's other pipelines.
The FXAA node picks the matching layout via
`view_target.multiview_count()` when building the cached bind group.
The cache key (source `TextureViewId`) already invalidates across
multiview/non-multiview state changes — the texture view IDs differ
because the underlying `ViewTarget` swaps between single-layer and
array-layer textures.
`fxaa.wgsl` switches from a locally-declared `screenTexture` (renamed
to `input_texture`) to `#import bevy_core_pipeline::input_texture::
{input_texture, sample_input_level, current_view_index}`. Of the 12
`textureSampleLevel` callsites:
- 4 no-offset reads (center + 3 endpoint samples + final-uv read) and
the in-loop endpoint refreshes now go through `sample_input_level`.
- 8 const-offset reads (4 cardinal-neighbor lumas + 4 corner lumas)
duplicate under `#ifdef MULTIVIEW` because WGSL requires the
`offset` operand to be a const-expression and can't be threaded
through a helper.
`@builtin(view_index)` is plumbed into the `fragment` entry point
under `#ifdef MULTIVIEW` and assigned to `current_view_index` at the
top of the body.
Verified: `cargo check --workspace` clean; render/camera/pbr unit
suites pass (12 + 43 + 2); `anti_aliasing` example with FXAA enabled
on the helmet scene renders correctly (no artifacts, no validation
errors). Multiview branch is static-reasoning until a Vulkan host
runs it; on macOS Metal only the non-multiview path is exercised.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Foundation for L7b-read (PBR mesh screen-space texture conversion): add `MULTIVIEW` to `MeshPipelineViewLayoutKey`, derive it from `MeshPipelineKey::max_view_count() > 1` in the `From` impl, and OR it into the per-frame layout key in `prepare_mesh_view_bind_groups` based on `ViewTarget::multiview_count()`. Switch binding 17 (`screen_space_ambient_occlusion_texture`) to `texture_2d_array<f32>` under MULTIVIEW in `mesh_view_bindings.wgsl` and in `layout_entries`. The underlying SSAO texture stays single-layer (`depth_or_array_layers = 1`); the bind group entry creates a `TextureViewDimension::D2Array` view of it inline when multiview is active. Layer growth + per-eye SSAO writes are deferred to L7b-write. Thread `current_view_index` into the two SSAO readers via a verbose `#ifdef MULTIVIEW` branch around `textureLoad`, since WGSL's `texture_2d` and `texture_2d_array` `textureLoad` signatures differ (`array_index` goes between `coords` and `level`). `deferred_lighting.wgsl::fragment` also gains `@builtin(view_index)` + the assignment to `current_view_index` that PBR's main fragment already has from L6. Non-multiview path is bit-identical (smoke verified on the `ssao` example against pre-C1 baseline; 3d_scene regression preserved). multiview branch is unexercised on macOS Metal but the layout/view shapes line up with the WGSL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-read C2: depth, normal, motion-vector, and deferred prepass texture bindings (binding indices 20-23) switch to `_array` variants under `#ifdef MULTIVIEW` in `mesh_view_bindings.wgsl`. Host-side `prepass::get_bind_group_layout_entries` learns the same switch and `prepass::get_bindings` grows two flags (`multiview_array` for bindings 20-22, `deferred_multiview` for binding 23) so the caller can request `D2Array` views of the still-single-layer prepass textures. WGSL has no multisampled-array texture types, so MSAA + multiview keeps the single-layer multisampled shape (rare combo in practice). WGSL consumers thread `current_view_index` as the array layer in their `textureLoad`/`textureSampleLevel` calls via `#ifdef MULTIVIEW` duplication:
- `bevy_pbr::prepass_utils` (depth/normal/motion-vector reads)
- `bevy_pbr::pbr_deferred_functions` indirectly via `bevy_pbr::deferred_lighting::fragment` (deferred read at line 49). The fragment also gains `@builtin(view_index)` + the assignment to `current_view_index` for its own deferred read and the SSAO read introduced in C1.
- `bevy_pbr::ssr` (SSR fragment) + `bevy_pbr::raymarch` (depth fetch and bilinear/nearest sample helpers).
- `bevy_dev_tools::debug_overlay` (7 textureLoad sites across depth/normal/motion-vector/deferred preview modes).
Pipeline specialize now also pushes the `MULTIVIEW` shader-def from the layout-key bit, paralleling the existing `MULTISAMPLED` emission. `textureDimensions` calls work unchanged on array textures, so `pbr_functions.wgsl::317` and `ssr.wgsl::103` stay as-is. The deferred prepass texture is never multisampled, so its multiview switch is unconditional on the MULTIVIEW bit. Non-multiview path is bit-identical (smoke verified on `deferred_rendering`, `anti_aliasing` with TAA via a temporary `Msaa::Off`+`TemporalAntiAliasing::default()` on the example camera that was reverted after, and `3d_scene` regression). Multiview branch is unexercised on macOS Metal but the layout/view shapes line up with the WGSL.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-read C3 — `view_transmission_texture` (binding 24) switches to `texture_2d_array<f32>` under `#ifdef MULTIVIEW` in `mesh_view_bindings.wgsl` and in the host-side bind group layout. The two `textureSampleLevel` reads in `transmission.wgsl` (`fetch_transmissive_background_non_rough` + the main spiral-tap `fetch_transmissive_background`) duplicate under `#ifdef MULTIVIEW` to pass `view_bindings::current_view_index` between `coords` and `level` — WGSL's `texture_2d` vs `texture_2d_array` `textureSampleLevel` signatures differ on the array-layer parameter, same as the `textureLoad` pattern from C1/C2. Caller (`pbr_input_from_standard_material`) is invoked from `pbr.wgsl::fragment`, which already sets `current_view_index` from L6. Bind-group construction in `prepare_mesh_view_bind_groups` creates a fresh `TextureViewDimension::D2Array` view of the still-single-layer transmission texture (or, when the camera has no transmission setup this frame, of the `FallbackImageZero` texture) when `is_multiview`, keeping the default_view path otherwise. Per-eye transmission writes + layer growth are deferred to L7b-write. Non-multiview path is bit-identical (smoke verified on `transmission` + `3d_scene` regression). multiview branch is unexercised on macOS Metal but the layout/view shapes line up with the WGSL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-read follow-up F1. Both `DeferredLightingLayout::specialize` and `ScreenSpaceReflectionsPipeline::specialize` already derive their view bind-group layout from a `MeshPipelineViewLayoutKey` that, as of L7b-read C1, gains the `MULTIVIEW` bit when `max_view_count > 1` via the `From<MeshPipelineKey>` impl. Under multiview that picks the array-typed layout (binding 17 `texture_2d_array<f32>`, binding 20 `texture_depth_2d_array`, binding 23 `texture_2d_array<u32>`, etc.) but neither pipeline pushed the `MULTIVIEW` shader-def, so their WGSL (`deferred_lighting.wgsl`, `ssr.wgsl`, `raymarch.wgsl`, all modified by C1/C2) would compile against the non-multiview `#else` branches in `mesh_view_bindings.wgsl` and mismatch the layout's array texture types. Pipeline creation would fail wgpu validation the moment a multiview camera fires. Mirror the pattern the other consumers of `MeshPipelineViewLayoutKey` already use:
- `DeferredLightingLayout::specialize` pushes `MULTIVIEW` from `key.max_view_count() > 1` (parallels `MeshPipeline::specialize` at `mesh.rs:3330` which has done this since L6).
- `ScreenSpaceReflectionsPipeline::specialize` pushes `MULTIVIEW` from `key.mesh_pipeline_view_key.contains(MULTIVIEW)` (parallels `RenderDebugOverlayPipeline::specialize` which C2 already updated).
Non-multiview path is bit-identical (no def pushed when the bit isn't set; verified via `ssr`, `deferred_rendering`, and `3d_scene` smoke screenshots against the pre-F1 baseline). Multiview branch is unexercised on macOS Metal but the layout/WGSL/views now agree shape-wise.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-write C1 — first of three commits growing the screen-space texture
families converted by L7b-read (SSAO / prepass / transmission) from
`depth_or_array_layers = 1` to `view_count` and routing each eye's writes
into its own slice.
`prepare_ssao_textures` derives `view_count` from `Option<&ExtractedMultiview>`
(falls back to 1 with no component) and sets `depth_or_array_layers =
view_count` on all four SSAO textures (preprocessed depth + 5 mips, noisy,
output, depth_differences). Non-multiview cameras allocate single-layer
textures — bit-identical to the pre-L7b-write shape.
`SsaoBindGroups` keeps a single `common_bind_group` (the view-uniform
binding, which is per-camera not per-eye) but the three pipeline-specific
groups become `Vec<SsaoPerViewBindGroups>` — one entry per eye. Each
entry's storage views are explicit single-layer `D2` views
(`base_array_layer: eye`, `array_layer_count: Some(1)`, `dimension: D2`)
into the eye's slice of each SSAO texture. The prepass depth/normal reads
also become per-layer `D2` views of the prepass attachment textures when
multiview is active; the non-multiview branch keeps the existing
`prepass_textures.{depth,normal}_view()` helpers (which return
`default_view`, currently single-layer). Once L7b-write C2 lands the
prepass textures grow to `view_count` layers and the per-layer helper
closures here will index real per-eye data.
The `ssao` render-graph node wraps its three compute passes in a
`for per_view in &bind_groups.per_view` loop — for non-multiview cameras
this is one iteration with the same bind groups today's code already
built, so the dispatch shape is unchanged.
Out of scope for this commit (and the rest of L7b-write):
- SSAO's `@group(1) @binding(2) var<uniform> view: View` still reads the
camera's "eye 0" view-matrix slot — for multiview both eyes get
reconstructed against the head-pose camera matrices. Real per-eye
matrices need either an L6-style array-binding rewrite of the SSAO
shaders or per-eye dynamic offsets into the packed view-uniform
buffer (DynamicArrayUniformBuffer slots are sized per-array, not
per-element). Documented for L8/L9.
- Bind-group layouts stay `texture_storage_2d` / `texture_2d`. With
per-eye single-layer `D2` views the existing layout fits without a
layout-key MULTIVIEW bit; no SSAO WGSL changes either.
Non-multiview path verified bit-identical (smoke on the `ssao` example
matches the post-rebase baseline; AO shading on the room walls + sphere
unchanged). Multiview branch is unexercised on macOS Metal but the
view dimensions match the layout entries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
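The per-eye view shape from the SSAO commit above (one single-layer view per eye, each based at that eye's layer) can be sketched with a plain struct. `LayerViewDesc` and `per_eye_views` are hypothetical; the real code builds wgpu texture views with `dimension: D2`:

```rust
/// Hypothetical, simplified stand-in for the layer fields of a texture
/// view descriptor; only the two fields relevant here are modeled.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct LayerViewDesc {
    base_array_layer: u32,
    array_layer_count: u32,
}

/// One single-layer view per eye. Non-multiview cameras (count 0 or 1)
/// get exactly one view at layer 0, so the dispatch loop runs once.
fn per_eye_views(view_count: u32) -> Vec<LayerViewDesc> {
    (0..view_count.max(1))
        .map(|eye| LayerViewDesc {
            base_array_layer: eye,
            array_layer_count: 1,
        })
        .collect()
}
```

Iterating the returned list corresponds to the `for per_view in &bind_groups.per_view` loop: one iteration for a regular camera, two for a stereo camera.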
L7-skybox — converts the skybox pipeline's `View` uniform binding to the
same `array<View, MAX_VIEW_COUNT>` shape L6 introduced for the mesh +
prepass view binding. Punted from L7a in session 6 because the skybox
is structurally a view-uniform rewrite (typed `View` binding, cubemap,
fragment-side `view.X` reads) rather than an L7a-style texture-array
conversion.
`skybox.wgsl`:
- `var<uniform> view: View` becomes a fixed-size
`var<uniform> view_array: array<View, #{MAX_VIEW_COUNT}>` (sized by the
shader def) with a 1-element fallback when `MAX_VIEW_COUNT` is undefined.
- Adds `var<private> current_view_index: i32 = 0;` and a `view()` helper
returning `view_array[current_view_index]`. Per-site rewrite: `view.X`
→ `view().X` (3 reads in `coords_to_ray_direction`, 1 in
`skybox_fragment`).
- New `FragmentInput` struct so the fragment can pick up
`@builtin(view_index)` under `#ifdef MULTIVIEW` and assign it to
`current_view_index` at the top of the body. The vertex stage doesn't
read `view` at all (it just generates a fullscreen triangle from
`vertex_index`), so it stays untouched.
`skybox/mod.rs`:
- Layout entry switches from `uniform_buffer::<ViewUniform>(true)` to
`uniform_buffer_sized(true, None)` — wgpu's binding-size check is then
satisfied by both `array<View, 1>` and `array<View, MAX_VIEW_COUNT>`.
- `SkyboxPipelineKey` gains a `multiview_view_count: u32` field; the
specialize pushes `MULTIVIEW` + `MAX_VIEW_COUNT` shader-defs when the
count is `> 1` (mirrors `MeshPipeline::specialize` in `mesh.rs`).
- `prepare_skybox_pipelines` reads `Option<&ExtractedMultiview>` and
sets `multiview_view_count` from `subviews.len()` (falls back to 1
with no component). Source matches the L6 mesh + prepass convention.
`descriptor.multiview_mask` stays `None` — same deferral as L6/L7a/L7b.
The wgpu render-pass enablement is L7d's job. On non-multiview Vulkan
hosts the pipeline compiles into the existing single-view shape; on
multiview Vulkan hosts pipeline creation will still fail validation
because `@builtin(view_index)` is read without a `multiview_mask`.
Out of scope:
- Per-eye skybox sampling (rotation, parallax) has the right shape now
but is unexercised on macOS Metal. The math in
`coords_to_ray_direction` reads `view().view_from_clip` and
`view().world_from_view` per eye, so once L7d enables single-pass
multiview the skybox will produce per-eye correct rays.
- Skybox-prepass is a separate pipeline (in
`skybox/prepass.rs` per awtterpip's design notes) and isn't shipped
in current Bevy — no work needed here.
Non-multiview path verified bit-identical (smoke on the `skybox`
example with `--features bevy_ci_testing free_camera` matches the prior
baseline; Ryfjallet cubemap + red hut + ground render cleanly).
The multiview branch is unexercised on macOS Metal but the binding /
layout / shader-def shapes line up with L6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-write C1 (`80f52dd9d`) grew the SSAO textures to `view_count` layers
and started creating per-layer `D2` views of the prepass depth/normal
attachment textures for each eye:
    let prepass_depth_view = is_multiview.then(|| {
        prepass_layer_view(
            &prepass_textures.depth.as_ref().unwrap().texture.texture,
            layer,
            "ssao_prepass_depth_layer_view",
        )
    });
`is_multiview` is derived from the SSAO textures' layer count. But the
prepass attachment textures themselves are still single-layer — C2
(prepass texture growth) is deferred. So for `layer >= 1`,
`base_array_layer: layer, array_layer_count: Some(1)` of a 1-layer
texture is a wgpu validation error. C1's commit body acknowledged the
timing ("once C2 lands the per-layer helper closures here will index real
per-eye data") but missed that the intermediate state fails validation at
view creation rather than silently producing wrong content.
`prepare_ssao_bind_groups` runs unconditionally per frame, so any
multiview-configured camera with `DepthPrepass + NormalPrepass + SSAO`
would trip the validation regardless of pipeline-creation outcome.
Gate the per-layer prepass view creation on the prepass texture's
actual `depth_or_array_layers > 1`, not on the SSAO texture's. Until
C2 lands, both `prepass_depth_multilayer` and `prepass_normal_multilayer`
are `false` and all eyes fall back to
`prepass_textures.{depth,normal}_view()` (default_view, single-layer).
SSAO output for `eye >= 1` is then computed against eye-0 prepass data
— content-incorrect but matching the still-eye-0 read of `view: View`
documented in C1's body, and no crash. When C2 grows the prepass
textures, the flags auto-flip and the existing per-layer view code
becomes valid with no further SSAO-side change.
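A minimal sketch of the gate described above, under stated assumptions: the enum and both function names are hypothetical, and the range check is a simplified model of the layer-range rule that the validation error comes from, not wgpu's actual code.

```rust
/// Which prepass view an eye samples in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum PrepassViewChoice {
    /// A single-layer `D2` view starting at the given array layer.
    Layer(u32),
    /// The texture's default single-layer view (eye-0 data).
    Default,
}

/// Simplified model of the layer-range rule: the requested range must
/// fit inside the texture's layer count.
pub fn layer_view_is_valid(
    texture_layers: u32,
    base_array_layer: u32,
    array_layer_count: u32,
) -> bool {
    base_array_layer
        .checked_add(array_layer_count)
        .map_or(false, |end| end <= texture_layers)
}

/// Gate on the prepass texture's own layer count, not on the SSAO
/// texture's. A 1-layer prepass texture always yields the default view.
pub fn choose_prepass_view(prepass_layers: u32, eye: u32) -> PrepassViewChoice {
    let prepass_multilayer = prepass_layers > 1;
    if prepass_multilayer {
        PrepassViewChoice::Layer(eye)
    } else {
        PrepassViewChoice::Default
    }
}
```

With a 1-layer prepass texture, `layer_view_is_valid(1, 1, 1)` is false, which is the case the gate avoids by returning the default view for every eye.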
Non-multiview path unchanged: prepass textures stay 1-layer, flags
stay false, and no per-layer prepass views are ever created. `cargo check`
is clean; lib tests pass (12 + 43 + 2); ssao + skybox smoke screenshots
match the post-L7b-write-C1 baselines.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7c — first pipeline in the L7c tail; converts the tonemapping
post-process pipeline's `View` uniform binding to the same
`array<View, MAX_VIEW_COUNT>` shape L6 introduced for the mesh + prepass
view binding and L7-skybox mirrored. Same L6-shape pattern, applied to
tonemapping's smaller surface (single `view.X` read; uses the existing
`FullscreenVertexOutput` so it follows blit's direct
`@builtin(view_index)` parameter rather than skybox's wrapper struct).
`tonemapping.wgsl`:
- `var<uniform> view: View` becomes a shader-def-sized
`var<uniform> view_array: array<View, #{MAX_VIEW_COUNT}>` with a
1-element fallback when `MAX_VIEW_COUNT` is undefined.
- Adds `var<private> current_view_index: i32 = 0;` and a `view()` helper
returning `view_array[current_view_index]`.
- Single per-site rewrite: `view.color_grading` → `view().color_grading`.
- Fragment signature gains `@builtin(view_index) view_index: i32` as a
direct parameter under `#ifdef MULTIVIEW` (mirrors `blit.wgsl::fs_main`).
Body assigns `current_view_index = view_index;` at top under the same
gate.
`tonemapping/mod.rs`:
- Layout entry 0 switches from `uniform_buffer::<ViewUniform>(true)` to
`uniform_buffer_sized(true, None)` — wgpu's binding-size check is then
satisfied by both `array<View, 1>` and `array<View, MAX_VIEW_COUNT>`.
Import swap: drops `uniform_buffer` + `ViewUniform`, picks up
`uniform_buffer_sized` and `ExtractedMultiview`.
- `TonemappingPipelineKey` gains a `multiview_view_count: u32` field;
`specialize` pushes `MULTIVIEW` + `MAX_VIEW_COUNT` shader-defs when
the count is `> 1` (mirrors `SkyboxPipeline::specialize` and
`MeshPipeline::specialize`).
- `prepare_view_tonemapping_pipelines` reads
`Option<&ExtractedMultiview>` and sets `multiview_view_count` from
`subviews.len()` (falls back to 1 with no component). Source matches
the L6 mesh + prepass + L7-skybox convention.
`descriptor.multiview_mask` stays `None` — same L6/L7a/L7b/L7-skybox
deferral. The wgpu render-pass enablement is L7d's job. On non-multiview
hosts the pipeline compiles into the existing single-view shape; on
multiview Vulkan hosts pipeline creation will still fail validation
because `@builtin(view_index)` is read without a `multiview_mask`.
Out of scope:
- `hdr_texture` (binding 1) stays single-layer `texture_2d<f32>`. The
post-process source texture isn't part of L7b-write yet (only SSAO
has shipped from L7b-write; C2 prepass + C3 transmission are punted),
so the tonemapping input is still a single-layer view of
`target.post_process_write().source`. When L7b-write grows the
post-process color texture to per-eye layers, tonemapping's source
binding will need its own L7b-read-shape conversion.
- LUT bindings 3 + 4 are unrelated to multiview — both stay
`texture_3d<f32>` + sampler.
- The node-side `TonemappingBindGroupCache` shape is unchanged. Cache
key is `(view_uniforms.buffer.id, source.id, lut.id)`; per-view
selection inside the packed `DynamicArrayUniformBuffer` is still
driven by `view_uniform_offset.offset` at `set_bind_group` call time,
so cache hits remain correct across views.
Non-multiview path verified bit-identical: `cargo check --workspace`
clean; lib tests bevy_render 12/12, bevy_camera 43/43, bevy_pbr 2/2,
bevy_core_pipeline 0/0; smoke on `3d_scene --features bevy_ci_testing`
(default Camera3d → `hdr: true` + `Tonemapping::TonyMcMapface`, so the
tonemapping pipeline including the LUT path actually ran) renders blue
cube + shadow on circular ground cleanly. The multiview branch is
unexercised on macOS Metal but the binding / layout / shader-def shapes
line up with L6 + L7-skybox.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7c — second pipeline in the L7c tail (after tonemapping); converts the
OIT resolve pipeline's `View` uniform binding to the same
`array<View, MAX_VIEW_COUNT>` shape L6 introduced for the mesh + prepass
view binding and L7-skybox + L7c-tonemapping mirrored. Smallest possible
L7c-shape diff — a single `view.X` read site (`view.viewport.z`) and an
already-local `FullscreenVertexOutput` struct so the fragment can pick
up `@builtin(view_index)` as a direct parameter (mirrors `blit.wgsl::fs_main`
and `tonemapping.wgsl::fragment`).
`oit_resolve.wgsl`:
- `var<uniform> view: View` becomes a shader-def-sized
`var<uniform> view_array: array<View, #{MAX_VIEW_COUNT}>` with a
1-element fallback when `MAX_VIEW_COUNT` is undefined.
- Adds `var<private> current_view_index: i32 = 0;` and a `view()` helper
returning `view_array[current_view_index]`.
- Single per-site rewrite: `view.viewport.z` → `view().viewport.z`.
- Fragment signature gains `@builtin(view_index) view_index: i32` as a
direct parameter under `#ifdef MULTIVIEW`. Body assigns
`current_view_index = view_index;` at top under the same gate.
`oit/resolve/mod.rs`:
- Layout entry 0 switches from `uniform_buffer::<ViewUniform>(true)` to
`uniform_buffer_sized(true, None)` — wgpu's binding-size check is then
satisfied by both `array<View, 1>` and `array<View, MAX_VIEW_COUNT>`.
Import swap: drops `uniform_buffer` + `ViewUniform`, picks up
`uniform_buffer_sized` and `ExtractedMultiview`.
- `OitResolvePipelineKey` gains a `multiview_view_count: u32` field;
`specialize_oit_resolve_pipeline` pushes `MULTIVIEW` + `MAX_VIEW_COUNT`
shader-defs when the count is `> 1` (mirrors `SkyboxPipeline::specialize`,
`TonemappingPipeline::specialize`, and `MeshPipeline::specialize`).
- `queue_oit_resolve_pipeline` reads `Option<&ExtractedMultiview>` from
its camera query and sets `multiview_view_count` from `subviews.len()`
(falls back to 1 with no component). The per-entity
`cached_pipeline_id` HashMap already keys on the full
`OitResolvePipelineKey`, so a non-multiview ↔ multiview transition on
the same camera correctly evicts the stale cache entry and queues a
fresh pipeline.
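The eviction behaviour in the last bullet can be pictured with a small per-entity cache. This is a sketch only: `ResolveKey`, its `hdr` field, the integer ids, and `get_or_queue` are all hypothetical, and the real code stores Bevy's own entity and pipeline id types.

```rust
use std::collections::HashMap;

/// Hypothetical pipeline key; only `multiview_view_count` comes from the
/// commit message, the `hdr` field is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ResolveKey {
    pub hdr: bool,
    pub multiview_view_count: u32,
}

pub type EntityId = u64;
pub type PipelineId = u32;

/// Per-entity cache that remembers the full key next to the pipeline id.
#[derive(Default)]
pub struct PerEntityPipelineCache {
    entries: HashMap<EntityId, (ResolveKey, PipelineId)>,
    next_id: PipelineId,
}

impl PerEntityPipelineCache {
    /// Returns the pipeline id and whether a fresh pipeline was queued.
    /// A changed key on the same entity replaces the stale entry.
    pub fn get_or_queue(&mut self, entity: EntityId, key: ResolveKey) -> (PipelineId, bool) {
        if let Some((cached_key, id)) = self.entries.get(&entity) {
            if *cached_key == key {
                return (*id, false);
            }
        }
        let id = self.next_id;
        self.next_id += 1;
        self.entries.insert(entity, (key, id));
        (id, true)
    }
}
```

Because the comparison covers the whole key, adding `multiview_view_count` to the key is enough to invalidate the entry when a camera switches between single-view and multiview.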
`descriptor.multiview_mask` stays `None` — same L6/L7a/L7b/L7-skybox/
L7c-tonemapping deferral. The wgpu render-pass enablement is L7d's job.
On non-multiview hosts the pipeline compiles into the existing
single-view shape; on multiview Vulkan hosts pipeline creation will
still fail validation because `@builtin(view_index)` is read without a
`multiview_mask`.
Out of scope:
- The `OitResolveBindGroup` is a single global Resource recreated each
frame in `prepare_oit_resolve_bind_group`. With layout entry 0 now
unbounded, the same bind group binds against both the multiview and
non-multiview pipeline shapes. Per-camera selection inside the packed
`DynamicArrayUniformBuffer` is still driven by `view_uniform.offset`
at `set_bind_group` call time (`node.rs:82`).
- Storage buffer bindings 1-3 (`nodes`, `heads`, `atomic_counter`) are
per-screen-pixel + per-pass state, unrelated to the per-view View
uniform. Index math uses `view().viewport.z` (screen width) which is
identical across eyes, so multiview correctness will need either a
per-eye linked-list partitioning of the storage buffers (heads array
sized per-eye) or an explicit decision to accept that the OIT linked list
is screen-space shared. Future L7b-write / L7d concern, not L7c-shape work.
- Optional group(1) depth bind group (`depth: texture_depth_2d`,
conditional on `!DEPTH_PREPASS`) stays as a non-multiview `D2`
texture view of `ViewDepthTexture` — same gate as the rest of the
prepass depth-view path. When L7b-write grows the depth texture per
eye, this binding will need its own per-layer view treatment.
Non-multiview path verified bit-identical: `cargo check --workspace`
clean; lib tests bevy_render 12/12, bevy_camera 43/43, bevy_pbr 2/2,
bevy_core_pipeline 0/0; smoke on `order_independent_transparency
--features bevy_ci_testing` renders three overlapping transparent
spheres (red/blue/green) with correct depth-sorted alpha blending and
the "Order Independent Transparency: On" toggle confirming the resolve
pipeline ran. The multiview branch is unexercised on macOS Metal but the
binding / layout / shader-def shapes line up with L6 + L7-skybox +
L7c-tonemapping.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7c — third pipeline in the L7c tail (after tonemapping + OIT resolve);
converts the background motion vectors prepass pipeline's `View` uniform
binding to the same `array<View, MAX_VIEW_COUNT>` shape L6 introduced
for the mesh + prepass view binding and L7-skybox + L7c-tonemapping +
L7c-OIT-resolve mirrored. Two `view.X` read sites
(`view.world_from_clip`, `view.unjittered_clip_from_world`).
`background_motion_vectors.wgsl`:
- `var<uniform> view: View` becomes a shader-def-sized
`var<uniform> view_array: array<View, #{MAX_VIEW_COUNT}>` with a
1-element fallback when `MAX_VIEW_COUNT` is undefined.
- Adds `var<private> current_view_index: i32 = 0;` and a `view()` helper
returning `view_array[current_view_index]`.
- Per-site rewrites: `view.world_from_clip` → `view().world_from_clip`;
`view.unjittered_clip_from_world` → `view().unjittered_clip_from_world`.
- Fragment signature gains `@builtin(view_index) view_index: i32` as a
direct parameter under `#ifdef MULTIVIEW` (mirrors `blit.wgsl::fs_main`
and other L7c conversions). Body assigns
`current_view_index = view_index;` at top under the same gate.
- `previous_view` binding at `@group(0) @binding(1)` is left untouched
(see "Out of scope" below).
`background_motion_vectors.rs`:
- Layout entry 0 switches from `uniform_buffer::<ViewUniform>(true)` to
`uniform_buffer_sized(true, None)` — wgpu's binding-size check is then
satisfied by both `array<View, 1>` and `array<View, MAX_VIEW_COUNT>`.
Layout entry 1 (`PreviousViewData` uniform) stays as the existing
typed `uniform_buffer::<PreviousViewData>(true)`. Import swap: drops
`ViewUniform`, adds `uniform_buffer_sized`, `ExtractedMultiview`, and
`ShaderDefVal`.
- `BackgroundMotionVectorsPipelineKey` gains a `multiview_view_count: u32`
field; `specialize` pushes `MULTIVIEW` + `MAX_VIEW_COUNT` shader-defs
when the count is `> 1` (mirrors the other L7c pipelines).
- `prepare_background_motion_vectors_pipelines` reads
`Option<&ExtractedMultiview>` from its camera query and sets
`multiview_view_count` from `subviews.len()` (falls back to 1 with no
component). Source matches the L6 mesh + prepass + L7-skybox +
L7c-tonemapping + L7c-OIT-resolve convention.
`descriptor.multiview_mask` stays `None` — same L6/L7a/L7b/L7-skybox/
L7c-tonemapping/L7c-OIT-resolve deferral. The wgpu render-pass
enablement is L7d's job.
Out of scope:
- `previous_view: PreviousViewUniforms` binding (group 0 / binding 1).
This is a per-camera (single-eye-equivalent) uniform sourced from
the `PreviousViewUniforms` resource, with separate plumbing from the multiview
`ViewUniform` packed array. Under multiview today, all eyes read the
same `previous_view.clip_from_world`, which means eye-1's sky-pixel
motion vector subtracts eye-0-derived previous clip-space from
eye-1-derived current clip-space — incorrect for stereo VR where
each eye has its own previous transform. Converting `PreviousViewData`
to a packed-array per-camera shape parallels L4's
`DynamicArrayUniformBuffer<ViewUniform>` work — future L8-style
refactor that lives in `crates/bevy_core_pipeline/src/prepass/mod.rs`,
not L7c-shape work. Today this is a non-issue because the multiview
branch is unexercised on Metal anyway, and once L7d enables it on
Vulkan, only sky-pixel motion vectors on eye>=1 are affected (TAA +
motion blur on the background would smear slightly wrong for the
second eye until L8 lands).
- The pipeline bind-group is per-camera (created in
`prepare_background_motion_vectors_bind_groups`) and inserted as a
`BackgroundMotionVectorsBindGroup` component. With layout entry 0 now
unbounded, the same bind group binds against both pipeline variants.
Per-eye selection inside the packed `DynamicArrayUniformBuffer` is
driven by `view_uniform_offset.offset` at the node's `set_bind_group`
call site (in `prepass/node.rs`, unchanged).
- Layout entry 1's `PreviousViewData` is sized to a single struct (not
unbounded). The existing typed `uniform_buffer::<PreviousViewData>(true)`
declaration is correct for the single-per-camera shape and doesn't
need the L7c loosening treatment.
Non-multiview path verified bit-identical: `cargo check --workspace`
clean; lib tests bevy_render 12/12, bevy_camera 43/43, bevy_pbr 2/2,
bevy_core_pipeline 0/0; smoke on `motion_blur --features
bevy_ci_testing` renders the red car on the road with motion-blurred
trees + balls + blue sky background. The `MotionBlur` component
auto-requires `MotionVectorPrepass`, so the background motion vectors
pipeline ran on the sky pixels and TAA/motion-blur consumed the
resulting motion-vector attachment. The multiview branch is unexercised on
macOS Metal but the binding / layout / shader-def shapes line up with
L6 + the other L7c conversions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the L6 mesh-path / L7c skybox + post-process shape but in
`bevy_pbr`: the two GPU clustering shaders (`cluster_z_slice.wgsl`
compute and `cluster_raster.wgsl` graphics) switch their `var<uniform>
view: View` binding to a shader-def-sized `var<uniform> view_array:
array<View, MAX_VIEW_COUNT>` with a 1-element fallback when
`MAX_VIEW_COUNT` is undefined, plus a `var<private> current_view_index`
and a `view()` helper that indexes the array. Per-site `view.X` reads
become `view().X` (5 sites in z-slice, 8 in raster; the local `let view
= view_from_clip * clip` shadow in `clip_to_view` is preserved). Host
side, both pipeline layouts switch binding 6 to
`uniform_buffer_sized(true, None)` so the WGSL fallback `array<View,
1>` and the multiview `array<View, MAX_VIEW_COUNT>` both bind cleanly
against the existing per-camera `DynamicArrayUniformBuffer<ViewUniform>`
slot (the dynamic offset already in place keeps selecting the
per-camera array slot in the packed buffer).
`ClusteringRasterPipelineKey` gains `multiview_view_count: u32` and a
new `ClusteringZSlicingPipelineKey` replaces the unit key on the
z-slicing pipeline. Both specialize impls push a `MAX_VIEW_COUNT`
shader def when the count is `>1`. `prepare_clustering_pipelines` now
reads `Option<&ExtractedMultiview>` from the view query and threads the
subview count through to every clustering specialization.
Clustering reads `view()` at the default `current_view_index = 0` —
i.e. eye 0's head pose — unconditionally. This is deliberate: the
GPU clustering output is a single set of storage buffers per camera
shared across all eyes (`ViewGpuClusteringBuffers` +
`ViewClusteringBindGroups` are inserted once per `ExtractedView`), and
threading `@builtin(view_index)` into the fragment would diverge
cluster assignments across eyes for shared output buffers. Eye-1 of a
multiview camera therefore consumes a cluster grid built from eye-0's
view-matrix; the resulting eye-1 culling is slightly conservative for
objects near eye-1's frustum edges. Same shape as the L7b-write C1
SSAO compromise; per-eye clustering would require splitting the
clustering output buffers per eye, which is future work paired with
the prepass / L7d multiview-mask refactor.
`ClusteringAllocationPipeline` has no view binding (its layout binds
only the cluster offsets, lights, metadata, and scratchpad buffers)
and is untouched. The MULTIVIEW shader def is not pushed: it's only
needed to gate `@builtin(view_index)` plumbing in the WGSL, and
clustering doesn't thread view_index.
Non-multiview cameras get `multiview_view_count = 1` → no MAX_VIEW_COUNT
def → `array<View, 1>` fallback → bit-identical behavior to the
pre-conversion single-`View` binding.
Smoke: `3d_scene --features bevy_ci_testing` confirms GPU clustering
runs ("GPU clustering is supported on this device.") and the default
single-point-light scene renders correctly with the cube shadowed by
the clustered point light. `lighting --features bevy_ci_testing`
exercises the count + populate raster passes plus z-slicing across
multiple point and spot lights with colored floor projections; matches
the standard `lighting` example baseline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First sub-changes of the L7b-write C2 prepass refactor (A + B + C from
.scratch/session15_l7d_planning.md). No-op-shaped on non-multiview
cameras; new API is unused this commit; session 17 wires it into the
prepass / deferred render-graph nodes for per-eye dispatch.
A. Grow prepass + view-depth textures to per-eye layer count.
`prepare_prepass_textures` and `prepare_core_3d_depth_textures` now read
`Option<&ExtractedMultiview>` and derive `view_count = m.subviews.len()`
(1 for non-multiview cameras, matching the SSAO C1 shape from 80f52dd).
Each `TextureDescriptor` literal sets
`size.depth_or_array_layers = view_count`:
- prepass: depth_1, depth_2, normal, motion_vectors, deferred_1,
deferred_2, deferred_lighting_pass_id (7 sites).
- depth: view_depth_texture (1 site).
`ViewPrepassTextures.size` flows through to the end-of-prepass-node
depth copy; both source (view depth) and dest (prepass depth) now share
the same per-eye layer count, so the copy is correct under multiview. On
non-multiview cameras every descriptor stays single-layer →
bit-identical to the pre-C2 shape.
B. Disambiguate `TextureCache` keys with `view_count`.
Each of the 7 prepass HashMap keys and the 1 depth HashMap key gain
`view_count` so a multiview camera and a non-multiview camera sharing
the same render target don't collide on a cached texture of the wrong
layer count. Same shape as the existing `(camera.target, msaa)` key on
the depth path.
C. `ColorAttachment` / `DepthAttachment` per-layer view exposure.
New `ColorAttachment::get_attachment_for_layer(layer)` and
`get_unsampled_attachment_for_layer(layer)` synthesize per-layer `D2`
`TextureView`s of the underlying (possibly multi-layer) texture and
resolve target, lazily cached in `Arc<OnceLock<Vec<TextureView>>>`
populated all-at-once on first per-layer access. Shared across
`ColorAttachment` clones via `Arc`, matching the existing
`is_first_call: Arc<AtomicBool>` cross-clone-sharing pattern.
`DepthAttachment` gains an optional
`multi_layer_texture: Option<Texture>` field and a new
`DepthAttachment::new_multi_layer(texture, view, clear)` constructor
that stores the underlying `Texture` handle so the new
`get_attachment_for_layer(layer, store)` can build per-layer views. The
existing `new(view, clear)` constructor is unchanged (leaves the field
as `None`); session 17 will update `ViewDepthTexture::new` to thread the
texture handle through. `get_attachment_for_layer` panics if called on a
`new`-constructed attachment, since the underlying texture handle isn't
available; the panic message names the `new_multi_layer` constructor.
Both per-layer accessors preserve the existing first-call clear-vs-load
semantics: ColorAttachment uses `Ordering::SeqCst` like its sampled
sibling; DepthAttachment uses `Ordering::Relaxed` +
`clear_value.unwrap()` like its sibling (`is_first_call` is still
initialized from `clear_value.is_some()` so the unwrap is safe). For
`view_count = 1` the per-layer-0 view is bit-identical to
`default_view`.
Notable shape decisions:
- D2 (per-eye dispatch) over D1 (broadcast / multiview_mask). Session-15
plan locked this in; per-eye dispatch avoids exposing class-(b) atomic
broadcast hazards in oit_draw / cluster raster fragment paths under L7d.
Documented as a deviation from awtterpip's PR bevyengine#16059 design.
- The per-layer view cache lives behind
`Arc<OnceLock<Vec<TextureView>>>` rather than per-call view creation so
the returned `&TextureView` borrows from the attachment for the lifetime
of the render-pass descriptor. ColorAttachment is re-constructed each
frame in `prepare_prepass_textures` so the OnceLock is per-frame, with
no staleness risk if the underlying texture is recreated by a camera
resize.
Diff budget: ~180 net lines across 2 files (session-15 plan estimated
~80). Overhead is in `texture_attachment.rs` (~155 vs planned ~50): both
sampled + unsampled ColorAttachment variants, the shared
`build_per_layer_d2_views` helper, and per-method docstrings.
Verification:
- `cargo check --workspace` clean.
- Lib tests: bevy_render 12/12, bevy_camera 43/43, bevy_pbr 2/2.
- Screenshot smokes vs baseline:
  - `3d_scene --features bevy_ci_testing`: byte-identical to
  /tmp/post-rebase-3d_scene.png (697355 bytes).
  - `ssao --features bevy_ci_testing`: visually correct (AO shading on
  sphere/cube/room corners present); SSAO F1 multilayer-gate stays
  `false` because prepass textures are still single-layer on this
  non-multiview camera (view_count = 1).
  - `deferred_rendering --features bevy_ci_testing`: renders correctly
  with Deferred mode active (3 colored spheres, helmet bust, helmet with
  flame card all shaded). No baseline existed pre-C2; saved at
  /tmp/c2-abc-deferred.png for future comparison.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
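Sub-change B in the commit above (adding `view_count` to the cache key) can be illustrated with a plain `HashMap`. This is a sketch under assumptions: `DepthTextureKey`, `TextureDesc`, and `get_or_create` are hypothetical names, and the integer `target` field stands in for the real render-target identifier.

```rust
use std::collections::HashMap;

/// Hypothetical cache key: the existing (target, msaa) pair plus the new
/// `view_count` component.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct DepthTextureKey {
    pub target: u32,
    pub msaa_samples: u32,
    pub view_count: u32,
}

/// Minimal stand-in for a texture descriptor.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TextureDesc {
    pub width: u32,
    pub height: u32,
    pub depth_or_array_layers: u32,
    pub sample_count: u32,
}

/// Looks up or creates the descriptor for a key. Two cameras that share
/// a target but differ in view count get separate entries.
pub fn get_or_create(
    cache: &mut HashMap<DepthTextureKey, TextureDesc>,
    key: DepthTextureKey,
    width: u32,
    height: u32,
) -> &TextureDesc {
    cache.entry(key).or_insert_with(|| TextureDesc {
        width,
        height,
        depth_or_array_layers: key.view_count,
        sample_count: key.msaa_samples,
    })
}
```

Without `view_count` in the key, the second lookup in a mixed setup would return the first camera's texture with the wrong layer count.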
Wraps the render-pass body in `run_prepass_system` and
`run_deferred_prepass_system` in `for eye in 0..view_count`, swapping
every color + depth `get_attachment()` for the per-layer
`get_attachment_for_layer(eye)` introduced in `c315f71a3`. `view_count`
is derived from the new `Option<&ExtractedMultiview>` query field,
mirroring SSAO C1's shape from `80f52dd9d`; non-multiview cameras have
`view_count = 1` and exercise the per-layer path against the existing
single-layer prepass + depth textures.
`ViewDepthTexture::new` is upgraded to call
`DepthAttachment::new_multi_layer(...)` so per-eye depth attachment
access works at all. Without this the per-eye dispatch would panic at
first `get_attachment_for_layer` call per session 16's panic-on-misuse
design. `ViewDepthTexture` also gains a thin `get_attachment_for_layer`
delegate so the attachment field can stay private.
End-of-pass depth-copy blocks at the end of both nodes stay outside the
eye loop and keep their single `copy_texture_to_texture` call: post-C2
sub-A both source and destination depth textures carry
`depth_or_array_layers = view_count`, so one call copies every layer.
The webgl `clear_texture` block in the deferred node also moves outside
the loop -- it operates on the whole texture via the default
`ImageSubresourceRange`, so one call suffices.
This is C2 sub-changes D2 + F from session-15's L7b-write plan. Together
with `c315f71a3` (texture growth + per-layer attachment API), this
completes the C2 prepass refactor: prepass + deferred render-graph nodes
now dispatch per layer of the multiview depth + prepass texture arrays.
D2 (per-eye dispatch) is the design choice over D1 (broadcast) recorded
in session 15; it's a deviation from awtterpip's PR bevyengine#16059
design that needs to surface in the eventual reference-PR description.
Smoke (all on macOS Metal, `--features bevy_ci_testing`):
* `3d_scene` -- byte-identical (697355 bytes) to
`/tmp/c2-abc-3d_scene.png` (forward-only, no DepthPrepass; strongest
read-side no-op witness via the `ViewDepthTexture::new` upgrade).
* `deferred_rendering` -- byte-identical (3628497 bytes) to
`/tmp/c2-abc-deferred.png` (deferred node per-eye dispatch no-op-shaped
on view_count=1).
* `ssao` -- visually correct, depth+normal prepass active. Camera
animation drifts so byte-match isn't expected.
* `motion_blur` -- visually correct, motion-vectors prepass active.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
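The control flow of the commit above (one render pass per eye, then a single whole-texture depth copy) might look like this when reduced to a command list. The `Command` enum and `record_prepass` are hypothetical, and the sketch assumes the depth copy always runs, which the real nodes may gate on other conditions.

```rust
/// Hypothetical recorded command, standing in for real encoder calls.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Command {
    /// A render pass targeting one array layer of the attachments.
    BeginPass { layer: u32 },
    /// One copy covering every layer of the depth texture.
    CopyDepth { layer_count: u32 },
}

/// Records one pass per eye, followed by a single copy outside the loop.
pub fn record_prepass(view_count: u32) -> Vec<Command> {
    let mut commands = Vec::new();
    for eye in 0..view_count {
        // Each eye renders into its own layer of the attachments.
        commands.push(Command::BeginPass { layer: eye });
    }
    // Source and destination share the same layer count, so one copy
    // is enough for all eyes.
    commands.push(Command::CopyDepth { layer_count: view_count });
    commands
}
```

For `view_count = 1` this degenerates to one pass on layer 0 plus one copy, which is why the non-multiview path is expected to be unchanged.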
`ColorAttachment` and `DepthAttachment` previously shared a single
`is_first_call: Arc<AtomicBool>` across every per-layer attachment
access in a frame, so under per-eye dispatch (session 18, `ac409d6a3`)
only the first eye's call to `get_attachment_for_layer` cleared its
layer; subsequent eyes saw `first_call = false` and emitted
`LoadOp::Load` on a layer that had never been written this frame.
Result: layers 1..N depth-test against undefined memory, and the
per-layer color textures (normal, motion_vectors, deferred,
deferred_lighting_pass_id) accumulate frame-over-frame leftover data
instead of clearing.
The bug landed silently in `c315f71a3` (session 16) -- which introduced
the per-layer accessors -- but only became reachable once session 18's
per-eye dispatch made `get_attachment_for_layer` a real per-layer
caller. macOS Metal has no wgpu multiview support so smoke can't
reproduce it; this hardening is forward-correctness for the per-eye
dispatch path on any wgpu backend that actually exercises
`view_count > 1`.
Fix: lazily-populated
`per_layer_first_call: Arc<OnceLock<Vec<AtomicBool>>>` mirroring the
`per_layer_views` cache shape, one slot per layer of the underlying
texture. Each `get_attachment_for_layer(layer)` flips its own slot, and
also flips the global `is_first_call` to false so legacy
`get_attachment` callers (e.g., main opaque / transparent pass running
after the prepass per-eye loop on `ViewDepthTexture`) still load the
per-layer-cleared depth instead of re-clearing it. `mark_as_cleared` and
`prepare_for_new_frame` also flip every initialized per-layer slot.
Single-layer textures (`view_count = 1`) initialize a one-slot vec on
first per-layer access and stay bit-identical to the old behavior: layer
0's slot starts true (matching the old global init), gets flipped false
on first call (matching old fetch_and), and stays false (matching old
single-shared-AtomicBool).
Smoke (macOS Metal, `--features bevy_ci_testing`):
* `3d_scene` -- byte-identical (697355 bytes) to
`/tmp/post-rebase-3d_scene.png` -- confirms read-side no-op on the depth
attachment.
* `deferred_rendering` -- byte-identical (3628497 bytes) to
`/tmp/c2-abc-deferred.png` -- confirms no-op on the deferred node's 4
per-layer color attachments + 1 per-layer depth attachment under
`view_count = 1`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
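A standard-library-only sketch of the fix described in the commit above. `LayeredAttachment` and its methods are hypothetical, clear values are left out, and every per-layer slot is assumed to start as "first call"; the real attachment types carry texture views and clear colors as well.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, OnceLock};

/// Simplified load operation for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoadOp {
    Clear,
    Load,
}

/// Hypothetical attachment with one first-call flag per layer, shared
/// across clones through `Arc`.
#[derive(Clone)]
pub struct LayeredAttachment {
    layer_count: usize,
    is_first_call: Arc<AtomicBool>,
    per_layer_first_call: Arc<OnceLock<Vec<AtomicBool>>>,
}

impl LayeredAttachment {
    pub fn new(layer_count: usize) -> Self {
        Self {
            layer_count,
            is_first_call: Arc::new(AtomicBool::new(true)),
            per_layer_first_call: Arc::new(OnceLock::new()),
        }
    }

    /// Per-layer accessor: clears on the first access to each layer and
    /// also marks the global flag so legacy callers load afterwards.
    pub fn load_op_for_layer(&self, layer: usize) -> LoadOp {
        let slots = self.per_layer_first_call.get_or_init(|| {
            (0..self.layer_count).map(|_| AtomicBool::new(true)).collect()
        });
        let first = slots[layer].fetch_and(false, Ordering::SeqCst);
        self.is_first_call.store(false, Ordering::SeqCst);
        if first { LoadOp::Clear } else { LoadOp::Load }
    }

    /// Legacy whole-attachment accessor using the single shared flag.
    pub fn load_op(&self) -> LoadOp {
        if self.is_first_call.fetch_and(false, Ordering::SeqCst) {
            LoadOp::Clear
        } else {
            LoadOp::Load
        }
    }
}
```

With a single shared flag, the second eye's first access would have returned `Load`; with one slot per layer it returns `Clear`, and a later call through the legacy accessor still returns `Load`.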
Audit follow-up A from `.scratch/session17_audit_planning.md` §4.1: applies pattern (a) (binding-layout swap to D2Array under MULTIVIEW) to the first `view_depth_texture.view()` consumer pipeline. Same L7c shape as skybox / tonemapping / OIT-resolve / bg-motion-vectors, adapted to the existing MULTISAMPLED + DENSITY_TEXTURE bind-group-layout-key surface.

`volumetric_fog.wgsl`:
- `depth_texture` binding declaration nests `#ifdef MULTIVIEW` inside the non-multisampled branch: `texture_depth_2d_array` under `MULTIVIEW && !MULTISAMPLED`, `texture_depth_2d` otherwise. Same shape as the prepass-texture bindings in `mesh_view_bindings.wgsl`. Multisampled branch stays single-layer regardless of MULTIVIEW (see the "MSAA + MULTIVIEW carve-out" note below).
- Fragment signature gains `@builtin(view_index) view_index: i32` as a direct parameter under `#ifdef MULTIVIEW` (mirrors `blit.wgsl::fs_main` and other L7c conversions). Body assigns `current_view_index = view_index;` at top under the same gate so the existing `view()` helper (already used at lines 131/200/341 per the L6 view-binding conversion in `04ed678bf`) reads the correct eye's `View`.
- Single `textureLoad` read site adds the `view_index` layer argument under `MULTIVIEW && !MULTISAMPLED`. Both multisampled branches and the non-multiview branch keep the existing 3-arg `textureLoad(_, frag_xy, 0)` shape.

`volumetric_fog/render.rs`:
- `VolumetricFogBindGroupLayoutKey` grows from 2 bits to 3 with a new `MULTIVIEW = 0x4` flag. The `VOLUMETRIC_FOG_BIND_GROUP_LAYOUT_COUNT` const (= `all().bits() + 1`) automatically grows to 8 layouts; the two MSAA+MULTIVIEW combinations (with and without DENSITY_TEXTURE) are unreachable per the layout-key construction rules but cost nothing.
- Layout-build site (`init_volumetric_fog_pipeline`) gains a third branch: `texture_depth_2d_multisampled()` under MULTISAMPLED, then `texture_2d_array(TextureSampleType::Depth)` under MULTIVIEW, else `texture_depth_2d()`. Matches the helper shape used in `prepass_bindings.rs:32` and `mesh_view_bindings.rs:301`.
- `VolumetricFogPipelineKey` gains a `multiview_view_count: u32` field (default 1 for non-multiview). The `>1` value gates both the layout MULTIVIEW bit and the MULTIVIEW + MAX_VIEW_COUNT shader-def push in `specialize`.
- Render-graph node (`volumetric_fog`) ViewQuery gains `Option<&ExtractedMultiview>` and the bind-group-layout-key construction sets MULTIVIEW based on `view_count > 1 && !is_msaa` identically to specialize.
- `prepare_volumetric_fog_pipelines` query gains `Option<&ExtractedMultiview>` and threads `subviews.len()` (or 1 fallback) into the pipeline key. Source matches the L6 mesh + prepass + L7c convention.
- `bind_group_layout_description` adds the "multiview" name to the debug-label iter.

`descriptor.multiview_mask` stays `None`, the same L6/L7a/L7b/L7c deferral. The wgpu render-pass enablement is L7d's job.

MSAA + MULTIVIEW carve-out: WGSL has no `texture_depth_multisampled_2d_array`, so the host gates the MULTIVIEW shader def push and the layout-key MULTIVIEW bit on `!MULTISAMPLED`. Under MSAA + multiview the depth binding stays single-layer (no shader def, no layout switch), and the post-C2-A multi-layer multisampled depth texture binding will fail wgpu validation. This is the same limitation already documented in `mesh_view_bindings.wgsl:99-106` for the prepass-texture bindings: the MSAA + multiview combination is left unsupported at the texture-binding level (rare in practice; VR doesn't pair with MSAA). No existing camera in tree triggers this combo.

Non-multiview cameras get `multiview_view_count = 1` → no MULTIVIEW push, no layout-key bit → fallback path → bit-identical behavior to the pre-conversion single-layer binding.

Smoke verification:
- `volumetric_fog --features bevy_ci_testing` byte-identical (3709234 bytes) pre/post (both `/tmp/session19-baseline-volumetric_fog.png` and `/tmp/session19-post-volumetric_fog.png`). Strongest possible no-op witness for the converted pipeline on the existing non-multiview path.
- `3d_scene --features bevy_ci_testing` byte-identical (697355 bytes), matches the lineage from sessions 16-18 (`/tmp/post-rebase-3d_scene.png`, `/tmp/c2-abc-3d_scene.png`, `/tmp/session18-3d_scene.png`, `/tmp/session18-f1-3d_scene.png`). Read-side witness: volumetric fog isn't enabled on the 3d_scene camera so the fog pipeline never compiles, but this confirms the depth-prepass plumbing stays clean.

The multiview branch is unexercised on macOS Metal, but the binding / layout / shader-def shapes line up with `mesh_view_bindings.wgsl`'s established prepass-texture pattern.

Out of scope:
- Atmosphere render_sky L7-shape conversion (next session per `.scratch/session17_audit_planning.md` §5).
- DoF L7-shape conversion (session 21+ per same plan).
- HZB depth pyramid stays deferred (pattern c per audit §4.3): a compute pipeline can't take `@builtin(view_index)` and the output is single-layer; per-eye HZB is an L8 layer.
- MSAA + MULTIVIEW depth binding (open question per audit §6), resolved here by matching the established `mesh_view_bindings` carve-out. A real fix requires a per-layer D2 view + per-eye dispatch (pattern b) or a higher-level workaround; not in scope for the pattern (a) per-consumer conversion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
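The layout-key gating for volumetric fog can be sketched in plain Rust. This is an illustrative sketch only: the bit values for `MULTISAMPLED` and `DENSITY_TEXTURE`, the `DepthBinding` enum, and both function names are assumptions made for the example (only `MULTIVIEW = 0x4` and the `view_count > 1 && !is_msaa` predicate come from the commit message).

```rust
// Assumed bit assignments; only MULTIVIEW = 0x4 is stated in the commit message.
const MULTISAMPLED: u8 = 0x1;
const DENSITY_TEXTURE: u8 = 0x2;
const MULTIVIEW: u8 = 0x4;

/// Hypothetical stand-in for the three depth-binding shapes named above.
#[derive(Debug, PartialEq)]
enum DepthBinding {
    Multisampled, // texture_depth_2d_multisampled()
    Array,        // texture_2d_array(TextureSampleType::Depth)
    Single,       // texture_depth_2d()
}

/// Builds the layout key; MULTIVIEW is only set when MSAA is off.
fn layout_key(is_msaa: bool, has_density_texture: bool, view_count: u32) -> u8 {
    let mut key = 0;
    if is_msaa {
        key |= MULTISAMPLED;
    }
    if has_density_texture {
        key |= DENSITY_TEXTURE;
    }
    if view_count > 1 && !is_msaa {
        key |= MULTIVIEW;
    }
    key
}

/// Picks the depth binding in the same branch order as the layout-build site.
fn depth_binding(key: u8) -> DepthBinding {
    if (key & MULTISAMPLED) != 0 {
        DepthBinding::Multisampled
    } else if (key & MULTIVIEW) != 0 {
        DepthBinding::Array
    } else {
        DepthBinding::Single
    }
}
```

Under this construction only six of the eight key values can occur, because no key ever carries both `MULTISAMPLED` and `MULTIVIEW`.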
…MULTIVIEW The view depth texture under MULTIVIEW grew to a per-eye layered texture in session 16 (C2 sub-A). On a multiview camera the existing `texture_depth_2d` binding at `@group(0) @binding(13)` would fail wgpu validation against the D2Array view; render_sky's fragment also has no way to address its own eye's depth. Forward-enablement audit follow-up per `.scratch/session17_audit_planning.md` §4.2. WGSL (`render_sky.wgsl`): binding 13 nests `#ifdef MULTIVIEW` inside the non-multisampled branch — `texture_depth_2d_array` under `MULTIVIEW && !MULTISAMPLED`, `texture_depth_2d` otherwise; the multisampled branch stays single-layer. Fragment entry gains a separate `@builtin(view_index) view_index: i32` parameter alongside the existing `FullscreenVertexOutput`-typed `in` (rather than a struct rewrite — `in.position`/`in.uv` references stay untouched). The single `textureLoad` read site picks the 4-arg form with `view_index` under `MULTIVIEW && !MULTISAMPLED`; the other three branches keep the 3-arg form. Same nest shape as `volumetric_fog.wgsl` (commit dc391d3) and the `mesh_view_bindings.wgsl:99-106` prepass-texture pattern. Host (`resources.rs`): `RenderSkyBindGroupLayouts` grows a third `render_sky_multiview` field (binding 13 = `texture_2d_array` of depth). `RenderSkyPipelineKey` gains `multiview_view_count: u32`. Specialize pushes `MULTIVIEW` + `MAX_VIEW_COUNT` and selects the multiview layout when `view_count > 1 && msaa_samples == 1`. Both `queue_render_sky_pipelines` (pipeline key construction) and `prepare_atmosphere_bind_groups` (bind-group layout pick) gain `Option<&ExtractedMultiview>` and gate on the same predicate. MSAA + MULTIVIEW carve-out: WGSL has no `texture_depth_multisampled_2d_array`, so the MULTIVIEW shader def + multiview-layout pick are gated on `!MULTISAMPLED`. Under MSAA + multiview the depth binding stays single-layer, identical to today; the post-C2-A multi-layer multisampled depth texture would fail wgpu validation against it. 
No camera in tree triggers this combo. Same carve-out as `mesh_view_bindings.wgsl:99-106` and session 19 volumetric fog. Atmosphere's view binding (`@group(0) @binding(3) var<uniform> view: View` in `bindings.wgsl`) and the camera-shared sky LUTs (`sky_view_lut`, `aerial_view_lut`) are NOT touched by this commit. Under MULTIVIEW the depth read is now eye-correct so sky-by-foreground-geometry occlusion matches each eye's depth buffer, but the sky LUTs and ray-direction computations still read element-0 view data — eye-correct view rays through the sky LUTs are a separate L8 layer (per `.scratch/session17_audit_planning.md` §6 second bullet). This commit fixes the binding-layout-vs-texture-view shape mismatch; it does not make the rendered sky fully per-eye-correct. Smoke: - `atmosphere --features bevy_ci_testing`: byte-identical (3542199 bytes, md5 23c5c7f864035a6d4a58b47aa1acb24c) pre/post conversion on a warm shader cache. Newly added to the byte-deterministic example registry. - `3d_scene --features bevy_ci_testing`: byte-identical (697355 bytes) — depth-prepass-plumbing witness, atmosphere not enabled so render_sky pipeline doesn't compile. - `volumetric_fog --features bevy_ci_testing`: byte-identical (3709234 bytes) — fog imports atmosphere transmittance LUT shader; confirms the atmosphere shared-module changes don't perturb fog. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WGSL: the `depth_texture` binding (single-input + dual-input bind groups share the same `@group(0) @binding(1)`) gains a `texture_depth_2d_array` variant under `MULTIVIEW && !MULTISAMPLED`, nested inside the existing non-multisampled branch. Each of the four fragment entries (`gaussian_horizontal`, `gaussian_vertical`, `bokeh_pass_a`, `bokeh_pass_b`) gains a separate `@builtin(view_index) view_index: i32` parameter alongside the existing `in: FullscreenVertexOutput`, and assigns `current_view_index = view_index;` at the top of its body under MULTIVIEW. The depth read lives in the `calculate_circle_of_confusion` helper called from all four entries; under `MULTIVIEW && !MULTISAMPLED` its single `textureLoad` adds `current_view_index` as the layer argument. The same `current_view_index` global drives the existing `view()` helper that `depth_ndc_to_view_z` reads via `view_transformations.wgsl`, so the per-eye projection matrix is also applied correctly. WGSL has no `texture_depth_multisampled_2d_array`, so the MSAA + multiview combination keeps the single-layer binding — same carve-out as the prepass-texture bindings in `mesh_view_bindings.wgsl:99-106`. Host: `DepthOfFieldPipelineKey` gains `multiview_view_count: u32`. `prepare_depth_of_field_view_bind_group_layouts` and `prepare_depth_of_field_pipelines` both gain `Option<&ExtractedMultiview>`; the layout-prepare hoists a shared `depth_binding` expression that picks `texture_depth_2d_multisampled()` under MSAA, `texture_2d_array(TextureSampleType::Depth)` under non-MSAA multiview, or `texture_depth_2d()` otherwise, and reuses it across both the single-input and dual-input layouts. The pipeline-prepare derives `multiview_view_count` once per view and threads it into all four `DepthOfFieldPipelineKey` constructions (gaussian horizontal/vertical, bokeh pass 0/1). The specialize impl pushes `MULTIVIEW` plus `MAX_VIEW_COUNT` shader defs when `multiview_view_count > 1 && !multisample`, matching the layout-side carve-out. 
Out of scope (audit §6 third bullet): the dual-input bind group's `auxiliary_dof_texture.default_view` is single-layer today and would need to grow to multi-layer for true per-eye DoF. This commit fixes the depth-buffer-read shape mismatch under multiview; full per-eye DoF needs the auxiliary texture grown too. Also out of scope: the view binding at `@group(0) @binding(0)` still consumes the `uniform_buffer::<ViewUniform>(true)` layout entry (the WGSL imports the multi-view-aware `mesh_view_bindings::view` helper, so layout/shader match under no-MULTIVIEW; the L7c view-binding-side conversion is a separate session). Smoke (warm shader cache, frame 100, `bevy_ci_testing`): * `depth_of_field` — **byte-identical** (3629866 bytes) pre/post conversion on a single run. New entry in the byte-deterministic example registry. * `3d_scene` — byte-identical (697355 bytes) to the session 16-20 lineage. Depth-prepass-plumbing witness. * `volumetric_fog` — byte-identical (3709234 bytes) to the session 19-20 baseline. Cross-pipeline regression witness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The per-layer first-call init in `ColorAttachment::first_call_for_layer` and `DepthAttachment::get_attachment_for_layer` (introduced by the per-layer attachment accessor API in c315f71 / e186c03) unconditionally seeded each slot with `true` (or `clear_value.is_some()` for depth) on first per-layer access. That matches the global `is_first_call` semantics for a consumer that runs FIRST in the frame: the prepass + deferred per-eye loops added by ac409d6, which were the only consumers in tree until now.

A consumer that runs AFTER another pass has already touched the same attachment via the legacy `get_attachment` API (e.g. a per-eye main-pass consumer running after `main_opaque_pass_3d` has flipped the global latch to false) used to see `true` on its first per-layer access and incorrectly emit `LoadOp::Clear`, wiping the earlier pass's work. Surfaced by the in-progress transmission per-eye dispatch (L7b-write C3) on macOS Metal even at view_count=1: a 21% size drop in the byte-deterministic transmission screenshot baseline (3162953 bytes → 2485935 bytes) made the regression visible without needing multiview hardware.

Fix: read `is_first_call.fetch_and(...)` first and seed the per-layer init from its return value. Each per-layer slot then matches what the global latch would have returned for that consumer at init time, and the per-slot `fetch_and(false)` subsequently behaves as a per-layer extension of the same latch.

No call-site changes. Existing consumers (prepass + deferred per-eye loops at view_count=1) byte-identical on `3d_scene` (697355 bytes) and `deferred_rendering` (3628497 bytes); pre-C3 transmission path byte-identical on the transmission example (3162953 bytes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
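The latch interaction described above can be modelled in a few lines of Rust. This is a sketch under stated assumptions, not the code in the commit: `FirstCallLatch`, its fields, and the choice to seed every layer slot at once on the first per-layer access are all hypothetical, and the real accessors return attachment descriptors rather than a bare `bool` (`true` here stands for `LoadOp::Clear`, `false` for `LoadOp::Load`).

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical model of an attachment's clear-once bookkeeping.
struct FirstCallLatch {
    /// Global latch: true until any consumer has touched the attachment.
    is_first_call: AtomicBool,
    /// Per-layer slots, created lazily on the first per-layer access.
    per_layer: Mutex<Option<Vec<bool>>>,
    layer_count: usize,
}

impl FirstCallLatch {
    fn new(layer_count: usize) -> Self {
        Self {
            is_first_call: AtomicBool::new(true),
            per_layer: Mutex::new(None),
            layer_count,
        }
    }

    /// Legacy whole-attachment access: true (clear) only for the first caller.
    fn first_call(&self) -> bool {
        self.is_first_call.fetch_and(false, Ordering::SeqCst)
    }

    /// Per-layer access: slots are seeded from the global latch's previous
    /// value instead of an unconditional `true` (assumed seeding strategy).
    fn first_call_for_layer(&self, layer: usize) -> bool {
        let mut guard = self.per_layer.lock().unwrap();
        let slots = guard.get_or_insert_with(|| {
            let seed = self.is_first_call.fetch_and(false, Ordering::SeqCst);
            vec![seed; self.layer_count]
        });
        // Read the slot and clear it, so each layer clears at most once.
        std::mem::replace(&mut slots[layer], false)
    }
}
```

In this model a per-layer consumer that runs first still clears every layer once, while one that runs after a legacy access loads on every layer, which is the ordering the fix targets.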
…TIVIEW L7b-write C3: grow the `view_transmission_texture` to one layer per eye and dispatch `main_transmissive_pass_3d` per eye into per-layer color + depth attachments. Final L7b-write consumer per the `.scratch/session17_audit_planning.md` §7 follow-up plan; the prepass and deferred render-graph nodes already landed C2 in ac409d6 (session 18) and the read-side (binding 24 in `mesh_view_bindings`) already nests `texture_2d_array<f32>` under `MULTIVIEW` (b5066e02f, session 14). The D2Array view that line 905 of `mesh_view_bindings.rs` creates over the transmission texture used to be a one-layer view of a one-layer texture; post-C3 it is the multi-layer view the WGSL declaration always expected. Part 0 — `ViewTarget::get_color_attachment_for_layer` / `get_unsampled_color_attachment_for_layer`. Thin delegates picking `main_textures.a` vs `main_textures.b` via the existing post-process swap atomic and forwarding to the per-layer accessors that ColorAttachment grew in c315f71 (session 16). Mirrors session 18's `ViewDepthTexture::get_attachment_for_layer` delegate. The `main_textures.{a,b}` textures are already multi-layer under multiview per the C2 sub-A growth in prepare_view_targets, so the delegate just exposes that to per-eye consumers. Part A — `prepare_core_3d_transmission_textures` allocates the transmission texture with `depth_or_array_layers = view_count`, matching the multi-layer main texture so the copy_texture_to_texture in the node can pass `depth_or_array_layers = view_count` in one call. Cache key extended from `camera.target.clone()` to `(camera.target.clone(), view_count)` per the session-16 C2-B precedent (prevents multiview/non-multiview cameras sharing a render target from colliding on a cached texture of the wrong layer count). 
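The cache-key change in Part A can be illustrated with a small stand-alone sketch. Every type here (`TargetId`, `CachedTexture`, `TransmissionTextureCache`) is a hypothetical stand-in for illustration; the only detail taken from the commit message is that the key grows from the render target alone to the pair `(target, view_count)`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `camera.target`; it only needs to be hashable.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct TargetId(u32);

/// Hypothetical stand-in for a cached GPU texture.
#[derive(Debug)]
struct CachedTexture {
    depth_or_array_layers: u32,
}

#[derive(Default)]
struct TransmissionTextureCache {
    textures: HashMap<(TargetId, u32), CachedTexture>,
    allocations: usize,
}

impl TransmissionTextureCache {
    /// Returns the layer count of the texture cached for this target and
    /// view count, allocating a new entry when the pair has not been seen.
    fn layers_for(&mut self, target: &TargetId, view_count: u32) -> u32 {
        let allocations = &mut self.allocations;
        self.textures
            .entry((target.clone(), view_count))
            .or_insert_with(|| {
                *allocations += 1;
                CachedTexture { depth_or_array_layers: view_count }
            })
            .depth_or_array_layers
    }
}
```

With the view count in the key, a stereo camera and a non-multiview camera that share one render target each get a texture with the layer count they expect.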
Part B — `main_transmissive_pass_3d` adds `Option<&'static ExtractedMultiview>` to its ViewQuery, computes `view_count` once, and wraps each `for range in split_range(...)` iteration's render pass in a `for eye in 0..view_count` loop. Each eye's descriptor uses `target.get_color_attachment_for_layer(eye)` + `depth.get_attachment_for_layer(eye, StoreOp::Store)`. The copy_texture_to_texture stays inside the range loop but outside the eye loop — one multi-layer copy per range step spans every eye's layer. Range<usize> is cloned cheaply for each eye's render_range call. The same restructure applies to the `steps == 0` branch. Non-multiview path: `view_count = 1`, copy_extents matches the pre-edit `physical_target_size.to_extents()`, and the eye loop runs once with eye=0 — byte-identical no-op (verified, see below). Deviates from PR bevyengine#16059's broadcast design (single render pass with `multiview_mask`) in favour of the per-eye dispatch shape session 15's L7d planning locked in for class-(b) consumers, consistent with ac409d6 (prepass + deferred). Surface in the eventual reference PR's "differences from PR bevyengine#16059" section. Out-of-scope follow-ups: - View binding layout still `uniform_buffer::<ViewUniform>(true)` singular at the binding site that transmission's PBR shader chain reads — under MULTIVIEW the WGSL `view_array: array<View, MAX_VIEW_COUNT>` mismatches the singular layout entry. Shared staging with the atmosphere/DoF R2 staging from sessions 20/21; separate session if pursued. - The non-MSAA + MULTIVIEW carve-out shape from `mesh_view_bindings .wgsl:99-106` doesn't fire here because the transmission texture itself is always `sample_count = 1` (the source main texture's MSAA is resolved on the copy_texture_to_texture's source side). Smoke verification on macOS Metal (no wgpu multiview, so this exercises the `view_count = 1` no-op path): - `transmission --features bevy_ci_testing` — byte-identical (3162953 bytes) pre/post C3. 
New entry in the byte-deterministic example registry. - `3d_scene --features bevy_ci_testing` — byte-identical (697355 bytes) to the session 16-21 lineage. - `volumetric_fog --features bevy_ci_testing` — byte-identical (3709234 bytes) to the session 19-21 baseline. This commit depends on F2 (d92e550): C3 is the first consumer of the per-layer color attachment accessor that runs AFTER another pass has touched the global is_first_call latch (main_opaque_pass_3d), so without F2's per-layer first-call seeding fix C3 would emit LoadOp::Clear on its first per-eye access and wipe the opaque main pass's work. F2 surfaced through the byte-deterministic smoke on the non-multiview path even at view_count=1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
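The loop nesting described in Part B (one multi-layer copy per range step, then one render pass per eye) can be sketched as a planner that records operations instead of issuing GPU commands. The `Op` enum, `split_into_steps`, and `plan_transmissive_pass` are hypothetical names for this illustration, and the sketch assumes the `steps == 0` branch performs no copy.

```rust
use std::ops::Range;

/// Hypothetical record of what the node would submit.
#[derive(Debug, PartialEq, Clone)]
enum Op {
    /// One copy covering `layers` array layers of the main texture.
    CopyMainToTransmission { layers: u32 },
    /// One render pass targeting a single eye's layer.
    RenderPass { eye: u32, range: Range<usize> },
}

/// Illustrative even split of `range` into at most `steps` chunks (steps > 0).
fn split_into_steps(range: Range<usize>, steps: usize) -> Vec<Range<usize>> {
    let len = range.end - range.start;
    let chunk = ((len + steps - 1) / steps).max(1);
    (0..steps)
        .map(|i| {
            let start = (range.start + i * chunk).min(range.end);
            start..(start + chunk).min(range.end)
        })
        .filter(|r| r.start < r.end)
        .collect()
}

fn plan_transmissive_pass(item_count: usize, steps: usize, view_count: u32) -> Vec<Op> {
    let mut ops = Vec::new();
    if steps == 0 {
        // Assumed: no copy in this branch, just one pass per eye.
        for eye in 0..view_count {
            ops.push(Op::RenderPass { eye, range: 0..item_count });
        }
        return ops;
    }
    for range in split_into_steps(0..item_count, steps) {
        // Copy sits inside the range loop but outside the eye loop.
        ops.push(Op::CopyMainToTransmission { layers: view_count });
        for eye in 0..view_count {
            ops.push(Op::RenderPass { eye, range: range.clone() });
        }
    }
    ops
}
```

At `view_count = 1` the plan degenerates to one copy and one pass per range step, matching the no-op claim for the non-multiview path.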
First L7d flip per the session 15 L7d enablement plan §4 step 5: the smallest-fragment-surface class-(a) pipeline switches from per-eye dispatch (session 18's `ac409d6a3` shape) to single-pass broadcast, establishing the L7d shape on the branch. Class (a) reaffirmation: the WGSL writes only the motion-vectors color attachment at @location(1) with depth GreaterEqual early-z, no shared storage, and reads view data via `view_array[current_view_index]` keyed off `@builtin(view_index)` — exactly the safe-broadcast shape the inventory described. Plan-vs-reality note: §4 step 5 framed L7d as a per-pipeline flip, but background motion vectors today is dispatched as a `draw(0..3)` call inside the SAME render pass that runs opaque + alpha-mask prepass items (pre-dates session 18's per-eye refactor — see `git show ac409d6~1`). `multiview_mask` is a pass-level property, so the flip requires extracting the draw into its own render pass. Done within `run_prepass_system` (Shape A1) to preserve session 18's per-eye dispatch for the surrounding prepass items. Host changes: - `background_motion_vectors.rs`: pipeline descriptor sets `multiview_mask = NonZeroU32::new((1 << view_count) - 1)` under `multiview_view_count > 1`. Mirrors the same predicate already used for the MULTIVIEW + MAX_VIEW_COUNT shader-def push. - `prepass/node.rs`: the bg motion vectors `if let` block moves OUT of the `for eye in 0..view_count` loop into a separate render pass after the loop ends. The broadcast pass uses multi-layer attachments via legacy `ColorAttachment::get_attachment()` (normals + motion vectors) and `ViewDepthTexture::get_attachment(StoreOp::Store)`; the pass-descriptor `multiview_mask` matches the pipeline descriptor (None when view_count==1). F2 interaction (session 22's `d92e55099`): legacy `get_attachment*` accessors after the per-eye loop read the global `is_first_call` latch, which the per-eye loop already flipped to false. 
Result: the broadcast pass loads the prior per-eye writes for normal + motion vectors + depth instead of re-clearing — the case F2 was designed to handle. First in-tree consumer to exercise the legacy-after-per-layer ordering for color attachments. WGSL: zero edits. `background_motion_vectors.wgsl` was already L7c-converted (declares `view_array: array<View, MAX_VIEW_COUNT>` under MULTIVIEW, sets `current_view_index = view_index` at the fragment top). Smoke verification: - `motion_blur --features bevy_ci_testing` — **byte-identical** (4492372 bytes, md5 `219b899f7f82a5f8d2895e884260f99d`) pre/post-edit. Primary witness: the actual modified pipeline. Despite §registry noting motion_blur as non-deterministic (camera animation drift), this specific pair matched. - `3d_scene --features bevy_ci_testing` — byte-identical (697355 bytes) to session 16-22 lineage. Prepass-node structural witness. - `deferred_rendering --features bevy_ci_testing` — byte-identical (3628497 bytes) to session 17-22 lineage. Deferred late-prepass witness. - `transmission --features bevy_ci_testing` — byte-identical (3162953 bytes) to session 22 baseline on the second run. First run drifted (one-time cold-shader-cache recompile per §registry caveat); converged on re-run. Cross-pipeline no-regression witness. Non-multiview behavior (view_count==1): the bg motion vectors draw now runs in its own pass with `multiview_mask: None` instead of sharing a pass with the prepass items. Adds one render-pass boundary on the non-multiview path; motion_blur smoke confirms no output drift. L7d flips remaining (per session 15 §1 inventory): PBR mesh forward, PBR prepass, deferred lighting, SSR, tonemapping, skybox, blit, FXAA, bloom-first, debug overlay. Pipelines that own their render pass (tonemapping/skybox/etc) can flip in place; pipelines that share a pass (PBR mesh forward, PBR prepass) need the same extraction-or-whole-pass-flip design call this commit faced. 
Cluster raster + OIT-flagged PBR stay per-eye-dispatched per the class (b) carve-out. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7d (`572af1539`) computed the all-eyes-set broadcast mask as `(1u32 << view_count) - 1` at two sites: `background_motion_vectors.rs:215` and `prepass/node.rs:283`. When `view_count == 32`, the `MAX_VIEW_COUNT` cap enforced by L2's `e250678de` extraction, both sites panic in debug builds (shift overflow). In release builds the result is not undefined behavior, but it is still wrong: with overflow checks off the shift amount wraps to 0, so the expression evaluates to `(1 << 0) - 1 == 0`, an empty mask (which `NonZeroU32::new` maps to `None`) instead of all 32 bits set. A 32-view multiview camera is reachable through the public API (`Multiview` extraction warn-clamps to 32, not 31). Typical stereo (view_count == 2) is unaffected, but the next L7d flip would copy the pattern and inherit the same defect.

Fix: replace with `u32::MAX >> (32 - view_count)` at both sites. For valid inputs in `[1, 32]`, the result is the same all-eyes-set mask but the shift amount stays in `[0, 31]`. The gate `view_count > 1` is unchanged.

Smoke verification:
- `motion_blur --features bevy_ci_testing`: byte-identical (4492372 bytes) to the L7d baseline. The non-multiview path (`view_count == 1`) skips the branch entirely, so this only witnesses no-regression on the broadcast-pass structure.
- `3d_scene --features bevy_ci_testing`: byte-identical (697355 bytes). Prepass-node structural witness.

Same shape as session 18's F1 and session 22's F2: small self-contained hardening on a novel surface, landed in-session per [[feedback-session-conventions]] §collapse-pre-planned-split.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
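The two mask formulas can be compared directly in stand-alone Rust. This is a minimal sketch: `broadcast_mask` and `naive_mask_release` are hypothetical helper names, and the naive variant uses `wrapping_shl` / `wrapping_sub` to reproduce what the original expression computes when overflow checks are disabled.

```rust
use std::num::NonZeroU32;

/// View-count cap named in the commit message.
const MAX_VIEW_COUNT: u32 = 32;

/// Shift-safe all-views-set mask; `None` when multiview is not in use.
fn broadcast_mask(view_count: u32) -> Option<NonZeroU32> {
    let n = view_count.min(MAX_VIEW_COUNT);
    if n > 1 {
        // For n in [2, 32] the shift amount stays in [0, 30].
        NonZeroU32::new(u32::MAX >> (MAX_VIEW_COUNT - n))
    } else {
        None
    }
}

/// The original `(1 << view_count) - 1` as it behaves without overflow
/// checks: the shift amount is masked to the low five bits.
fn naive_mask_release(view_count: u32) -> u32 {
    1u32.wrapping_shl(view_count).wrapping_sub(1)
}
```

For stereo both formulas agree on `0b11`; at the cap the naive form collapses to `0` while the shift-safe form yields `u32::MAX`.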
Sets `multiview_mask` on both the FXAA pipeline descriptor (in `fxaa/mod.rs::specialize`) and its render-pass descriptor (in `fxaa/node.rs`). Both sites compute the mask with the shift-safe `u32::MAX >> (32 - view_count)` idiom established by F1 `43920683c` and gate on `> 1` (pipeline via `key.multiview_view_count`, node via `target.multiview_count().map_or(1, |n| n.get())`). Wgpu requires the two descriptors to agree; the matching gate predicate mirrors the existing convention for the MULTIVIEW + MAX_VIEW_COUNT shader-def push and bind-group layout pick.

Second L7d flip on the branch. Unlike session 23's bg motion vectors (which required Shape A1 extract surgery because the dispatch co-mingles with prepass items in a shared render pass), FXAA owns its own render pass (`fxaa/node.rs:53`), so the L7d flip is purely mechanical: no WGSL edits (already L7c-converted with `@builtin(view_index)` + `current_view_index` plumbing), no extraction, no per-eye loop. ViewTarget's post-process source/destination views are already multi-layer D2Array under multiview per session 16's `prepare_view_targets` growth, so the destination attachment supports the broadcast directly.

At `view_count == 1` (the only path macOS exercises since wgpu has no multiview support there) both compute `multiview_mask = None`, making this a literal no-op on the non-multiview path.

Smoke (`--features bevy_ci_testing`):
- `anti_aliasing` with a temp Camera spawn carrying `Msaa::Off, Fxaa::default()` (reverted after capture): **byte-identical** (3588248 bytes, md5 `4d95a4be3b6fbecee0ac83836674f783`) pre/post the L7d flip on warm cache. Primary witness directly exercises the touched FXAA pipeline.
- `3d_scene` byte-identical (697355) to the session 16-23 lineage.
- `deferred_rendering` byte-identical (3628497) to the session 17-23 lineage.
- `motion_blur` byte-identical (4492372, md5 `219b899f7f82a5f8d2895e884260f99d`) to session 23. Confirms the in-tree L7d bg-motion-vectors pass is unaffected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sets `multiview_mask` on both the tonemapping pipeline descriptor (in
`tonemapping/mod.rs::specialize`) and its render-pass descriptor (in
`tonemapping/node.rs`). Both sites compute the mask with the shift-safe
`u32::MAX >> (32 - view_count)` idiom established by F1 `43920683c`
and gate on `> 1` (pipeline via `key.multiview_view_count`, node via
`target.multiview_count().map_or(1, |n| n.get())`). Wgpu requires
the two descriptors to agree; the matching gate predicate mirrors the
existing convention for the MULTIVIEW + MAX_VIEW_COUNT shader-def
push and bind-group layout pick.
Third L7d flip on the branch, second purely mechanical own-pass flip
after FXAA `a7fd04a19`. Tonemapping owns its own render pass at
`tonemapping/node.rs` (single `draw(0..3, 0..1)` with no co-mingled
draws), and was already fully L7c-converted (`view_array: array<View,
MAX_VIEW_COUNT>` binding, `@builtin(view_index)` + `current_view_index`
plumbing in the fragment, `view()` helper used by tonemap math). The
L7d flip is purely the two `multiview_mask` field sets: no WGSL edits,
no extraction, no per-eye loop. The HDR source and destination both
come from `target.post_process_write()`, which returns the `default_view`
of `main_textures.{a,b}` — already multi-layer D2Array under multiview
per session 16's `prepare_view_targets` growth — so the broadcast pass
samples and writes per-eye-correct data on both sides. The tonemapping
LUT textures (bindings 3, 4) and sampler are camera-shared and
eye-independent; tonemap output is determined entirely by per-eye HDR
input + global LUT, so broadcast preserves per-eye correctness.
At `view_count == 1` (the only path macOS exercises since wgpu has
no multiview support there) both sites compute `multiview_mask = None`,
making this a literal no-op on the non-multiview path. Tonemapping
also runs in Core2dSystems::PostProcess for 2D cameras; 2D cameras
have no Multiview component, so `key.multiview_view_count == 1` and
`target.multiview_count() == None` at the two sites respectively —
both reach the else branch and 2D stays byte-identical.
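
The agreement between the two sites can be sketched as follows. This is an illustration under assumptions: `PipelineKey`, `pipeline_mask`, `pass_mask`, and `mask_for` are hypothetical names, with only the `multiview_view_count` field, the `map_or(1, |n| n.get())` fallback, and the mask formula taken from the commit message.

```rust
use std::num::NonZeroU32;

/// Shared formula: all-views-set mask, or `None` for a single view.
fn mask_for(view_count: u32) -> Option<NonZeroU32> {
    let n = view_count.min(32);
    if n > 1 {
        NonZeroU32::new(u32::MAX >> (32 - n))
    } else {
        None
    }
}

/// Hypothetical pipeline key carrying the view count (1 = non-multiview).
struct PipelineKey {
    multiview_view_count: u32,
}

/// Pipeline-descriptor side: reads the count from the key.
fn pipeline_mask(key: &PipelineKey) -> Option<NonZeroU32> {
    mask_for(key.multiview_view_count)
}

/// Render-pass side: the target reports `None` when there is no multiview.
fn pass_mask(multiview_count: Option<NonZeroU32>) -> Option<NonZeroU32> {
    mask_for(multiview_count.map_or(1, |n| n.get()))
}
```

A 2D camera (key count 1, target count `None`) gets `None` on both sides, and a stereo camera gets the same two-bit mask on both sides.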
Smoke (`--features bevy_ci_testing`, all byte-identical first try on
warm cache):
- `3d_scene` byte-identical (697355) to the session 16-24 lineage.
Default Camera3d uses `Tonemapping::TonyMcMapface`, so this directly
exercises the touched pipeline.
- `deferred_rendering` byte-identical (3628497) to the session 17-24
lineage. Tonemapping witness via the deferred camera setup.
- `motion_blur` byte-identical (4492372, md5
`219b899f7f82a5f8d2895e884260f99d`) to session 23-24. Confirms the
in-tree L7d bg-motion-vectors broadcast pass and the L7d FXAA pass
are unaffected.
- `atmosphere` byte-identical (3542199, md5
`23c5c7f864035a6d4a58b47aa1acb24c`) to the session 20-24 lineage.
Tonemapping witness through the atmosphere camera.
- `transmission` byte-identical (3162953) to the session 22-24
lineage. Tonemapping + per-eye transmissive cross-witness.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second Shape A1 L7d on the branch (after session 23's background motion vectors). The skybox cubemap is shared across eyes; per-eye view matrices sampled via `view()` from `@builtin(view_index)` give each eye the correct ray direction, so one broadcast draw fills every layer of the multi-layer color + depth attachments. Skybox previously dispatched as a trailing `render_pass.draw(0..3, 0..1)` inside the shared `main_opaque_pass_3d` (alongside opaque + alpha-mask phase items). Because `multiview_mask` is a pass-level property (per-pipeline framing collides with shared render passes), this commit closes the existing main pass after the opaque + alpha-mask draws and opens a new "skybox_broadcast" pass that re-derives the color + depth attachments. Re-derivation hits the second-call `is_first_call` latch on `ColorAttachment` / `DepthAttachment`, returning `LoadOp::Load` to preserve the opaque + alpha-mask output. Surrounding main_opaque_pass draws stay in their existing single-pass shape; the broader question of converting mesh forward + alpha-mask to per-eye / Shape D belongs to a future session. At view_count=1 (the only path exercised by the in-tree examples), both descriptors set multiview_mask=None and the skybox draw lands byte-identical to its prior position inside the shared pass. Pipeline-side and pass-side both derive view_count from the same ExtractedMultiview.subviews.len() chain; the masks cannot diverge. Mask uses the shift-safe `u32::MAX >> (32 - view_count)` formulation established in session 23's F1 (avoids `1u32 << 32` overflow at the MAX_VIEW_COUNT cap of 32). 
Smoke (byte-identical to registry baselines): - volumetric_fog: 3709234 bytes (primary skybox witness, first try) - 3d_scene: 697355 bytes - deferred_rendering: 3628497 bytes - motion_blur: 4492372 bytes, md5 219b899f7f82a5f8d2895e884260f99d (confirms in-tree L7d bg-motion-vectors + FXAA + tonemapping broadcast passes all unaffected) - atmosphere: 3542199 bytes, md5 23c5c7f864035a6d4a58b47aa1acb24c (converged on third run per registry cold-shader-cache caveat; atmosphere has no Skybox component so the cold-cache drift was unrelated to this edit) - transmission: 3162953 bytes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…(L7d Shape D) Fifth L7d flip on the branch and the first whole-pass-broadcast (Shape D) conversion. Per the session 27 planning doc (`.scratch/session27_pbr_l7d_planning.md`) §3.3, every in-tree dispatcher into `Opaque3d` / `AlphaMask3d` flows through PBR `MeshPipeline` via `DrawMaterial` (verified via `grep add_render_command::<Opaque3d|AlphaMask3d>` across the workspace), and the mesh pipeline + every shader prereq is already L7c-converted at the layer where the MULTIVIEW shader-def push happens. With one dispatcher and full L7c, Shape A1 (extract per draw) collapses — the dispatcher IS the pass content. Shape D (whole-pass broadcast) is the natural answer. Pipeline-side: `MeshPipeline::specialize` sets `multiview_mask = NonZeroU32::new(u32::MAX >> (32 - max_view_count))` under `max_view_count > 1` on the returned `RenderPipelineDescriptor`. Gate mirrors the existing MULTIVIEW + MAX_VIEW_COUNT shader-def push at the same site. The prepass + deferred prepass dispatches (`Opaque3dPrepass` etc.) flow through the separate `PrepassPipelineSpecializer` type and are NOT covered by this field-set — their conversion is a separate session. Pass-side: `main_opaque_pass_3d` flips its descriptor from `multiview_mask: None` to the shift-safe formula derived from `multiview.subviews.len()`. The compute is lifted to function-top scope so the skybox broadcast pass (session 26) can reuse the same value rather than recomputing — small de-duplication. Both gate predicates ultimately derive from `ExtractedMultiview.subviews.len()`, so wgpu's required pipeline-vs-pass mask agreement holds. Shape D's load-bearing prereq — every dispatcher in the pass must be feature-safe under broadcast — is met for every in-tree Material because they all share the same L7c-converted `MeshPipeline::specialize` path. 
The latent risk is custom user `Material` implementors that ship their own custom WGSL entry without threading `@builtin(view_index)` + assigning `current_view_index = view_index;`: the pipeline + hardware broadcast correctly, but `current_view_index` stays at 0 on every layer and reads through `view()` / `mesh_view_bindings::*` silently resolve to eye 0's data, so lighting and camera-relative effects render as if every eye were eye 0 (geometry survives because the default `mesh.wgsl` vertex entry threads `view_index` itself, IF the custom material kept the default vertex entry).

To make that risk discoverable, two in-tree docs land alongside the Shape D flip:

- A `# Multiview` section on the `Material` trait docstring (`bevy_pbr/src/material.rs`) explaining the requirement for any custom WGSL entry (vertex OR fragment), with the asymmetry between custom-fragment-only-with-default-vertex (geometry safe via default `mesh.wgsl`) and custom-both-without-threading (geometry breaks too). Anchors to `pbr.wgsl`, `mesh.wgsl`, and `mesh_view_bindings.wgsl`.
- A comment block in `mesh_view_bindings.wgsl` right after the `current_view_index` declaration with a paste-ready fragment-entry snippet showing the canonical `#ifdef MULTIVIEW` plumbing.

Both docs frame the issue around `current_view_index` rather than a specific phase or pipeline, so they age well for any future shared-pass L7d conversion that uses the same global.

Diff +111/-22 = +89 net across four files. Shape-flip surface itself (node + MeshPipeline) is +33 net, top of the §3.3 ~30-50 estimate; the in-tree docs add +56 net (Material trait `# Multiview` section +27, `mesh_view_bindings.wgsl` paste-ready snippet +29) vs the §5 ~15-25 estimate. The doc overshoot is deliberate: the paste-ready snippet is the canonical migration aid the eventual PR description references, and the trait-level docstring is the natural IDE-hover surface for custom-material authors. Total ~18% over the ~45-75 plan estimate; well under the L7c <120 cap and the >25% §L7d-bands overshoot trigger.

Smoke verification (6 byte-deterministic witnesses per §registry):

- `3d_scene` --features bevy_ci_testing: byte-identical (697355) to the session 16-26 lineage. Primary cross-cutting witness.
- `volumetric_fog` --features bevy_ci_testing: byte-identical (3709234) to the session 19-26 lineage. Skybox broadcast + main_opaque interleave on the only registry witness carrying a `bevy_light::Skybox` component.
- `deferred_rendering` --features bevy_ci_testing: byte-identical (3628497) to the session 17-26 lineage. Deferred + main_opaque interleave.
- `motion_blur` --features bevy_ci_testing: byte-identical (4492372, md5 `219b899f7f82a5f8d2895e884260f99d`) to the session 23-26 lineage. Fifth consecutive determinism confirmation; bg-motion-vectors broadcast + main_opaque + L7d FXAA + L7d tonemapping all unaffected.
- `transmission` --features bevy_ci_testing: 3162897 bytes vs registry baseline 3162953. Drift is pre-existing in HEAD `1490228f0`, not caused by this commit: cmp of post-edit PNG vs the same example run on a fully reverted working tree (stashed all four edits, ran transmission on HEAD, unstashed) shows the two PNGs are bit-identical at 3162897. The 56-byte drift from the registry baseline appears to be environmental (Metal pipeline-cache state, system load, or similar) between session 26's capture and now. Both PNGs render the transmissive demo correctly.
- `atmosphere` --features bevy_ci_testing: 3542320 bytes / md5 `97f15a8251d03f49eeb6ac6f40d9cf26` stable across 3 runs with the edits applied. Clean HEAD with all edits reverted gave 3542199 bytes but md5 `cbbda228c220ad4fe44829aac5b00a33`, different from the registry's `23c5c7f864035a6d4a58b47aa1acb24c` at the same byte size. Atmosphere is fundamentally image-content non-deterministic; the registry's prior size-match across sessions was coincidence at one cache-warmth state. PNG renders the atmosphere demo correctly.

Cross-checked against the type-level no-op argument: `RenderPipelineDescriptor.multiview_mask` defaults to `None` (`bevy_material/src/descriptor.rs:53`), so at view_count == 1 this commit's explicit `multiview_mask: None` is identical to the prior `..default()`-filled value at both the pipeline and pass descriptors. Together: 4 byte-identical registry matches (3d_scene, volumetric_fog, deferred_rendering, motion_blur) + 1 bit-identical-to-clean-HEAD witness (transmission) + 1 visually-correct-with-environmental-drift witness (atmosphere) confirm the no-op claim at view_count == 1.

Out of scope (recorded for later sessions per planning doc §7):

- Prepass D2+F → Shape D (session 29; edits `PrepassPipelineSpecializer`).
- Deferred prepass D2+F → Shape D (session 30; node-only, must follow §29).
- Wireframe / deferred lighting / OIT resolve own-pass L7d (sessions 31-33).
- Transparent L7d (gated on a future sort-distance planning session).
- InfiniteGrid L7c (transparent prereq).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
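The type-level no-op argument above can be illustrated with a toy descriptor. This is only a sketch: `ToyDescriptor`, `descriptor_for`, and `legacy_descriptor` are hypothetical names, not Bevy's `RenderPipelineDescriptor` or any function on the branch. It shows why an explicit `None` under the `view_count > 1` gate is indistinguishable from the `..Default::default()`-filled value.

```rust
use std::num::NonZeroU32;

/// Toy stand-in for a descriptor with a defaulted mask field.
/// This is NOT Bevy's `RenderPipelineDescriptor`; it only mirrors the shape.
#[derive(Debug, Default, PartialEq)]
struct ToyDescriptor {
    label: Option<&'static str>,
    multiview_mask: Option<NonZeroU32>,
}

/// Post-edit construction: the mask is only set when `view_count > 1`.
/// Assumes `view_count` is in `1..=32`, so the shift amount stays in range.
fn descriptor_for(view_count: u32) -> ToyDescriptor {
    ToyDescriptor {
        label: Some("toy"),
        multiview_mask: if view_count > 1 {
            NonZeroU32::new(u32::MAX >> (32 - view_count))
        } else {
            None
        },
    }
}

/// Pre-edit construction: the mask is left to `Default`, which yields `None`.
fn legacy_descriptor() -> ToyDescriptor {
    ToyDescriptor {
        label: Some("toy"),
        ..Default::default()
    }
}
```

With one view, `descriptor_for(1)` compares equal to `legacy_descriptor()`; with two or more views the mask differs, which is the only intended behavior change.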
Converts the prepass node's per-eye dispatch loop (session 18, ac409d6) into a single Shape D broadcast pass, mirroring session 28's main-opaque flip but on the separate `PrepassPipelineSpecializer` host type covering `Opaque3dPrepass` / `AlphaMask3dPrepass` / `Opaque3dDeferred` / `AlphaMask3dDeferred` dispatches through `DrawPrepass`. The background-motion-vectors broadcast pass (session 23) stays as-is: its existing legacy `get_attachment` / `get_attachment(StoreOp::Store)` calls remain the second legacy calls in `run_prepass_system`, so it still gets `LoadOp::Load` and preserves the prepass-items output.

Pipeline-side (`bevy_pbr/src/prepass/mod.rs`, +18/-1): adds `NonZeroU32` import; `PrepassPipeline::specialize` sets `multiview_mask = NonZeroU32::new(u32::MAX >> (32 - max_view_count))` under `max_view_count > 1` on the returned `RenderPipelineDescriptor`, parallel to session 28's MeshPipeline edit. Comment notes coverage is the prepass + deferred prepass dispatches via `DrawPrepass`, distinct from the forward-pass mask set in MeshPipeline.

Pass-side (`bevy_core_pipeline/src/prepass/node.rs`, +68/-75 = -7 net): removes the `for eye in 0..view_count` loop, replaces per-layer `get_attachment_for_layer(eye)` / `get_attachment_for_layer(eye, StoreOp::Store)` with legacy `get_attachment()` / `get_attachment(StoreOp::Store)`, and sets the pass descriptor `multiview_mask` to the shift-safe formula under `view_count > 1`. Lifts `view_count` + `multiview_mask` compute to function-top scope so the bg-motion-vectors broadcast pass reuses the same value rather than recomputing.

F2 lifecycle (session 22, d92e550) traced through the new ordering: the prepass-items pass calls legacy accessors first (flips global latch to false on first frame, Clear), bg-motion-vectors calls legacy second (global false, Load), main opaque calls legacy third (Load), transmission's per-eye dispatch calls per-layer accessors which seed slots from the now-false global via F2 (Load on every eye). The same Load-after-first-Clear pattern the per-eye loop produced, with one rather than N dispatches.

Smoke verification (5 byte-deterministic witnesses + 1 bit-compare-vs-clean-HEAD):

- 3d_scene 697355: bit-identical to the session 16-28 lineage. Primary cross-cutting witness; no Prepass-component cross-check.
- volumetric_fog 3709234: bit-identical to the session 19-28 lineage.
- deferred_rendering 3628497: bit-identical to the session 17-28 lineage. Exercises the deferred prepass node which inherits the PrepassPipelineSpecializer multiview_mask field-set automatically (session 30 will flip its own per-eye loop into the same Shape D broadcast on the node side; this commit's host edit is the prereq).
- motion_blur 4492372 md5 219b899f7f82a5f8d2895e884260f99d: bit-identical to the session 23-28 lineage. Load-bearing for this session: exercises BOTH the now-broadcast prepass items AND the bg-motion-vectors broadcast pass; a regression in either lifecycle would show. Sixth consecutive determinism confirmation.
- transmission 3162897 md5 6560d0d1bd00af9936b829e37fac4565: clean-HEAD is now image-content non-deterministic (3 pre-impl runs on HEAD 109bde0 produced 2 distinct sizes and 3 distinct md5s: 3162956 ff8e4441, 3162897 f785c7eb, 3162897 6560d0d1). Post-impl md5 6560d0d1 matches the third pre-impl run exactly: the post-edit output is in the set of clean-HEAD outputs, confirming the edit is runtime no-op at view_count=1. F2 lifecycle for transmission's per-layer dispatch is preserved as cold-read predicted.
- atmosphere 3542276 md5 eaf0bac6a7b82062c730bcd988134554: drifted per the session-28-recorded content non-determinism. Backstopped by the 5 strong witnesses above and the F2 reasoning chain.

No-op argument: `RenderPipelineDescriptor.multiview_mask` and `RenderPassDescriptor.multiview_mask` both default to `None` and the field-set gate is `view_count > 1`. At view_count = 1 the post-edit descriptors are type-level identical to the pre-edit ones. The per-eye loop at view_count=1 ran once with `get_attachment_for_layer(0, StoreOp::Store)`; the synthesized D2 view is bit-identical to `default_view` per the session-16 docstring, so the legacy and per-layer paths produce wgpu-equivalent attachments. Latch lifecycle is preserved (both paths flip the global to false on first call, Load on second). The 4 byte-identical witnesses confirm the no-op claim empirically; transmission's bit-match against a clean-HEAD output covers the non-deterministic case.

Cold-read review combined in-session per session-conventions §review-cadence (Shape D twice-established after session 28). Pipeline-vs-pass agreement verified (both sites derive from ExtractedMultiview.subviews.len() via the cache-keyed pipeline-key field and the runtime query respectively); view_count == 32 shift-safe (u32::MAX >> 0 = u32::MAX); view_count == 1 no-op (verified empirically); F2 seeding still correct for transmission downstream consumer (init_per_layer_first_call seeds from global=false → all per-layer slots false → Load on every eye); bg-motion-vectors lexical ordering preserved (prepass-items first, bg-motion-vectors second). Zero actionable findings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
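The shift-safe mask formula quoted in this commit can be sketched as a standalone helper. The function name `multiview_mask` and the upper-bound guard are assumptions added for illustration; the branch computes the expression inline under the `view_count > 1` gate.

```rust
use std::num::NonZeroU32;

/// Hypothetical helper: one mask bit per view layer, or `None` when
/// multiview is not in use.
///
/// `u32::MAX >> (32 - view_count)` stays well-defined at `view_count == 32`
/// (shift by 0), whereas the more obvious `(1u32 << view_count) - 1` would
/// shift by 32 and overflow. The `<= 32` bound is an added assumption so the
/// subtraction cannot underflow.
fn multiview_mask(view_count: u32) -> Option<NonZeroU32> {
    if view_count > 1 && view_count <= 32 {
        NonZeroU32::new(u32::MAX >> (32 - view_count))
    } else {
        None
    }
}
```

Calling the same helper with the same `view_count` on both the pipeline side and the pass side is one way to keep the two masks in agreement, since `multiview_mask(1)` is `None`, `multiview_mask(2)` is `0b11`, and `multiview_mask(32)` is `u32::MAX`.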
Converts the deferred prepass node's per-eye dispatch loop (session 18, ac409d6) into a single Shape D broadcast pass, parallel to session 29's forward-prepass flip. Node-only edit: the `PrepassPipelineSpecializer` host edit (session 29, 90bb176) already covers `Opaque3dDeferred` / `AlphaMask3dDeferred` dispatches through `DrawPrepass` per `prepass/mod.rs:169-170`, so no pipeline-side change lands this session. Pipeline-vs-pass agreement holds because both gates derive from `ExtractedMultiview.subviews.len()`.

Pass-side (`bevy_core_pipeline/src/deferred/node.rs`, +116/-99 = +17 net): adds `NonZeroU32` import; removes the `for eye in 0..view_count` loop; replaces per-layer `get_attachment_for_layer(eye)` / `get_attachment_for_layer(eye, StoreOp::Store)` with legacy `get_attachment()` / `get_attachment(StoreOp::Store)` for all four gbuffer color slots + depth; sets the pass descriptor `multiview_mask` to `NonZeroU32::new(u32::MAX >> (32 - view_count))` under `view_count > 1`. The `copy_texture_to_texture` at the end stays unchanged: it already handles multi-layer per its session-16 C2-A docstring (`depth_or_array_layers = view_count`). The webgl `clear_texture` block at function-top is unchanged; its stale "stays outside the per-eye loop" comment is rewritten to "runs before the broadcast pass" without changing meaning.

F2 lifecycle (session 22, d92e550) traced across three configurations:

- Forward + deferred prepass (deferred_rendering case): forward prepass ran first (session 29 broadcast pass) and flipped Normal / MotionVectors / depth global latches to false via legacy calls. Deferred entry: legacy `get_attachment()` on Normal / MotionVectors returns Load (preserves forward output); legacy `get_attachment(StoreOp::Store)` on depth returns Load (preserves forward depth); deferred-specific Deferred / DeferredLightingPassId latches are untouched by forward prepass → legacy returns Clear (first write to the gbuffer).
- Deferred-only configuration: all latches untouched on entry; the broadcast pass writes a fresh depth + gbuffer pass on first call.
- Early + late (late gated on occlusion_culling): early flips all attachment latches to false; late's legacy calls return Load and preserve early's output. Identical to the per-eye-loop pre-edit behavior at view_count=1 because session-22 F2 seeds per-layer slot 0 from the now-false global, returning Load on the late per-layer call.

Smoke verification (4 byte-deterministic witnesses, all identical to both pre-impl runs on HEAD 90bb176 and registry baseline):

- deferred_rendering 3628497 md5 d04bdbd9d89dd2ed8e8fe2d61ac6ac2b: **PRIMARY witness**, exercises the deferred prepass node directly.
- 3d_scene 697355: cross-cutting; no DeferredPrepass component, so the new path never opens, which confirms surrounding render graph behavior is unchanged.
- volumetric_fog 3709234: cross-validates that the deferred prepass edit doesn't perturb forward-prepass + skybox + fog interaction.
- motion_blur 4492372 md5 219b899f7f82a5f8d2895e884260f99d: cross-validates that session 29's forward prepass Shape D + this session's deferred Shape D both stay correct under the bg-motion-vectors broadcast pass lifecycle. Eighth consecutive determinism confirmation.

No-op argument: `RenderPassDescriptor.multiview_mask` defaults to `None`; the field-set gate is `view_count > 1`. At view_count=1 the post-edit pass descriptor is type-level identical to the pre-edit `multiview_mask: None`. The per-eye loop at view_count=1 ran once with `get_attachment_for_layer(0, ...)`; the synthesized D2 view is bit-identical to `default_view` per the session-16 docstring, so legacy and per-layer paths produce wgpu-equivalent attachments. Latch lifecycle preserved per the F2 trace above. 4 byte-identical witnesses confirm empirically.

Cold-read review combined in-session per session-conventions §review-cadence (Shape D thrice-established after sessions 28 + 29). Pipeline-vs-pass agreement verified (both derive from ExtractedMultiview.subviews.len() via the cache-keyed pipeline-key field and the runtime query respectively); view_count == 32 shift-safe (u32::MAX >> 0 = u32::MAX); view_count == 1 no-op (verified empirically by all 4 byte-identical witnesses); F2 traced across forward+deferred, deferred-only, and early+late configurations against all six attachment surfaces (Normal, MotionVectors, Deferred, DeferredLightingPassId, depth, span). One in-session cosmetic fix (R1): rewrote the stale "Stays outside the per-eye loop" comment on the webgl clear_texture block to "Runs before the broadcast pass". No other actionable findings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
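The attachment lifecycle traced in this commit can be modeled with a small latch type. This is a simplified sketch, not the branch's implementation: `AttachmentLatch` and `LoadKind` are hypothetical names, the accessors return only the load decision instead of a wgpu attachment, and the per-frame reset is omitted. It assumes, as the commit describes, that per-layer slots are seeded lazily from the global latch on the first per-layer call.

```rust
/// Load decision for an attachment; a stand-in for wgpu's `LoadOp`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LoadKind {
    Clear,
    Load,
}

/// Hypothetical model of one attachment's first-call latches.
struct AttachmentLatch {
    layer_count: usize,
    /// True until the whole-attachment accessor has been called once.
    global_first_call: bool,
    /// Per-layer latches, created on the first per-layer call.
    per_layer_first_call: Option<Vec<bool>>,
}

impl AttachmentLatch {
    fn new(layer_count: usize) -> Self {
        Self {
            layer_count,
            global_first_call: true,
            per_layer_first_call: None,
        }
    }

    /// Whole-attachment accessor: Clear on the first call, Load afterwards.
    fn get_attachment(&mut self) -> LoadKind {
        let first = std::mem::replace(&mut self.global_first_call, false);
        if first { LoadKind::Clear } else { LoadKind::Load }
    }

    /// Per-layer accessor. On first use every slot is seeded from the global
    /// latch, so a per-layer pass that follows a broadcast pass loads
    /// instead of clearing the broadcast pass's output.
    fn get_attachment_for_layer(&mut self, layer: usize) -> LoadKind {
        let seed = self.global_first_call;
        let count = self.layer_count;
        let slots = self
            .per_layer_first_call
            .get_or_insert_with(|| vec![seed; count]);
        let first = std::mem::replace(&mut slots[layer], false);
        if first { LoadKind::Clear } else { LoadKind::Load }
    }
}
```

In this model a broadcast pass followed by a per-layer pass yields Clear, then Load on every layer, while a per-layer pass on an untouched attachment clears each layer exactly once.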
The branch was developed in incremental sessions whose internal layer naming (L7b-write, L7c, L7d, Shape A/A1/D, C2-A, F2 lifecycle, etc.) seeped into ~21 in-tree comments across 15 files during the layered implementation. Those labels carry no meaning outside this branch's working notes; this commit rewrites each comment to describe the substance in plain rendering vocabulary, without changing any behavior.

Strip categories:

- "L7d" / "L7d (Shape D)" comment prefix on 12 broadcast-mask explanations (FXAA, tonemapping, skybox, bg-motion-vectors, main-opaque, forward + deferred prepass nodes, plus the matching pipeline-side comments in `MeshPipeline` and `PrepassPipelineSpecializer`): drop the prefix, rephrase the lead sentence to "Broadcast across every eye layer in a single pass." / "Broadcast every <phase> dispatch ... under multiview." Substance after the lead sentence is unchanged.
- "F2 lifecycle on entry" → "Attachment lifecycle on entry" in the deferred-prepass node's lifecycle paragraph.
- "L7b-write" / "pre-L7b-write" / "post L7b-write" forward-references in 4 files (`mesh_view_bindings.rs`, `prepass_bindings.rs`, `ssao/mod.rs`, `transmission/node.rs`): rewrite each to describe the present runtime behavior rather than a temporal reference to a past session phase.

Stale-comment corrections (folded in because the jargon strip exposed them):

- `mesh_view_bindings.rs:876-879` previously claimed "The SSAO texture is single-layer" and "every eye reads layer 0". That description predates the per-eye SSAO texture growth; the actual current behavior is `view_count` layers under multiview, with the consumer at `pbr_fragment.wgsl:660` reading its eye's slice via `current_view_index`. Comment rewritten to match.
- `mesh_view_bindings.rs:897-899` previously claimed "single-layer transmission texture", also stale post the per-eye transmission texture growth. Comment rewritten.
- `prepass_bindings.rs:80-84` docstring previously said "the underlying textures are still single-layer this session", which is stale. Rewritten to describe the multi-layer-under-multiview shape and the consumer reading its eye's slice via `current_view_index`. This is the only `///` doc-visible string in the strip.
- `ssao/mod.rs:727-735` previously framed the prepass-layer-count gate as "until C2 grows them this auto-upgrades", which is stale. Rewritten to describe the gate as a runtime check on each texture's actual layer count, since SSAO can be configured with multi-layer SSAO output even when prepass attachments remain single-layer.

Workspace-wide `cargo check` is clean. No behavior changes; comment-only diff. 15 files, +96/-96 (every change replaces text in place).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The branch was developed in iterative sessions using AI assistance; this rule keeps the per-session planning and review notes (which live in .scratch/) out of committed history. The notes themselves are working artifacts and are not part of the PR contents. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Welcome, new contributor! Please make sure you've read our contributing guide, as well as our policy regarding AI usage, and we look forward to reviewing your pull request shortly ✨

Contributor

Your PR caused a change in the graphical output of an example or rendering test. This might be intentional, but it could also mean that something broke! If it's expected, please add the M-Deliberate-Rendering-Change label. If this change seems unrelated to your PR, you can consider updating your PR to target the latest main branch, either by rebasing or merging main into it.
I've continued @awtterpip's stalled #16059 — multiview rendering for single-pass stereo (the path toward XR support, #15864) — as a working proof of concept. The work was developed with AI assistance (Claude Opus 4.7), explicitly disclosed in every commit, and is offered as a reference rather than a merge candidate.
What this POC validates
The original design works end-to-end. @awtterpip's foundational decisions (`Multiview` as a per-camera component, `array<View, MAX_VIEW_COUNT>` runtime-sized WGSL bindings, `multiview_mask` on `RenderPipelineDescriptor` + `RenderPassDescriptor`, `@builtin(view_index)` plumbed into fragment entries) are wired into every consumer in `bevy_pbr` / `bevy_core_pipeline` / `bevy_render`. None of those consumers required architectural patching beyond the standard host-side view-binding swap and per-fragment `current_view_index` threading. The only foundational addition on top of the original draft was a packed `DynamicArrayUniformBuffer<ViewUniform>` host-side storage layer to hold the runtime-sized view array cleanly. Per-eye texture growth at `D2Array`-typed bindings is supported via `get_attachment_for_layer` accessors with a per-layer `is_first_call` latch that mirrors the existing `LoadOp::Clear → LoadOp::Load` semantics on the global accessor. The surrounding wgpu validation surface (MSAA-vs-multiview depth-texture mismatch, pipeline-vs-pass `multiview_mask` agreement, attachment-layer-count agreement) is respected throughout, without any new `unsafe` blocks or `#[allow]` attributes added.

The conversion surface is mapped. Coverage by surface:

- `bevy_pbr` view bindings
- `bevy_core_pipeline` + `bevy_anti_alias` view bindings
- `main_opaque_pass_3d`, forward prepass, deferred prepass, skybox (extracted), tonemapping, FXAA, background motion vectors (extracted)

A handful of remaining own-pass conversions (wireframe, deferred lighting fullscreen, OIT resolve) and one open design question (Transparent3d sort distance with per-eye divergent distances) are delimited explicitly below.
Status & verification
`cargo check --workspace` is clean. The branch adds 9 unit tests covering the foundational types (`DynamicArrayUniformBuffer`, `Multiview` component); these are host-side because the runtime multiview path itself can't be exercised on the development hardware (see below).

Non-multiview behavior was tracked across the branch via byte-deterministic screenshot witnesses in `bevy_ci_testing` (a frame-100 `ScreenshotAndExit` config with a fixed frame time). Six witnesses (`3d_scene`, `volumetric_fog`, `deferred_rendering`, `motion_blur`, `depth_of_field`, `anti_aliasing` in FXAA mode) remain bit-identical to the pre-branch baseline at the branch tip. Two further witnesses (`atmosphere`, `transmission`) became environmentally non-deterministic on the development hardware during the work; for those, post-edit outputs were compared to clean-HEAD samples and found indistinguishable.

What's not directly verified. macOS Metal doesn't support wgpu multiview, so the runtime multiview path is not exercised on the development hardware. End-to-end visual multiview verification will require a Vulkan setup. Correctness on the multiview path is supported by static reasoning (pipeline-vs-pass `multiview_mask` agreement, attachment-layer-count agreement, MSAA-vs-multiview validation paths), the clean workspace build, and the bit-identical non-multiview behavior, but a Vulkan reviewer is welcome to verify visually.

Known followups
Intentionally delimited items — what a real-merge effort would need to land beyond the POC. Roughly in order of design weight:
Open design question: Transparent3d sort distance with per-eye divergent distances. The `Multiview` docstring documents the current placeholder (single head-pose sort) as intentional, but per-eye sort distances can differ significantly when transparent objects are close to the viewer. Plausible approaches: (a) accept the head-pose sort as a reasonable approximation, (b) split into per-eye `Transparent3d` phases (more expensive, eye-correct), or (c) leave transparent rendering single-view under multiview. This is the design question that remained unresolved at the end of awtterpip's draft.

Remaining own-pass mechanical conversions. Three pipelines own their own render pass and follow the established `multiview_mask` field-set pattern but haven't been flipped yet: wireframe (own pass in `bevy_pbr/src/wireframe.rs`), deferred lighting fullscreen (own pass in `bevy_pbr/src/deferred/mod.rs`), OIT resolve fullscreen (own pass in `bevy_core_pipeline/src/oit/resolve/`). Each is ~25-40 lines of diff using the existing pattern.

Migration-guide entries for breaking API changes. The branch introduces three breaking changes that would need `release-content/migration-guides/` entries for real merge:

- `ViewUniforms.uniforms` flipped from `DynamicUniformBuffer<ViewUniform>` to `DynamicArrayUniformBuffer<ViewUniform>`. Downstream readers need updating.
- `RenderPipelineDescriptor` gained a `multiview_mask: Option<NonZeroU32>` field. Out-of-tree descriptor literals need to add it or switch to `..default()`.
- The WGSL binding `view: View` changed to `view_array: array<View, MAX_VIEW_COUNT>` under `MULTIVIEW`, accessed via a new `view()` helper. Custom-material WGSL referencing `view` directly will need to switch. The `Material` trait docstring and `mesh_view_bindings.wgsl` comment block (both added by this branch) document the canonical pattern.

Pointers for a merge-quality follow-up
For anyone picking this up to land merge-ready: a few specifics worth lifting from the branch rather than re-deriving.
The per-layer attachment-lifecycle latch (F1 + F2 commits). The per-layer accessor API (`get_attachment_for_layer` on `ColorAttachment` / `DepthAttachment`) needs per-layer `is_first_call` slots, with two subtle interactions worked out across F1 (e186c0354) and F2 (d92e55099): tracking per-layer first-call state, then seeding it from the global latch when a per-layer dispatcher follows a non-per-layer pass that already flipped the global. The commit bodies trace the discovery; both bugs are easy to re-introduce if the API is rewritten from scratch.

MSAA + multiview depth-texture carve-out. WGSL has no `texture_depth_multisampled_2d_array`. The branch resolves this by gating the `MULTIVIEW` shader-def push on `view_count > 1 && !is_msaa` in any depth-binding pipeline; `mesh_view_bindings.wgsl:99-106` documents the established pattern. Volumetric fog, atmosphere render-sky, and DoF all consume this carve-out. Under MSAA + multiview the depth binding stays single-layer; no in-tree camera triggers the combo.

Pipeline-vs-pass `multiview_mask` agreement. wgpu requires the pipeline descriptor's `multiview_mask` and the pass descriptor's `multiview_mask` to match on each draw. Every per-pass conversion sets both with the same `view_count > 1` predicate and the same shift-safe formula `NonZeroU32::new(u32::MAX >> (32 - view_count))`; commit bodies note when one side derives from a pipeline key field and the other from a runtime query, with the chain that ensures the inputs agree.

Dispatcher inventory before broadcasting a shared pass. Broadcasting a shared 3D render pass requires every dispatcher landing draws in that pass to be feature-safe under multiview. A grep of `add_render_command::<Phase, Cmd>` enumerates the in-tree dispatchers per pass; most shared 3D passes turned out to have a single dispatcher (PBR `MeshPipeline` or `PrepassPipelineSpecializer`), simplifying the conversion. The few exceptions (`Transparent3d` has two; `InfiniteGrid` in `bevy_dev_tools` is not view-binding-converted) are delimited in the followups above.

Commit log as a dependency-ordered build. The branch was developed in incremental layered stages (`L1`→`L7d` in the commit titles), each layer's prerequisite landing before its dependents. L1-L4 add foundational types (`multiview_mask` field on `RenderPipelineDescriptor`, packed view storage, `Multiview` component, view-uniform packing); L5-L7 convert per-pipeline view bindings; L7d flips per-pass broadcast. Reading commits in series walks through the architectural decisions in dependency order; the layer naming is a navigational aid, not standard Bevy terminology.

License + closing
All commits are offered under the repo's existing Apache-2.0/MIT dual license — take what's useful, rewrite in whatever style you prefer.
If this isn't a useful direction, please close the PR without hesitation; the branch stays at https://github.com/bigmark222/bevy/tree/multiview-reference as a reference regardless. With thanks to @awtterpip for the original draft work that this builds on.