Multiview camera rendering: proof-of-concept continuation of #16059 #24422

Open
bigmark222 wants to merge 46 commits into
bevyengine:main from
bigmark222:multiview-reference

@bigmark222

I've continued @awtterpip's stalled #16059 — multiview rendering for single-pass stereo (the path toward XR support, #15864) — as a working proof of concept. The work was developed with AI assistance (Claude Opus 4.7), explicitly disclosed in every commit, and is offered as a reference rather than a merge candidate.

What this POC validates

The original design works end-to-end. @awtterpip's foundational decisions — Multiview as a per-camera component, array<View, MAX_VIEW_COUNT> runtime-sized WGSL bindings, multiview_mask on RenderPipelineDescriptor + RenderPassDescriptor, @builtin(view_index) plumbed into fragment entries — are wired into every consumer in bevy_pbr / bevy_core_pipeline / bevy_render. None of those consumers required architectural patching beyond the standard host-side view-binding swap and per-fragment current_view_index threading. The only foundational addition on top of the original draft was a packed DynamicArrayUniformBuffer<ViewUniform> host-side storage layer to hold the runtime-sized view array cleanly. Per-eye texture growth at D2Array-typed bindings is supported via get_attachment_for_layer accessors with a per-layer is_first_call latch that mirrors the existing LoadOp::Clear → LoadOp::Load semantics on the global accessor. The surrounding wgpu validation surface (MSAA-vs-multiview depth-texture mismatch, pipeline-vs-pass multiview_mask agreement, attachment-layer-count agreement) is respected throughout — without any new unsafe blocks or #[allow] attributes added.

The conversion surface is mapped. Coverage by surface:

| Surface | Coverage |
| --- | --- |
| bevy_pbr view bindings | mesh forward, prepass, deferred prepass, deferred lighting, SSR, atmosphere, fog, DoF, transmission, SSAO, cluster |
| bevy_core_pipeline + bevy_anti_alias view bindings | skybox, tonemapping, OIT resolve, blit, bloom first-downsample, FXAA, background motion vectors |
| Per-pass broadcast (7 passes) | main_opaque_pass_3d, forward prepass, deferred prepass, skybox (extracted), tonemapping, FXAA, background motion vectors (extracted) |
| Per-eye dispatch | SSAO compute pipelines, transmission per-eye iterative refinement |

A handful of remaining own-pass conversions (wireframe, deferred lighting fullscreen, OIT resolve) and one open design question (Transparent3d sort distance with per-eye divergent distances) are delimited explicitly below.

Status & verification

cargo check --workspace is clean. The branch adds 9 unit tests covering the foundational types (DynamicArrayUniformBuffer, Multiview component); these are host-side because the runtime multiview path itself can't be exercised on the development hardware (see below).

Non-multiview behavior was tracked across the branch via byte-deterministic screenshot witnesses in bevy_ci_testing (a frame-100 ScreenshotAndExit config with a fixed frame time). Six witnesses — 3d_scene, volumetric_fog, deferred_rendering, motion_blur, depth_of_field, anti_aliasing (FXAA mode) — remain bit-identical to the pre-branch baseline at the branch tip. Two further witnesses (atmosphere, transmission) became environmentally non-deterministic on the development hardware during the work; for those, post-edit outputs were compared to clean-HEAD samples and found indistinguishable.

What's not directly verified. macOS Metal doesn't support wgpu multiview, so the runtime multiview path is not exercised on the development hardware. End-to-end visual multiview verification will require a Vulkan setup. Correctness on the multiview path is supported by static reasoning (pipeline-vs-pass multiview_mask agreement, attachment-layer-count agreement, MSAA-vs-multiview validation paths), the clean workspace build, and the bit-identical non-multiview behavior — but a Vulkan reviewer is welcome to verify visually.

Known followups

Intentionally delimited items — what a real-merge effort would need to land beyond the POC. Roughly in order of design weight:

Open design question: Transparent3d sort distance with per-eye divergent distances. The Multiview docstring documents the current placeholder (single head-pose sort) as intentional, but per-eye sort distances can differ significantly when transparent objects are close to the viewer. Plausible approaches: (a) accept the head-pose sort as a reasonable approximation, (b) split into per-eye Transparent3d phases (more expensive, eye-correct), or (c) leave transparent rendering single-view under multiview. This is the design question that remained unresolved at the end of awtterpip's draft.

Remaining own-pass mechanical conversions. Three pipelines own their own render pass and follow the established multiview_mask field-set pattern but haven't been flipped yet: wireframe (own pass in bevy_pbr/src/wireframe.rs), deferred lighting fullscreen (own pass in bevy_pbr/src/deferred/mod.rs), OIT resolve fullscreen (own pass in bevy_core_pipeline/src/oit/resolve/). Each is ~25-40 lines of diff using the existing pattern.

Migration-guide entries for breaking API changes. The branch introduces three breaking changes that would need release-content/migration-guides/ entries for real merge:

  • ViewUniforms.uniforms flipped from DynamicUniformBuffer<ViewUniform> to DynamicArrayUniformBuffer<ViewUniform>. Downstream readers need updating.
  • RenderPipelineDescriptor gained a multiview_mask: Option<NonZeroU32> field. Out-of-tree descriptor literals need to add it or switch to ..default().
  • WGSL view binding shape changed from view: View to view_array: array<View, MAX_VIEW_COUNT> under MULTIVIEW, accessed via a new view() helper. Custom-material WGSL referencing view directly will need to switch. The Material trait docstring and mesh_view_bindings.wgsl comment block (both added by this branch) document the canonical pattern.

Pointers for a merge-quality follow-up

For anyone picking this up to land merge-ready: a few specifics worth lifting from the branch rather than re-deriving.

The per-layer attachment-lifecycle latch (F1 + F2 commits). The per-layer accessor API (get_attachment_for_layer on ColorAttachment / DepthAttachment) needs per-layer is_first_call slots, with two subtle interactions worked out across F1 (e186c0354) and F2 (d92e55099): tracking per-layer first-call state, then seeding it from the global latch when a per-layer dispatcher follows a non-per-layer pass that already flipped the global. The commit bodies trace the discovery; both bugs are easy to re-introduce if the API is rewritten from scratch.
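
As a rough illustration of the two interactions described above, here is a minimal sketch of a per-layer latch that is seeded from the global one. `AttachmentLatch`, `load_op`, `load_op_for_layer`, and the local `LoadOp` enum are hypothetical stand-ins, not the branch's `ColorAttachment` / `DepthAttachment` API, and the reverse ordering (a global accessor following per-layer passes) is deliberately not modeled.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Local stand-in for the clear-vs-load decision (not wgpu's LoadOp).
#[derive(Debug, Clone, Copy, PartialEq)]
enum LoadOp {
    Clear,
    Load,
}

/// Hypothetical latch state for one attachment with several array layers.
struct AttachmentLatch {
    /// True until the whole-texture accessor has been used once.
    global_first_call: AtomicBool,
    /// One slot per layer; true until that layer has been used once.
    layer_first_call: Vec<AtomicBool>,
}

impl AttachmentLatch {
    fn new(layers: usize) -> Self {
        Self {
            global_first_call: AtomicBool::new(true),
            layer_first_call: (0..layers).map(|_| AtomicBool::new(true)).collect(),
        }
    }

    /// Whole-texture accessor: clear on first use, load afterwards.
    fn load_op(&self) -> LoadOp {
        if self.global_first_call.swap(false, Ordering::SeqCst) {
            LoadOp::Clear
        } else {
            LoadOp::Load
        }
    }

    /// Per-layer accessor: each layer tracks its own first call, but a
    /// layer never clears once the global latch has already been flipped.
    fn load_op_for_layer(&self, layer: usize) -> LoadOp {
        let layer_first = self.layer_first_call[layer].swap(false, Ordering::SeqCst);
        let global_untouched = self.global_first_call.load(Ordering::SeqCst);
        if layer_first && global_untouched {
            LoadOp::Clear
        } else {
            LoadOp::Load
        }
    }
}
```

In this sketch, dropping the per-layer slots would make the second eye load uninitialized contents, and dropping the seeding from the global latch would make a per-layer pass clear what an earlier whole-texture pass drew.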

MSAA + multiview depth-texture carve-out. WGSL has no texture_depth_multisampled_2d_array. The branch resolves this by gating the MULTIVIEW shader-def push on view_count > 1 && !is_msaa in any depth-binding pipeline; mesh_view_bindings.wgsl:99-106 documents the established pattern. Volumetric fog, atmosphere render-sky, and DoF all consume this carve-out. Under MSAA + multiview the depth binding stays single-layer; no in-tree camera triggers the combo.

Pipeline-vs-pass multiview_mask agreement. wgpu requires the pipeline descriptor's multiview_mask and the pass descriptor's multiview_mask to match on each draw. Every per-pass conversion sets both with the same view_count > 1 predicate and the same shift-safe formula NonZeroU32::new(u32::MAX >> (32 - view_count)); commit bodies note when one side derives from a pipeline key field and the other from a runtime query, with the chain that ensures the inputs agree.
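
For reference, the mask derivation can be sketched as a single helper shared by both sides. The function name `multiview_mask` and its `u32` argument are assumptions made for illustration; the branch computes the value inline at each conversion site.

```rust
use std::num::NonZeroU32;

/// wgpu's multiview mask is a u32 with one bit per view, hence the cap of 32.
const MAX_VIEW_COUNT: u32 = 32;

/// Hypothetical helper: a mask with the low `view_count` bits set, or `None`
/// when multiview should stay disabled.
fn multiview_mask(view_count: u32) -> Option<NonZeroU32> {
    // Same predicate on the pipeline side and the pass side: a single view
    // (or an out-of-range count) means no multiview.
    if view_count <= 1 || view_count > MAX_VIEW_COUNT {
        return None;
    }
    // `(1 << view_count) - 1` would overflow at view_count == 32; shifting
    // u32::MAX right stays in range for every count in 2..=32.
    NonZeroU32::new(u32::MAX >> (32 - view_count))
}
```

Feeding the same `view_count` through one function on both the pipeline descriptor and the pass descriptor is one way to make the agreement hold by construction rather than by convention.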

Dispatcher inventory before broadcasting a shared pass. Broadcasting a shared 3D render pass requires every dispatcher landing draws in that pass to be feature-safe under multiview. A grep of add_render_command::<Phase, Cmd> enumerates the in-tree dispatchers per pass — most shared 3D passes turned out to have a single dispatcher (PBR MeshPipeline or PrepassPipelineSpecializer), simplifying the conversion. The few exceptions (Transparent3d has two; InfiniteGrid in bevy_dev_tools is not view-binding-converted) are delimited in the followups above.

Commit log as a dependency-ordered build. The branch was developed in incremental layered stages (L1 through L7d in the commit titles), with each layer's prerequisites landing before its dependents. L1-L4 add foundational types (multiview_mask field on RenderPipelineDescriptor, packed view storage, Multiview component, view-uniform packing); L5-L7 convert per-pipeline view bindings; L7d flips per-pass broadcast. Reading commits in series walks through the architectural decisions in dependency order; the layer naming is a navigational aid, not standard Bevy terminology.

License + closing

All commits are offered under the repo's existing Apache-2.0/MIT dual license — take what's useful, rewrite in whatever style you prefer.

If this isn't a useful direction, please close the PR without hesitation; the branch stays at https://github.com/bigmark222/bevy/tree/multiview-reference as a reference regardless. With thanks to @awtterpip for the original draft work that this builds on.

bigmark222 and others added 30 commits May 23, 2026 20:13
Mirrors the underlying wgpu field. Previously the pipeline cache
hard-coded multiview_mask: None on the raw wgpu descriptor, so
multiview pipelines could not be built through Bevy's normal
pipeline machinery. All existing construction sites default to
None via Default.

Foundational for multiview (single-pass stereo) rendering;
relates to bevyengine#15864 and the prior effort in bevyengine#16059.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A new storage type that wraps DynamicUniformBuffer to pack many
runtime-sized arrays of T into one uniform buffer, padded out to
the length of the largest array so the WGSL side can read them
as array<T, N> where N is the max length.

This is the host-side companion to multiview-style view bindings,
where each camera contributes a small array of per-view uniforms
(one element per eye / cubemap face / shadow cascade) into a
single bound uniform.
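
The padding idea can be sketched on plain vectors. `PaddedArrayPacker` is a hypothetical illustration, not the `DynamicArrayUniformBuffer` API; it works in element counts and ignores the byte alignment a real dynamic uniform offset would need.

```rust
/// Hypothetical host-side packer: queues variable-length arrays, then pads
/// each one to the longest length so every slot has the same stride.
struct PaddedArrayPacker<T> {
    queued: Vec<Vec<T>>,
}

impl<T: Clone + Default> PaddedArrayPacker<T> {
    fn new() -> Self {
        Self { queued: Vec::new() }
    }

    /// Stage one array and return its slot index.
    fn push(&mut self, values: Vec<T>) -> usize {
        self.queued.push(values);
        self.queued.len() - 1
    }

    /// Flatten all staged arrays. Returns the padded data and the stride in
    /// elements; slot `i` starts at element `i * stride`.
    fn finish(&self) -> (Vec<T>, usize) {
        // The stride is only known once every array has been queued.
        let stride = self.queued.iter().map(Vec::len).max().unwrap_or(0);
        let mut flat = Vec::with_capacity(stride * self.queued.len());
        for values in &self.queued {
            flat.extend(values.iter().cloned());
            // Pad short arrays with default elements up to the stride.
            flat.extend(std::iter::repeat(T::default()).take(stride - values.len()));
        }
        (flat, stride)
    }
}
```

With a one-element array and a two-element array queued, the stride is 2 and the first slot gains one padding element, which is what lets the shader side declare a fixed-length `array<T, N>`.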

Also makes batched_uniform_buffer::MaxCapacityArray pub(crate) so
the new module can reuse the same encase shim.

Based on the design in bevyengine#16059 by @awtterpip. Relates to bevyengine#15864.

Co-Authored-By: Piper <awtterpip@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Multiview lets a camera render to multiple layers of its render
target texture array in a single draw pass — the foundation for
VR / XR stereo rendering. Each MultiviewSubview specifies a
per-eye view_from_camera offset and clip_from_view projection;
the camera's own GlobalTransform remains the head pose used for
sort distance and frustum culling.

Mirrors the per-eye data into a new ExtractedMultiview component
on the render-world entity. Subsequent layers will read this to
pack the view uniform array, allocate N-layer render targets,
and emit multiview pipelines.

Holds with a single render-world entity per camera because
multiview rendering is by definition single-pass: per-eye phase
items don't fit the model, since one multiview pipeline draw
emits to all layers via @builtin(view_index). This departs from
the reverted ExtractedViews { views: Vec<_> } shape in bevyengine#16059,
which fought the existing single-view ExtractedView API.

Relates to bevyengine#15864.

Co-Authored-By: Piper <awtterpip@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switches the view uniform buffer from DynamicUniformBuffer to
DynamicArrayUniformBuffer so each render-world view entity
contributes an array of per-subview uniforms rather than a single
ViewUniform. Non-multiview cameras produce single-element arrays
(no behavioral change for existing shaders, which still read the
first element); multiview cameras produce one element per layer.

Adds set_label, add_usages, and IntoBinding to
DynamicArrayUniformBuffer so the existing ViewUniforms wiring
(storage usage when supported, bind-group entries via
IntoBinding) keeps working.

prepare_view_uniforms now runs in two passes because the dynamic
offset stride isn't known until all arrays are queued: the first
pass stages per-view arrays, then finish_queuing + write_buffer,
then the second pass attaches the resolved ViewUniformOffset.
Shared per-camera state (viewport, frustum, lod_view_world_position)
is hoisted out of the per-subview ViewUniform construction.

Sets up L6, which switches the WGSL view binding to
`array<View, MAX_VIEW_COUNT>` so shaders can read per-layer data
indexed by @builtin(view_index).

Based on the design in bevyengine#16059 by @awtterpip. Relates to bevyengine#15864.

Co-Authored-By: Piper <awtterpip@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `finish_queuing_assigns_offsets` test contained a tautological
`is_none() || is_some()` assertion that always passed; replace it with
an honest one and add a separate test for the pre-finish_queuing case.

The `binding()` doc claimed `None` is returned only when `finish_queuing`
hasn't been called, but it also returns `None` until `write_buffer` has
allocated the underlying GPU buffer. Document both conditions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Multiview component previously had no bounds check — extraction
inserted ExtractedMultiview for any non-None component, including the
empty-views case (which the doc claimed should be ignored) and the
>32-views case (which wgpu's u32 multiview_mask can't even represent).
view_mask() also had a stale comment claiming the 32 cap was enforced
at extraction time.

- Add MAX_VIEW_COUNT (= 32) constant alongside Multiview.
- view_mask() now returns None for views.len() > MAX_VIEW_COUNT.
- extract_cameras treats empty views as "no multiview" and warns
  once + falls back to non-multiview for >MAX_VIEW_COUNT.
- Document the contract on Multiview itself.
- Add unit tests for view_mask boundaries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous comment claimed multiview subviews ignored
`extracted_view.clip_from_world`, but the code below it still falls
back to that field via `unwrap_or_else`. Rather than change the
behavior (which only affects the unusual combination of multiview
with a manually-set override), describe what the code actually does
and call out the override-plus-multiview combo as undefined.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
For cameras with an `ExtractedMultiview` component, `prepare_view_targets`
now sizes the main texture (both ping-pong attachments and the MSAA
sampled attachment) with one array layer per subview, instead of always
1. Cameras with no `Multiview` are unchanged — `view_count` falls back
to 1 and the default texture view still spans the single layer.

The texture array count is also added to `MainTextureKey` so cameras
targeting the same window/format but with different layer counts don't
clobber each other in the per-frame texture cache.

`ViewTarget` carries the layer count as `main_texture_array_layers` and
exposes it as `multiview_count() -> Option<NonZeroU32>`, returning
`None` for single-layer cameras. This gives downstream systems a
frame-stable, render-side source for the multiview view count (useful
for the upcoming `MAX_VIEW_COUNT` shader def, which can't be derived
from the view-uniform buffer's capacity on frame 0).
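
A minimal sketch of that accessor shape, using a hypothetical reduced struct `ViewTargetSketch` rather than the real `ViewTarget`; `layer_count_or_one` is likewise an illustrative helper that does not appear in the branch.

```rust
use std::num::NonZeroU32;

/// Hypothetical reduced stand-in carrying only the layer count.
struct ViewTargetSketch {
    main_texture_array_layers: u32,
}

impl ViewTargetSketch {
    /// `Some(n)` only for multi-layer targets; single-layer cameras get `None`.
    fn multiview_count(&self) -> Option<NonZeroU32> {
        NonZeroU32::new(self.main_texture_array_layers).filter(|n| n.get() > 1)
    }

    /// Convenience for callers that just want a layer count, defaulting to 1.
    fn layer_count_or_one(&self) -> u32 {
        self.multiview_count().map_or(1, NonZeroU32::get)
    }
}
```

Returning `None` for one layer lets callers branch on multiview with a plain `if let Some(count)` instead of comparing against 1 at every site.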

`TextureDimension::D2` is kept for the texture itself — wgpu allows
multi-layer D2 textures, and the `D2Array` view binding is a shader-side
concern handled in a later layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`mesh_view_bindings::view` (and the parallel prepass view layout) now bind
an `array<View, MAX_VIEW_COUNT>` with a 1-element fallback when the shader
def is undefined. A `view()` helper returns the current view via
`view_array[current_view_index]`, where `current_view_index` is a
`var<private>` defaulting to 0; multiview entry points will overwrite it
from `@builtin(view_index)` in a follow-up.

All current readers are mechanical rewrites from `view.field` to
`view().field`. With `MAX_VIEW_COUNT` still unemitted by any pipeline, the
fallback `array<View, 1>` path matches the single ViewUniform packed by
`DynamicArrayUniformBuffer`, so non-multiview rendering is unchanged.

Verified via 3d_scene screenshot smoke test (blue cube + circular plane +
shadow, matches session-3 baseline) plus the standard unit suites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`MeshPipelineKey` gains a 6-bit `MAX_VIEW_COUNT` field that encodes the
camera's multiview layer count (1..=32). `check_views_need_specialization`
and `check_prepass_views_need_specialization` both OR in the count from
the camera's `ExtractedMultiview` component (1 when absent).
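
The field needs six bits because 32 is `0b100000`; five bits top out at 31. A sketch of the encode/decode round trip on a plain `u64` follows. The shift value and the function names are assumptions for illustration only, since the real bit position inside `MeshPipelineKey` is not shown here.

```rust
/// Hypothetical bit position for the field inside a u64 key.
const MAX_VIEW_COUNT_SHIFT: u64 = 40;
/// Six bits hold 0..=63, which covers the 1..=32 range of view counts.
const MAX_VIEW_COUNT_BITS: u64 = 0b11_1111;

/// Store `view_count` in the key, replacing any previous value in the field.
fn with_view_count(key: u64, view_count: u32) -> u64 {
    debug_assert!((1..=32).contains(&view_count));
    let cleared = key & !(MAX_VIEW_COUNT_BITS << MAX_VIEW_COUNT_SHIFT);
    cleared | (((view_count as u64) & MAX_VIEW_COUNT_BITS) << MAX_VIEW_COUNT_SHIFT)
}

/// Read the count back. A key that never had the field set decodes as 0,
/// which this sketch treats as a single view.
fn decode_view_count(key: u64) -> u32 {
    let raw = ((key >> MAX_VIEW_COUNT_SHIFT) & MAX_VIEW_COUNT_BITS) as u32;
    raw.max(1)
}
```

Clearing the field before OR-ing keeps the helper safe to call twice on the same key; a bare OR would merge the bits of two different counts.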

When the encoded count is >1, `MeshPipeline::specialize` and
`PrepassPipeline::specialize` push both the `MULTIVIEW` flag and the
`MAX_VIEW_COUNT` UInt def, switching the WGSL view binding to the
`array<View, N>` shape and enabling the `@builtin(view_index)` paths in
the entry points. Non-multiview cameras emit neither def and continue to
hit the `array<View, 1>` fallback, preserving the existing render path.

The mesh + prepass vertex/fragment entry points now accept
`@builtin(view_index)` under `#ifdef MULTIVIEW` and assign it to
`bevy_pbr::mesh_view_bindings::current_view_index` at the top of the
function body, so all downstream helpers automatically read the correct
per-eye view via `view()`.

Verified via 3d_scene screenshot smoke test (non-multiview path
unchanged) plus the standard unit suites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L6's view binding rename (`view` → `view_array`) introduced a `view()`
helper at the old name, so files that imported `view` as a single symbol
and accessed `view.field` now resolve `view` to the function and fail to
parse. Three readers were missed by C1's mechanical rewrite:

- `bevy_core_pipeline::oit::oit_draw` (`view.viewport.z`)
- `bevy_dev_tools::debug_overlay` (`view.viewport.zw`)
- `bevy_pbr::meshlet::visibility_buffer_resolve` (`view.viewport.zw`,
  `view.world_position` ×3)

Reproduced with the `order_independent_transparency` example before this
change: shader fails with `expected variable access, found
"bevy_pbr::mesh_view_bindings::view"`. After this change the same example
renders the expected three transparent spheres.

Out-of-scope status is unchanged: these crates still don't get
`@builtin(view_index)` plumbing (deferred to L7+) and continue to hit
the fallback `array<View, 1>` path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L6 wired `@builtin(view_index)` into four entry points
(`mesh.wgsl::vertex`, `pbr.wgsl::fragment`, `prepass.wgsl::vertex`,
`prepass.wgsl::fragment`) but missed two more in the same pipeline trees
that also call `view()`:

- `pbr_prepass.wgsl::fragment` (both branches). This is
  `StandardMaterial`'s `PrepassFragmentShader` override, so every PBR
  mesh in the prepass path runs it. The `#ifdef PREPASS_FRAGMENT` branch
  reads `view().mip_bias` and reaches `view().unjittered_clip_from_world`
  via `pbr_prepass_functions::calculate_motion_vector`; the `#else`
  branch reads `view().mip_bias` via
  `prepass_sample_color_and_alpha_discard`. Without plumbing, a
  multiview prepass would compute motion vectors against view[0] for
  both eyes — visible motion-blur / TAA artifacts.
- `wireframe.wgsl::vertex` (WIREFRAME_WIDE path). Reads
  `view().viewport.zw` to compute screen-space line widths; without
  plumbing the second eye would use view[0]'s viewport.

`mesh.wgsl::fragment` and `wireframe.wgsl::fragment` don't read `view`,
so they don't need plumbing even when compiled with the multiview key.

Verified via 3d_scene + wireframe screenshot smoke tests (both render
unchanged) and the standard unit suites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Foundation for L7a. `bevy_core_pipeline::input_texture` declares a
binding-switched `input_texture` (`texture_2d<f32>` /
`texture_2d_array<f32>` under `#ifdef MULTIVIEW`) at `@group(0)
@binding(0)`, plus four sampling helpers covering the texture API
surface used by the fullscreen post-fx pipelines L7a will convert
(blit, bloom, FXAA):

- `sample_input(s, uv)` — basic `textureSample`.
- `sample_input_offset(s, uv, offset)` — `textureSample` with a constant
  pixel offset (bloom's 13-tap downsample kernel).
- `sample_input_level(s, uv, level)` — `textureSampleLevel` (FXAA reads
  at LOD 0).
- `sample_input_level_offset(s, uv, level, offset)` — `textureSampleLevel`
  with a pixel offset (FXAA's neighborhood luma samples).

Each helper hides the `texture_2d_array` `array_index` argument under
MULTIVIEW, sourced from a shared `var<private> current_view_index: i32 =
0;` that consumers overwrite from `@builtin(view_index)` at the top of
their fragment entry points (the same convention `bevy_pbr::
mesh_view_bindings` uses for `view()`).

The sampler is passed as a parameter so each pipeline can keep its
local sampler binding (`s` in bloom, `samp` in FXAA, etc.) without
renaming. Only the texture binding name is standardized to
`input_texture` across consumers.

This commit only registers the shader library via `load_shader_library!`;
no pipeline imports it yet. Subsequent L7a commits convert blit, bloom,
and FXAA to use it.

Verified: `cargo check --workspace` clean; render/camera/pbr unit
suites pass (12 + 43 + 2); 3d_scene screenshot smoke test matches the
session-2/4/5 baseline (non-multiview path unchanged, as nothing
imports the new library yet).

Co-Authored-By: awtterpip <awtterpip@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WGSL requires the `offset` operand of `textureSample` and
`textureSampleLevel` to be a const-expression. C0's
`sample_input_offset` and `sample_input_level_offset` helpers took
`offset: vec2<i32>` as a runtime function parameter and forwarded it to
the underlying `textureSample*`, which fails naga validation:

    error: this operation is not supported in a const context
       ┌─ embedded://bevy_core_pipeline/input_texture.wgsl:33:48

The validation error tanks the whole shared module, so as soon as any
pipeline imports `bevy_core_pipeline::input_texture` (initially the blit
pipeline in C1), every consumer fails to load — visible as a blank
swapchain on macOS because the upscaling node falls through to the
empty render-pass branch when its pipeline isn't ready.

Reproduced by running 3d_scene against C1 with the offset variants
still present: black screen + the naga error in the log. After this
commit C1 verifies cleanly.

Const-offset sampling can't be helpered in WGSL; the offset must be a
literal at the callsite. The two pipelines that use it (bloom's 13-tap
downsampling kernel; FXAA's neighborhood luma samples) will instead
`#ifdef MULTIVIEW` at the callsite — awtterpip's original convention.
Documented in the helper file's comment block.

C0's claim that the file covers "the texture API surface used by the
fullscreen post-fx pipelines L7a will convert" is now narrower than
written: it covers the non-offset subset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`BlitPipelineKey` gains a `multiview_view_count: u32` field encoding
the source texture's layer count (1 for single-view, N for multiview
cameras after L5). `BlitPipeline` now stores two bind-group layouts —
the existing single-layer one plus `layout_multiview` whose texture
binding is `texture_2d_array` — and `create_bind_group` takes the
multiview count to pick between them.

`BlitPipeline::specialize` chooses the layout from
`key.multiview_view_count > 1` and emits `MULTIVIEW` + `MAX_VIEW_COUNT`
shader-defs into the fragment stage when so (mirroring L6's mesh-key
threading). The pipeline descriptor's own `multiview_mask` stays
`None`, matching L6: the shader machinery is in place but the
render-pass-level multiview enablement is deferred.

The two systems that currently build `BlitPipelineKey`
(`prepare_view_upscaling_pipelines` in `bevy_core_pipeline::upscaling`
and `prepare_msaa_writeback_pipelines` in
`bevy_post_process::msaa_writeback`) now query
`Option<&ExtractedMultiview>` on the camera and feed `subviews.len()`
(or 1 when absent) into the key.
Their render-pass nodes pass `target.multiview_count()` from the
already-L5-aware `ViewTarget` to `create_bind_group`, picking the
matching layout.

`blit.wgsl` switches from a locally-declared `in_texture` binding to
`#import bevy_core_pipeline::input_texture::{sample_input,
current_view_index}`. The `fs_main` entry point takes
`@builtin(view_index)` under `#ifdef MULTIVIEW` and assigns it to
`current_view_index` at the top of the body — same shape as L6's mesh
and prepass entry points.

This is the first consumer of the `bevy_core_pipeline::input_texture`
helper module added in C0 (`d12900fc2`).

Verified: `cargo check --workspace` clean; render/camera/pbr unit
suites pass (12 + 43 + 2); 3d_scene screenshot matches the session-2/
4/5 baseline (blue cube + circular plane + shadow). On macOS Metal the
multiview branch can't actually be exercised at runtime (no Vulkan
multiview), so the multiview render path rests on static reasoning until a
Vulkan host runs it; the non-multiview path is the verified one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Of bloom's four pipeline specializations (first downsample, regular
downsample, regular upsample, final upsample), only the first
downsample reads from the camera's main texture — every other pass
samples bloom's own mip pyramid, which `prepare_bloom_textures` builds
as a single-layer 2D texture irrespective of the camera. Multiview
specialization is therefore scoped to that one pass.

`BloomDownsamplingPipelineKeys` gains `multiview_view_count: u32`,
which `prepare_downsampling_pipeline` reads from the camera's optional
`ExtractedMultiview` (1 when absent) and threads only into the
`first_downsample = true` specialization; the `first_downsample =
false` specialization is locked to count = 1 so the regular downsample
pipeline continues to bind the bloom mip texture as `texture_2d`.

`BloomDownsamplingPipeline` now carries two layouts: the existing
`bind_group_layout` plus a `bind_group_layout_multiview` whose texture
binding is `texture_2d_array`. `specialize` picks the array layout +
emits `MULTIVIEW` and `MAX_VIEW_COUNT` shader-defs only when both
conditions hold. The bloom node's inline-built first-downsample bind
group picks the matching layout via `view_target.multiview_count()`.

`bloom.wgsl` switches from a locally-declared `input_texture` to
`#import bevy_core_pipeline::input_texture::{input_texture,
sample_input, current_view_index}`. The non-uniform-scale 13-tap path
and the 3x3 tent kernel use the `sample_input` helper (their offsets
are runtime uv arithmetic, not const operands of `textureSample`). The
`UNIFORM_SCALE` 13-tap path duplicates its 13 const-offset samples
under `#ifdef MULTIVIEW` because WGSL requires the `offset` operand of
`textureSample` to be a const-expression — it can't be threaded
through a helper. The duplication is contained to ~13 lines, gated on
the multiview path only.

`@builtin(view_index)` plumbing is added to the `downsample_first`
fragment entry point only (the `downsample` and `upsample` entry
points never see `MULTIVIEW`, so their `texture_2d` binding and
helper-less sampling shapes are preserved).

Verified: `cargo check --workspace` clean; render/camera/pbr unit
suites pass (12 + 43 + 2); both `bloom_3d` (multi-tap blur on
colored spheres) and `3d_scene` (no bloom, regression check) match
the session baseline. The multiview path rests on static reasoning until a
Vulkan host runs it; on macOS Metal the multiview branch can't be
exercised at runtime.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`FxaaPipelineKey` gains `multiview_view_count: u32`, read by
`prepare_fxaa_pipelines` from the optional `ExtractedMultiview` (1 when
absent). `FxaaPipeline` carries a second `texture_bind_group_multiview`
layout whose texture binding is `texture_2d_array`. `specialize` picks
the array layout + emits `MULTIVIEW` + `MAX_VIEW_COUNT` shader-defs
when count > 1; the descriptor's `multiview_mask` stays `None`, matching
L6 and L7a's other pipelines.

The FXAA node picks the matching layout via
`view_target.multiview_count()` when building the cached bind group.
The cache key (source `TextureViewId`) already invalidates across
multiview/non-multiview state changes — the texture view IDs differ
because the underlying `ViewTarget` swaps between single-layer and
array-layer textures.

`fxaa.wgsl` switches from a locally-declared `screenTexture` (renamed
to `input_texture`) to `#import bevy_core_pipeline::input_texture::
{input_texture, sample_input_level, current_view_index}`. Of the 12
`textureSampleLevel` callsites:

- 4 no-offset reads (center + 3 endpoint samples + final-uv read) and
  the in-loop endpoint refreshes now go through `sample_input_level`.
- 8 const-offset reads (4 cardinal-neighbor lumas + 4 corner lumas)
  duplicate under `#ifdef MULTIVIEW` because WGSL requires the
  `offset` operand to be a const-expression and can't be threaded
  through a helper.

`@builtin(view_index)` is plumbed into the `fragment` entry point
under `#ifdef MULTIVIEW` and assigned to `current_view_index` at the
top of the body.

Verified: `cargo check --workspace` clean; render/camera/pbr unit
suites pass (12 + 43 + 2); `anti_aliasing` example with FXAA enabled
on the helmet scene renders correctly (no artifacts, no validation
errors). The multiview branch rests on static reasoning until a Vulkan host
runs it; on macOS Metal only the non-multiview path is exercised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Foundation for L7b-read (PBR mesh screen-space texture conversion):
add `MULTIVIEW` to `MeshPipelineViewLayoutKey`, derive it from
`MeshPipelineKey::max_view_count() > 1` in the `From` impl, and OR it into
the per-frame layout key in `prepare_mesh_view_bind_groups` based on
`ViewTarget::multiview_count()`.

Switch binding 17 (`screen_space_ambient_occlusion_texture`) to
`texture_2d_array<f32>` under MULTIVIEW in `mesh_view_bindings.wgsl` and
in `layout_entries`. The underlying SSAO texture stays single-layer
(`depth_or_array_layers = 1`); the bind group entry creates a
`TextureViewDimension::D2Array` view of it inline when multiview is
active. Layer growth + per-eye SSAO writes are deferred to L7b-write.

Thread `current_view_index` into the two SSAO readers via a verbose
`#ifdef MULTIVIEW` branch around `textureLoad`, since WGSL's
`texture_2d` and `texture_2d_array` `textureLoad` signatures differ
(`array_index` goes between `coords` and `level`). `deferred_lighting.wgsl::fragment`
also gains `@builtin(view_index)` + the assignment to
`current_view_index` that PBR's main fragment already has from L6.

Non-multiview path is bit-identical (smoke verified on the `ssao`
example against pre-C1 baseline; 3d_scene regression preserved).
The multiview branch is unexercised on macOS Metal but the layout/view
shapes line up with the WGSL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-read C2 — depth, normal, motion-vector, and deferred prepass texture
bindings (binding indices 20-23) switch to `_array` variants under
`#ifdef MULTIVIEW` in `mesh_view_bindings.wgsl`. Host-side
`prepass::get_bind_group_layout_entries` learns the same switch and
`prepass::get_bindings` grows two flags (`multiview_array` for bindings
20-22, `deferred_multiview` for binding 23) so the caller can request
`D2Array` views of the still-single-layer prepass textures. WGSL has no
multisampled-array texture types, so MSAA + multiview keeps the
single-layer multisampled shape (rare combo in practice).

WGSL consumers thread `current_view_index` as the array layer in their
`textureLoad`/`textureSampleLevel` calls via `#ifdef MULTIVIEW`
duplication:

- `bevy_pbr::prepass_utils` (depth/normal/motion-vector reads)
- `bevy_pbr::pbr_deferred_functions` indirectly via
  `bevy_pbr::deferred_lighting::fragment` (deferred read at line 49) —
  the fragment also gains `@builtin(view_index)` + the assignment to
  `current_view_index` for its own deferred read and the SSAO read
  introduced in C1.
- `bevy_pbr::ssr` (SSR fragment) + `bevy_pbr::raymarch` (depth fetch
  and bilinear/nearest sample helpers).
- `bevy_dev_tools::debug_overlay` (7 textureLoad sites across
  depth/normal/motion-vector/deferred preview modes). Pipeline
  specialize now also pushes the `MULTIVIEW` shader-def from the
  layout-key bit, paralleling the existing `MULTISAMPLED` emission.

`textureDimensions` calls work unchanged on array textures, so
`pbr_functions.wgsl::317` and `ssr.wgsl::103` stay as-is. The deferred
prepass texture is never multisampled, so its multiview switch is
unconditional on the MULTIVIEW bit.

Non-multiview path is bit-identical (smoke verified on `deferred_rendering`;
on `anti_aliasing` with TAA, enabled by temporarily adding `Msaa::Off` +
`TemporalAntiAliasing::default()` to the example camera and reverting
afterwards; and on the `3d_scene` regression). The multiview branch is
unexercised on macOS Metal, but the layout/view shapes line up with the WGSL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-read C3 — `view_transmission_texture` (binding 24) switches to
`texture_2d_array<f32>` under `#ifdef MULTIVIEW` in
`mesh_view_bindings.wgsl` and in the host-side bind group layout.

The two `textureSampleLevel` reads in `transmission.wgsl`
(`fetch_transmissive_background_non_rough` + the main spiral-tap
`fetch_transmissive_background`) duplicate under `#ifdef MULTIVIEW` to
pass `view_bindings::current_view_index` between `coords` and `level` —
WGSL's `texture_2d` vs `texture_2d_array` `textureSampleLevel`
signatures differ on the array-layer parameter, same as the
`textureLoad` pattern from C1/C2. Caller (`pbr_input_from_standard_material`)
is invoked from `pbr.wgsl::fragment`, which already sets
`current_view_index` from L6.

Bind-group construction in `prepare_mesh_view_bind_groups` creates a
fresh `TextureViewDimension::D2Array` view of the still-single-layer
transmission texture (or, when the camera has no transmission setup
this frame, of the `FallbackImageZero` texture) when `is_multiview`,
keeping the default_view path otherwise. Per-eye transmission writes
+ layer growth are deferred to L7b-write.

Non-multiview path is bit-identical (smoke verified on `transmission`
+ `3d_scene` regression). The multiview branch is unexercised on macOS
Metal, but the layout/view shapes line up with the WGSL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-read follow-up F1. Both `DeferredLightingLayout::specialize` and
`ScreenSpaceReflectionsPipeline::specialize` already derive their view
bind-group layout from a `MeshPipelineViewLayoutKey` that — as of L7b-read
C1 — gains the `MULTIVIEW` bit when `max_view_count > 1` via the
`From<MeshPipelineKey>` impl. Under multiview that picks the array-typed
layout (binding 17 `texture_2d_array<f32>`, binding 20 `texture_depth_2d_array`,
binding 23 `texture_2d_array<u32>`, etc.) but neither pipeline pushed the
`MULTIVIEW` shader-def, so their WGSL (`deferred_lighting.wgsl`,
`ssr.wgsl`, `raymarch.wgsl` — all modified by C1/C2) would compile against
the non-multiview `#else` branches in `mesh_view_bindings.wgsl` and
mismatch the layout's array texture types. Pipeline creation would fail
wgpu validation the moment a multiview camera fires.

Mirror the pattern the other consumers of `MeshPipelineViewLayoutKey`
already use:
- `DeferredLightingLayout::specialize` pushes `MULTIVIEW` from
  `key.max_view_count() > 1` (parallels `MeshPipeline::specialize` at
  `mesh.rs:3330` which has done this since L6).
- `ScreenSpaceReflectionsPipeline::specialize` pushes `MULTIVIEW` from
  `key.mesh_pipeline_view_key.contains(MULTIVIEW)` (parallels
  `RenderDebugOverlayPipeline::specialize` which C2 already updated).

Non-multiview path is bit-identical (no def pushed when the bit isn't
set; verified via `ssr`, `deferred_rendering`, and `3d_scene` smoke
screenshots against the pre-F1 baseline). The multiview branch is
unexercised on macOS Metal, but the layout/WGSL/views now agree shape-wise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-write C1 — first of three commits growing the screen-space texture
families converted by L7b-read (SSAO / prepass / transmission) from
`depth_or_array_layers = 1` to `view_count` and routing each eye's writes
into its own slice.

`prepare_ssao_textures` derives `view_count` from `Option<&ExtractedMultiview>`
(falls back to 1 with no component) and sets `depth_or_array_layers =
view_count` on all four SSAO textures (preprocessed depth + 5 mips, noisy,
output, depth_differences). Non-multiview cameras allocate single-layer
textures — bit-identical to the pre-L7b-write shape.

`SsaoBindGroups` keeps a single `common_bind_group` (the view-uniform
binding, which is per-camera not per-eye) but the three pipeline-specific
groups become `Vec<SsaoPerViewBindGroups>` — one entry per eye. Each
entry's storage views are explicit single-layer `D2` views
(`base_array_layer: eye`, `array_layer_count: Some(1)`, `dimension: D2`)
into the eye's slice of each SSAO texture. The prepass depth/normal reads
also become per-layer `D2` views of the prepass attachment textures when
multiview is active; the non-multiview branch keeps the existing
`prepass_textures.{depth,normal}_view()` helpers (which return
`default_view`, currently single-layer). Once L7b-write C2 lands the
prepass textures grow to `view_count` layers and the per-layer helper
closures here will index real per-eye data.

The `ssao` render-graph node wraps its three compute passes in a
`for per_view in &bind_groups.per_view` loop — for non-multiview cameras
this is one iteration with the same bind groups today's code already
built, so the dispatch shape is unchanged.

Out of scope for this commit (and the rest of L7b-write):
- SSAO's `@group(1) @binding(2) var<uniform> view: View` still reads the
  camera's "eye 0" view-matrix slot — for multiview both eyes get
  reconstructed against the head-pose camera matrices. Real per-eye
  matrices need either an L6-style array-binding rewrite of the SSAO
  shaders or per-eye dynamic offsets into the packed view-uniform
  buffer (DynamicArrayUniformBuffer slots are sized per-array, not
  per-element). Documented for L8/L9.
- Bind-group layouts stay `texture_storage_2d` / `texture_2d`. With
  per-eye single-layer `D2` views the existing layout fits without a
  layout-key MULTIVIEW bit; no SSAO WGSL changes either.

Non-multiview path verified bit-identical (smoke on the `ssao` example
matches the post-rebase baseline; AO shading on the room walls + sphere
unchanged). The multiview branch is unexercised on macOS Metal, but the
view dimensions match the layout entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7-skybox — converts the skybox pipeline's `View` uniform binding to the
same `array<View, MAX_VIEW_COUNT>` shape L6 introduced for the mesh +
prepass view binding. Punted from L7a in session 6 because the skybox
is structurally a view-uniform rewrite (typed `View` binding, cubemap,
fragment-side `view.X` reads) rather than an L7a-style texture-array
conversion.

`skybox.wgsl`:
- `var<uniform> view: View` becomes
  `var<uniform> view_array: array<View, #{MAX_VIEW_COUNT}>`, a fixed-size
  array whose length is set per pipeline by the `MAX_VIEW_COUNT` shader
  def, with a 1-element fallback when `MAX_VIEW_COUNT` is undefined.
- Adds `var<private> current_view_index: i32 = 0;` and a `view()` helper
  returning `view_array[current_view_index]`. Per-site rewrite: `view.X`
  → `view().X` (3 reads in `coords_to_ray_direction`, 1 in
  `skybox_fragment`).
- New `FragmentInput` struct so the fragment can pick up
  `@builtin(view_index)` under `#ifdef MULTIVIEW` and assign it to
  `current_view_index` at the top of the body. The vertex stage doesn't
  read `view` at all (it just generates a fullscreen triangle from
  `vertex_index`), so it stays untouched.

`skybox/mod.rs`:
- Layout entry switches from `uniform_buffer::<ViewUniform>(true)` to
  `uniform_buffer_sized(true, None)` — wgpu's binding-size check is then
  satisfied by both `array<View, 1>` and `array<View, MAX_VIEW_COUNT>`.
- `SkyboxPipelineKey` gains a `multiview_view_count: u32` field; the
  specialize pushes `MULTIVIEW` + `MAX_VIEW_COUNT` shader-defs when the
  count is `> 1` (mirrors `MeshPipeline::specialize` in `mesh.rs`).
- `prepare_skybox_pipelines` reads `Option<&ExtractedMultiview>` and
  sets `multiview_view_count` from `subviews.len()` (falls back to 1
  with no component). Source matches the L6 mesh + prepass convention.
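
The host-side rule in the list above (view count from the optional component, shader defs only when the count exceeds 1) can be sketched as follows. `ShaderDef`, `multiview_view_count`, and `multiview_shader_defs` are hypothetical names for illustration, and the sketch assumes `MAX_VIEW_COUNT` carries the view count as its value.

```rust
/// Hypothetical shader-def representation; not Bevy's `ShaderDefVal`.
#[derive(Clone, Debug, PartialEq)]
enum ShaderDef {
    Flag(&'static str),
    UInt(&'static str, u32),
}

/// View count from an optional list of subviews: 1 when the component is absent.
fn multiview_view_count(subviews: Option<&[&str]>) -> u32 {
    subviews.map_or(1, |s| s.len() as u32)
}

/// Defs pushed at specialization time. Nothing is pushed for a single view,
/// so the shader falls back to its 1-element array.
fn multiview_shader_defs(view_count: u32) -> Vec<ShaderDef> {
    let mut defs = Vec::new();
    if view_count > 1 {
        defs.push(ShaderDef::Flag("MULTIVIEW"));
        // Assumption: the def's value is the view count itself.
        defs.push(ShaderDef::UInt("MAX_VIEW_COUNT", view_count));
    }
    defs
}
```

Because the single-view case pushes no defs at all, the non-multiview pipeline keeps the same shader-def set it had before the conversion.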

`descriptor.multiview_mask` stays `None` — same deferral as L6/L7a/L7b.
The wgpu render-pass enablement is L7d's job. On non-multiview Vulkan
hosts the pipeline compiles into the existing single-view shape; on
multiview Vulkan hosts pipeline creation will still fail validation
because `@builtin(view_index)` is read without a `multiview_mask`.

Out of scope:
- Per-eye skybox sampling correctness (rotation, parallax) is the right
  shape now but unexercised on macOS Metal. The math in
  `coords_to_ray_direction` reads `view().view_from_clip` and
  `view().world_from_view` per eye, so once L7d enables single-pass
  multiview the skybox will produce per-eye correct rays.
- Skybox-prepass is a separate pipeline (in
  `skybox/prepass.rs` per awtterpip's design notes) and isn't shipped
  in current Bevy — no work needed here.

Non-multiview path verified bit-identical (smoke on the `skybox`
example with `--features bevy_ci_testing free_camera` matches the prior
baseline; Ryfjallet cubemap + red hut + ground render cleanly).
The multiview branch is unexercised on macOS Metal, but the binding /
layout / shader-def shapes line up with L6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7b-write C1 (`80f52dd9d`) grew the SSAO textures to `view_count` layers
and started creating per-layer `D2` views of the prepass depth/normal
attachment textures for each eye:

    let prepass_depth_view = is_multiview.then(|| {
        prepass_layer_view(
            &prepass_textures.depth.as_ref().unwrap().texture.texture,
            layer, "ssao_prepass_depth_layer_view",
        )
    });

`is_multiview` is derived from the SSAO textures' layer count, but the
prepass attachment textures themselves are still single-layer because C2
(prepass texture growth) is deferred. For `layer >= 1`, a view with
`base_array_layer: layer, array_layer_count: Some(1)` on a 1-layer
texture is therefore a wgpu validation error. C1's commit body noted that
"once C2 lands the per-layer helper closures here will index real
per-eye data", but missed that the intermediate state errors at view
creation rather than silently producing wrong content.
`prepare_ssao_bind_groups` runs unconditionally per frame, so any
multiview-configured camera with `DepthPrepass + NormalPrepass + SSAO`
would trip the validation regardless of pipeline-creation outcome.

Gate the per-layer prepass view creation on the prepass texture's
actual `depth_or_array_layers > 1`, not on the SSAO texture's. Until
C2 lands, both `prepass_depth_multilayer` and `prepass_normal_multilayer`
are `false` and all eyes fall back to
`prepass_textures.{depth,normal}_view()` (default_view, single-layer).
SSAO output for `eye >= 1` is then computed against eye-0 prepass data
— content-incorrect but matching the still-eye-0 read of `view: View`
documented in C1's body, and no crash. When C2 grows the prepass
textures, the flags auto-flip and the existing per-layer view code
becomes valid with no further SSAO-side change.
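
A minimal sketch of the gate, assuming the usual texture-view rule that `base_array_layer + array_layer_count` must not exceed the texture's layer count. `PrepassView`, `layer_range_is_valid`, and `select_prepass_view` are hypothetical names, not the functions in the commit.

```rust
/// Which view of the prepass texture an eye reads (hypothetical model).
#[derive(Debug, PartialEq)]
enum PrepassView {
    /// The existing single-layer `default_view`.
    DefaultView,
    /// An explicit one-layer `D2` view into the eye's slice.
    Layer { base_array_layer: u32, array_layer_count: u32 },
}

/// The range check that makes layer 1 of a 1-layer texture invalid.
fn layer_range_is_valid(texture_layers: u32, base_array_layer: u32, array_layer_count: u32) -> bool {
    base_array_layer
        .checked_add(array_layer_count)
        .map_or(false, |end| end <= texture_layers)
}

/// Gate on the prepass texture's own layer count, not the SSAO texture's.
fn select_prepass_view(prepass_texture_layers: u32, eye: u32) -> PrepassView {
    let prepass_multilayer = prepass_texture_layers > 1;
    if prepass_multilayer {
        PrepassView::Layer { base_array_layer: eye, array_layer_count: 1 }
    } else {
        // Every eye reads eye-0 data: content-incorrect, but no validation error.
        PrepassView::DefaultView
    }
}
```

With a 1-layer prepass texture, eye 1 gets `DefaultView`; once the texture has two layers, the same call returns the eye's own slice without any caller change.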

Non-multiview path unchanged: prepass textures stay 1-layer, flags
stay false, and no per-layer prepass views are ever created. `cargo check`
is clean; lib tests pass (12 + 43 + 2); ssao + skybox smoke screenshots
match the post-L7b-write-C1 baselines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7c — first pipeline in the L7c tail; converts the tonemapping
post-process pipeline's `View` uniform binding to the same
`array<View, MAX_VIEW_COUNT>` shape L6 introduced for the mesh + prepass
view binding and L7-skybox mirrored. Same L6-shape pattern, applied to
tonemapping's smaller surface (single `view.X` read; uses the existing
`FullscreenVertexOutput` so it follows blit's direct
`@builtin(view_index)` parameter rather than skybox's wrapper struct).

`tonemapping.wgsl`:
- `var<uniform> view: View` becomes
  `var<uniform> view_array: array<View, #{MAX_VIEW_COUNT}>`, a fixed-size
  array whose length is set per pipeline by the `MAX_VIEW_COUNT` shader
  def, with a 1-element fallback when `MAX_VIEW_COUNT` is undefined.
- Adds `var<private> current_view_index: i32 = 0;` and a `view()` helper
  returning `view_array[current_view_index]`.
- Single per-site rewrite: `view.color_grading` → `view().color_grading`.
- Fragment signature gains `@builtin(view_index) view_index: i32` as a
  direct parameter under `#ifdef MULTIVIEW` (mirrors `blit.wgsl::fs_main`).
  Body assigns `current_view_index = view_index;` at top under the same
  gate.

`tonemapping/mod.rs`:
- Layout entry 0 switches from `uniform_buffer::<ViewUniform>(true)` to
  `uniform_buffer_sized(true, None)` — wgpu's binding-size check is then
  satisfied by both `array<View, 1>` and `array<View, MAX_VIEW_COUNT>`.
  Import swap: drops `uniform_buffer` + `ViewUniform`, picks up
  `uniform_buffer_sized` and `ExtractedMultiview`.
- `TonemappingPipelineKey` gains a `multiview_view_count: u32` field;
  `specialize` pushes `MULTIVIEW` + `MAX_VIEW_COUNT` shader-defs when
  the count is `> 1` (mirrors `SkyboxPipeline::specialize` and
  `MeshPipeline::specialize`).
- `prepare_view_tonemapping_pipelines` reads
  `Option<&ExtractedMultiview>` and sets `multiview_view_count` from
  `subviews.len()` (falls back to 1 with no component). Source matches
  the L6 mesh + prepass + L7-skybox convention.

`descriptor.multiview_mask` stays `None` — same L6/L7a/L7b/L7-skybox
deferral. The wgpu render-pass enablement is L7d's job. On non-multiview
hosts the pipeline compiles into the existing single-view shape; on
multiview Vulkan hosts pipeline creation will still fail validation
because `@builtin(view_index)` is read without a `multiview_mask`.

Out of scope:
- `hdr_texture` (binding 1) stays single-layer `texture_2d<f32>`. The
  post-process source texture isn't part of L7b-write yet (only SSAO
  has shipped from L7b-write; C2 prepass + C3 transmission are punted),
  so the tonemapping input is still a single-layer view of
  `target.post_process_write().source`. When L7b-write grows the
  post-process color texture to per-eye layers, tonemapping's source
  binding will need its own L7b-read-shape conversion.
- LUT bindings 3 + 4 are unrelated to multiview — both stay
  `texture_3d<f32>` + sampler.
- The node-side `TonemappingBindGroupCache` shape is unchanged. Cache
  key is `(view_uniforms.buffer.id, source.id, lut.id)`; per-view
  selection inside the packed `DynamicArrayUniformBuffer` is still
  driven by `view_uniform_offset.offset` at `set_bind_group` call time,
  so cache hits remain correct across views.

Non-multiview path verified bit-identical: `cargo check --workspace`
clean; lib tests bevy_render 12/12, bevy_camera 43/43, bevy_pbr 2/2,
bevy_core_pipeline 0/0; smoke on `3d_scene --features bevy_ci_testing`
(default Camera3d → `hdr: true` + `Tonemapping::TonyMcMapface`, so the
tonemapping pipeline including the LUT path actually ran) renders blue
cube + shadow on circular ground cleanly. The multiview branch is
unexercised on macOS Metal, but the binding / layout / shader-def shapes
line up with L6 + L7-skybox.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7c — second pipeline in the L7c tail (after tonemapping); converts the
OIT resolve pipeline's `View` uniform binding to the same
`array<View, MAX_VIEW_COUNT>` shape L6 introduced for the mesh + prepass
view binding and L7-skybox + L7c-tonemapping mirrored. Smallest possible
L7c-shape diff — a single `view.X` read site (`view.viewport.z`) and an
already-local `FullscreenVertexOutput` struct so the fragment can pick
up `@builtin(view_index)` as a direct parameter (mirrors `blit.wgsl::fs_main`
and `tonemapping.wgsl::fragment`).

`oit_resolve.wgsl`:
- `var<uniform> view: View` becomes
  `var<uniform> view_array: array<View, #{MAX_VIEW_COUNT}>`, a fixed-size
  array whose length is set per pipeline by the `MAX_VIEW_COUNT` shader
  def, with a 1-element fallback when `MAX_VIEW_COUNT` is undefined.
- Adds `var<private> current_view_index: i32 = 0;` and a `view()` helper
  returning `view_array[current_view_index]`.
- Single per-site rewrite: `view.viewport.z` → `view().viewport.z`.
- Fragment signature gains `@builtin(view_index) view_index: i32` as a
  direct parameter under `#ifdef MULTIVIEW`. Body assigns
  `current_view_index = view_index;` at top under the same gate.

`oit/resolve/mod.rs`:
- Layout entry 0 switches from `uniform_buffer::<ViewUniform>(true)` to
  `uniform_buffer_sized(true, None)` — wgpu's binding-size check is then
  satisfied by both `array<View, 1>` and `array<View, MAX_VIEW_COUNT>`.
  Import swap: drops `uniform_buffer` + `ViewUniform`, picks up
  `uniform_buffer_sized` and `ExtractedMultiview`.
- `OitResolvePipelineKey` gains a `multiview_view_count: u32` field;
  `specialize_oit_resolve_pipeline` pushes `MULTIVIEW` + `MAX_VIEW_COUNT`
  shader-defs when the count is `> 1` (mirrors `SkyboxPipeline::specialize`,
  `TonemappingPipeline::specialize`, and `MeshPipeline::specialize`).
- `queue_oit_resolve_pipeline` reads `Option<&ExtractedMultiview>` from
  its camera query and sets `multiview_view_count` from `subviews.len()`
  (falls back to 1 with no component). The per-entity
  `cached_pipeline_id` HashMap already keys on the full
  `OitResolvePipelineKey`, so a non-multiview ↔ multiview transition on
  the same camera correctly evicts the stale cache entry and queues a
  fresh pipeline.
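
The eviction behaviour described in the last item can be illustrated with a small per-entity cache. This is a sketch under stated assumptions: `ResolveKey`, `PipelineCache`, and `get_or_queue` are hypothetical, entities and pipeline ids are modelled as plain `u32` values, and the real cache's shape may differ.

```rust
use std::collections::HashMap;

/// Hypothetical reduced pipeline key: only the field added by this commit.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ResolveKey {
    multiview_view_count: u32,
}

/// Per-entity cache remembering the full key alongside the pipeline id.
#[derive(Default)]
struct PipelineCache {
    per_entity: HashMap<u32, (ResolveKey, u32)>,
    next_pipeline_id: u32,
}

impl PipelineCache {
    /// Returns the cached id when the key is unchanged; otherwise queues a
    /// fresh pipeline and overwrites the stale entry.
    fn get_or_queue(&mut self, entity: u32, key: ResolveKey) -> u32 {
        if let Some(&(cached_key, id)) = self.per_entity.get(&entity) {
            if cached_key == key {
                return id;
            }
        }
        let id = self.next_pipeline_id;
        self.next_pipeline_id += 1;
        self.per_entity.insert(entity, (key, id));
        id
    }
}
```

Because the comparison covers the whole key, adding `multiview_view_count` to the key is enough for a mono to stereo switch on one camera to produce a new pipeline id.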

`descriptor.multiview_mask` stays `None` — same L6/L7a/L7b/L7-skybox/
L7c-tonemapping deferral. The wgpu render-pass enablement is L7d's job.
On non-multiview hosts the pipeline compiles into the existing
single-view shape; on multiview Vulkan hosts pipeline creation will
still fail validation because `@builtin(view_index)` is read without a
`multiview_mask`.

Out of scope:
- The `OitResolveBindGroup` is a single global Resource recreated each
  frame in `prepare_oit_resolve_bind_group`. With layout entry 0 now
  unbounded, the same bind group binds against both the multiview and
  non-multiview pipeline shapes. Per-camera selection inside the packed
  `DynamicArrayUniformBuffer` is still driven by `view_uniform.offset`
  at `set_bind_group` call time (`node.rs:82`).
- Storage buffer bindings 1-3 (`nodes`, `heads`, `atomic_counter`) are
  per-screen-pixel + per-pass state, unrelated to the per-view View
  uniform. Index math uses `view().viewport.z` (screen width), which is
  identical across eyes. Multiview correctness will require either
  partitioning the linked-list storage buffers per eye (heads array
  sized per-eye) or accepting that the OIT linked list is shared in
  screen space. Future L7b-write / L7d concern, not L7c-shape work.
- Optional group(1) depth bind group (`depth: texture_depth_2d`,
  conditional on `!DEPTH_PREPASS`) stays as a non-multiview `D2`
  texture view of `ViewDepthTexture` — same gate as the rest of the
  prepass depth-view path. When L7b-write grows the depth texture per
  eye, this binding will need its own per-layer view treatment.

Non-multiview path verified bit-identical: `cargo check --workspace`
clean; lib tests bevy_render 12/12, bevy_camera 43/43, bevy_pbr 2/2,
bevy_core_pipeline 0/0; smoke on `order_independent_transparency
--features bevy_ci_testing` renders three overlapping transparent
spheres (red/blue/green) with correct depth-sorted alpha blending and
the "Order Independent Transparency: On" toggle confirming the resolve
pipeline ran. The multiview branch is unexercised on macOS Metal, but the
binding / layout / shader-def shapes line up with L6 + L7-skybox +
L7c-tonemapping.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7c — third pipeline in the L7c tail (after tonemapping + OIT resolve);
converts the background motion vectors prepass pipeline's `View` uniform
binding to the same `array<View, MAX_VIEW_COUNT>` shape L6 introduced
for the mesh + prepass view binding and L7-skybox + L7c-tonemapping +
L7c-OIT-resolve mirrored. Two `view.X` read sites
(`view.world_from_clip`, `view.unjittered_clip_from_world`).

`background_motion_vectors.wgsl`:
- `var<uniform> view: View` becomes
  `var<uniform> view_array: array<View, #{MAX_VIEW_COUNT}>`, a fixed-size
  array whose length is set per pipeline by the `MAX_VIEW_COUNT` shader
  def, with a 1-element fallback when `MAX_VIEW_COUNT` is undefined.
- Adds `var<private> current_view_index: i32 = 0;` and a `view()` helper
  returning `view_array[current_view_index]`.
- Per-site rewrites: `view.world_from_clip` → `view().world_from_clip`;
  `view.unjittered_clip_from_world` → `view().unjittered_clip_from_world`.
- Fragment signature gains `@builtin(view_index) view_index: i32` as a
  direct parameter under `#ifdef MULTIVIEW` (mirrors `blit.wgsl::fs_main`
  and other L7c conversions). Body assigns
  `current_view_index = view_index;` at top under the same gate.
- `previous_view` binding at `@group(0) @binding(1)` is left untouched
  (see "Out of scope" below).

`background_motion_vectors.rs`:
- Layout entry 0 switches from `uniform_buffer::<ViewUniform>(true)` to
  `uniform_buffer_sized(true, None)` — wgpu's binding-size check is then
  satisfied by both `array<View, 1>` and `array<View, MAX_VIEW_COUNT>`.
  Layout entry 1 (`PreviousViewData` uniform) stays as the existing
  typed `uniform_buffer::<PreviousViewData>(true)`. Import swap: drops
  `ViewUniform`, adds `uniform_buffer_sized`, `ExtractedMultiview`, and
  `ShaderDefVal`.
- `BackgroundMotionVectorsPipelineKey` gains a `multiview_view_count: u32`
  field; `specialize` pushes `MULTIVIEW` + `MAX_VIEW_COUNT` shader-defs
  when the count is `> 1` (mirrors the other L7c pipelines).
- `prepare_background_motion_vectors_pipelines` reads
  `Option<&ExtractedMultiview>` from its camera query and sets
  `multiview_view_count` from `subviews.len()` (falls back to 1 with no
  component). Source matches the L6 mesh + prepass + L7-skybox +
  L7c-tonemapping + L7c-OIT-resolve convention.

`descriptor.multiview_mask` stays `None` — same L6/L7a/L7b/L7-skybox/
L7c-tonemapping/L7c-OIT-resolve deferral. The wgpu render-pass
enablement is L7d's job.

Out of scope:
- `previous_view: PreviousViewUniforms` binding (group 0 / binding 1).
  This is a per-camera (single-eye-equivalent) uniform sourced from
  `PreviousViewUniforms` resource — separate plumbing from the multiview
  `ViewUniform` packed array. Under multiview today, all eyes read the
  same `previous_view.clip_from_world`, which means eye-1's sky-pixel
  motion vector subtracts eye-0-derived previous clip-space from
  eye-1-derived current clip-space — incorrect for stereo VR where
  each eye has its own previous transform. Converting `PreviousViewData`
  to a packed-array per-camera shape parallels L4's
  `DynamicArrayUniformBuffer<ViewUniform>` work — future L8-style
  refactor that lives in `crates/bevy_core_pipeline/src/prepass/mod.rs`,
  not L7c-shape work. Today this is a non-issue because the multiview
  branch is unexercised on Metal anyway. Once L7d enables it on
  Vulkan, only sky-pixel motion vectors on eye>=1 are affected (TAA +
  motion blur on the background would smear slightly wrong for the
  second eye until L8 lands).
- The pipeline bind-group is per-camera (created in
  `prepare_background_motion_vectors_bind_groups`) and inserted as a
  `BackgroundMotionVectorsBindGroup` component. With layout entry 0 now
  unbounded, the same bind group binds against both pipeline variants.
  Per-eye selection inside the packed `DynamicArrayUniformBuffer` is
  driven by `view_uniform_offset.offset` at the node's `set_bind_group`
  call site (in `prepass/node.rs`, unchanged).
- Layout entry 1's `PreviousViewData` is sized to a single struct (not
  unbounded). The existing typed `uniform_buffer::<PreviousViewData>(true)`
  declaration is correct for the single-per-camera shape and doesn't
  need the L7c loosening treatment.

Non-multiview path verified bit-identical: `cargo check --workspace`
clean; lib tests bevy_render 12/12, bevy_camera 43/43, bevy_pbr 2/2,
bevy_core_pipeline 0/0; smoke on `motion_blur --features
bevy_ci_testing` renders the red car on the road with motion-blurred
trees + balls + blue sky background. The `MotionBlur` component
auto-requires `MotionVectorPrepass`, so the background motion vectors
pipeline ran on the sky pixels and TAA/motion-blur consumed the
resulting motion-vector attachment. The multiview branch is unexercised on
macOS Metal, but the binding / layout / shader-def shapes line up with
L6 + the other L7c conversions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the L6 mesh-path / L7c skybox + post-process shape but in
`bevy_pbr`: the two GPU clustering shaders (`cluster_z_slice.wgsl`
compute and `cluster_raster.wgsl` graphics) switch their `var<uniform>
view: View` binding to a fixed-size `var<uniform> view_array:
array<View, MAX_VIEW_COUNT>` (length set per pipeline by the shader def)
with a 1-element fallback when
`MAX_VIEW_COUNT` is undefined, plus a `var<private> current_view_index`
and a `view()` helper that indexes the array. Per-site `view.X` reads
become `view().X` (5 sites in z-slice, 8 in raster; the local `let view
= view_from_clip * clip` shadow in `clip_to_view` is preserved). Host
side, both pipeline layouts switch binding 6 to
`uniform_buffer_sized(true, None)` so the WGSL fallback `array<View,
1>` and the multiview `array<View, MAX_VIEW_COUNT>` both bind cleanly
against the existing per-camera `DynamicArrayUniformBuffer<ViewUniform>`
slot (the dynamic offset already in place keeps selecting the
per-camera array slot in the packed buffer).

`ClusteringRasterPipelineKey` gains `multiview_view_count: u32` and a
new `ClusteringZSlicingPipelineKey` replaces the unit key on the
z-slicing pipeline. Both specialize impls push a `MAX_VIEW_COUNT`
shader def when the count is `>1`. `prepare_clustering_pipelines` now
reads `Option<&ExtractedMultiview>` from the view query and threads the
subview count through to every clustering specialization.

Clustering reads `view()` at the default `current_view_index = 0` —
i.e. eye 0's head pose — unconditionally. This is deliberate: the
GPU clustering output is a single set of storage buffers per camera
shared across all eyes (`ViewGpuClusteringBuffers` +
`ViewClusteringBindGroups` are inserted once per `ExtractedView`), and
threading `@builtin(view_index)` into the fragment would diverge
cluster assignments across eyes for shared output buffers. Eye-1 of a
multiview camera therefore consumes a cluster grid built from eye-0's
view-matrix; the resulting eye-1 culling is slightly conservative for
objects near eye-1's frustum edges. Same shape as the L7b-write C1
SSAO compromise; per-eye clustering would require splitting the
clustering output buffers per eye, which is future work paired with
the prepass / L7d multiview-mask refactor.

`ClusteringAllocationPipeline` has no view binding (its layout binds
only the cluster offsets, lights, metadata, and scratchpad buffers)
and is untouched. The MULTIVIEW shader def is not pushed: it's only
needed to gate `@builtin(view_index)` plumbing in the WGSL, and
clustering doesn't thread view_index.

Non-multiview cameras get `multiview_view_count = 1` → no MAX_VIEW_COUNT
def → `array<View, 1>` fallback → bit-identical behavior to the
pre-conversion single-`View` binding.

Smoke: `3d_scene --features bevy_ci_testing` confirms GPU clustering
runs ("GPU clustering is supported on this device.") and the default
single-point-light scene renders correctly with the cube shadowed by
the clustered point light. `lighting --features bevy_ci_testing`
exercises the count + populate raster passes plus z-slicing across
multiple point and spot lights with colored floor projections; matches
the standard `lighting` example baseline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First sub-changes of the L7b-write C2 prepass refactor (A + B + C from
.scratch/session15_l7d_planning.md). No-op-shaped on non-multiview
cameras; new API is unused this commit — session 17 wires it into the
prepass / deferred render-graph nodes for per-eye dispatch.

A. Grow prepass + view-depth textures to per-eye layer count.

   `prepare_prepass_textures` and `prepare_core_3d_depth_textures` now
   read `Option<&ExtractedMultiview>` and derive
   `view_count = m.subviews.len()` (1 for non-multiview cameras, matching
   the SSAO C1 shape from 80f52dd). Each `TextureDescriptor` literal
   sets `size.depth_or_array_layers = view_count`:
   - prepass: depth_1, depth_2, normal, motion_vectors, deferred_1,
     deferred_2, deferred_lighting_pass_id (7 sites).
   - depth: view_depth_texture (1 site).
   `ViewPrepassTextures.size` flows through to the end-of-prepass-node
   depth copy; both source (view depth) and dest (prepass depth) now
   share the same per-eye layer count, so the copy is correct under
   multiview. On non-multiview cameras every descriptor stays
   single-layer → bit-identical to the pre-C2 shape.

B. Disambiguate `TextureCache` keys with `view_count`.

   Each of the 7 prepass HashMap keys and the 1 depth HashMap key gains
   `view_count`, so a multiview camera and a non-multiview camera sharing
   the same render target don't collide on a cached texture of the wrong
   layer count. Same shape as the existing `(camera.target, msaa)` key
   on the depth path.
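
A sketch of why the extra key field matters, using a plain `HashMap` rather than Bevy's `TextureCache`. `DepthTextureKey`, `CachedTexture`, and `get_or_create` are hypothetical; the render target and MSAA setting are reduced to integers for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical cache key: the existing (target, msaa) pair plus `view_count`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DepthTextureKey {
    target: u32,
    msaa_samples: u32,
    view_count: u32,
}

/// Hypothetical cached texture, reduced to the one property that can collide.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CachedTexture {
    depth_or_array_layers: u32,
}

/// Looks the texture up by key, allocating with `view_count` layers on a miss.
fn get_or_create(
    cache: &mut HashMap<DepthTextureKey, CachedTexture>,
    key: DepthTextureKey,
) -> CachedTexture {
    *cache
        .entry(key)
        .or_insert(CachedTexture { depth_or_array_layers: key.view_count })
}
```

Without `view_count` in the key, whichever camera ran first would decide the layer count for both cameras on a shared target.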

C. `ColorAttachment` / `DepthAttachment` per-layer view exposure.

   New `ColorAttachment::get_attachment_for_layer(layer)` and
   `get_unsampled_attachment_for_layer(layer)` synthesize per-layer `D2`
   `TextureView`s of the underlying (possibly multi-layer) texture and
   resolve target, lazily cached in `Arc<OnceLock<Vec<TextureView>>>`
   populated all-at-once on first per-layer access. Shared across
   `ColorAttachment` clones via `Arc`, matching the existing
   `is_first_call: Arc<AtomicBool>` cross-clone-sharing pattern.

   `DepthAttachment` gains an optional `multi_layer_texture: Option<Texture>`
   field and a new `DepthAttachment::new_multi_layer(texture, view, clear)`
   constructor that stores the underlying `Texture` handle so the new
   `get_attachment_for_layer(layer, store)` can build per-layer views.
   The existing `new(view, clear)` constructor is unchanged (leaves the
   field as `None`); session 17 will update `ViewDepthTexture::new` to
   thread the texture handle through. `get_attachment_for_layer` panics
   if called on a `new`-constructed attachment, since the underlying
   texture handle isn't available — the panic message names the
   `new_multi_layer` constructor.

   Both per-layer accessors preserve the existing first-call clear-vs-load
   semantics: ColorAttachment uses `Ordering::SeqCst` like its sampled
   sibling; DepthAttachment uses `Ordering::Relaxed` + `clear_value.unwrap()`
   like its sibling (`is_first_call` is still initialized from
   `clear_value.is_some()` so the unwrap is safe). For `view_count = 1`
   the per-layer-0 view is bit-identical to `default_view`.
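
The two mechanisms in section C, a lazily populated per-layer view list shared across clones and a clear-then-load latch, can be modelled with only the standard library. This is a sketch, not the code in `texture_attachment.rs`: `Attachment`, its fields, and the string "views" are hypothetical, and the per-layer latch follows the PR description's wording as one possible reading.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, OnceLock};

#[derive(Clone, Copy, Debug, PartialEq)]
enum LoadOp {
    Clear,
    Load,
}

/// Hypothetical attachment model; texture views are stood in for by strings.
#[derive(Clone)]
struct Attachment {
    layers: u32,
    // One latch per layer, shared across clones through the `Arc`.
    is_first_call: Arc<Vec<AtomicBool>>,
    // All per-layer views are built at once on the first per-layer access.
    layer_views: Arc<OnceLock<Vec<String>>>,
}

impl Attachment {
    fn new(layers: u32) -> Self {
        Self {
            layers,
            is_first_call: Arc::new((0..layers).map(|_| AtomicBool::new(true)).collect()),
            layer_views: Arc::new(OnceLock::new()),
        }
    }

    /// First use of a layer clears it; every later use loads.
    fn load_op_for_layer(&self, layer: u32) -> LoadOp {
        let first = self.is_first_call[layer as usize].fetch_and(false, Ordering::SeqCst);
        if first { LoadOp::Clear } else { LoadOp::Load }
    }

    /// Returns a reference that borrows from the shared cache.
    fn view_for_layer(&self, layer: u32) -> &str {
        let views = self.layer_views.get_or_init(|| {
            (0..self.layers).map(|l| format!("d2_view_layer_{l}")).collect()
        });
        &views[layer as usize]
    }
}
```

A clone sees the latch state left by the original, so a second pass over layer 0 through either handle gets `LoadOp::Load` while an untouched layer 1 still gets `LoadOp::Clear`.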

Notable shape decisions:
- D2 (per-eye dispatch) over D1 (broadcast / multiview_mask). Session-15
  plan locked this in; per-eye dispatch avoids exposing class-(b) atomic
  broadcast hazards in oit_draw / cluster raster fragment paths under
  L7d. Documented as a deviation from awtterpip's PR bevyengine#16059 design.
- The per-layer view cache lives behind `Arc<OnceLock<Vec<TextureView>>>`
  rather than per-call view creation so the returned `&TextureView`
  borrows from the attachment for the lifetime of the render-pass
  descriptor. ColorAttachment is re-constructed each frame in
  `prepare_prepass_textures` so the OnceLock is per-frame — no staleness
  risk if the underlying texture is recreated by a camera resize.
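
A minimal sketch of the cache shape described above, using only the standard library. `LayerView` and `Attachment` are hypothetical stand-ins for the real GPU-backed types; only the `Arc<OnceLock<Vec<_>>>` layout comes from the text.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for a per-layer `TextureView`.
#[derive(Debug, PartialEq)]
pub struct LayerView {
    pub layer: u32,
}

/// Hypothetical stand-in for an attachment with a shared per-layer view cache.
#[derive(Clone)]
pub struct Attachment {
    layer_count: u32,
    per_layer_views: Arc<OnceLock<Vec<LayerView>>>,
}

impl Attachment {
    pub fn new(layer_count: u32) -> Self {
        Self {
            layer_count,
            per_layer_views: Arc::new(OnceLock::new()),
        }
    }

    /// Builds every per-layer view on the first call, then returns a borrow
    /// that lives as long as `self`, so a pass descriptor can hold it.
    pub fn view_for_layer(&self, layer: u32) -> &LayerView {
        let views = self.per_layer_views.get_or_init(|| {
            (0..self.layer_count)
                .map(|layer| LayerView { layer })
                .collect()
        });
        &views[layer as usize]
    }

    /// True when both handles point at the same cache allocation.
    pub fn shares_cache_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.per_layer_views, &other.per_layer_views)
    }
}
```

Creating a fresh view on every call would return an owned value that the caller has to keep alive; the cached `Vec` lets the accessor return a reference instead, and clones of the attachment see the same views.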

Diff budget: ~180 net lines across 2 files (session-15 plan estimated
~80). Overhead is in `texture_attachment.rs` (~155 vs planned ~50):
both sampled + unsampled ColorAttachment variants, the shared
`build_per_layer_d2_views` helper, and per-method docstrings.

Verification:
- `cargo check --workspace` clean.
- Lib tests: bevy_render 12/12, bevy_camera 43/43, bevy_pbr 2/2.
- Screenshot smokes vs baseline:
  - `3d_scene --features bevy_ci_testing` — byte-identical to
    /tmp/post-rebase-3d_scene.png (697355 bytes).
  - `ssao --features bevy_ci_testing` — visually correct (AO shading on
    sphere/cube/room corners present); SSAO F1 multilayer-gate stays
    `false` because prepass textures are still single-layer on this
    non-multiview camera (view_count = 1).
  - `deferred_rendering --features bevy_ci_testing` — renders correctly
    with Deferred mode active (3 colored spheres, helmet bust, helmet
    with flame card all shaded). No baseline existed pre-C2; saved at
    /tmp/c2-abc-deferred.png for future comparison.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps the render-pass body in `run_prepass_system` and
`run_deferred_prepass_system` in `for eye in 0..view_count`,
swapping every color + depth `get_attachment()` for the
per-layer `get_attachment_for_layer(eye)` introduced in
`c315f71a3`. `view_count` is derived from the new
`Option<&ExtractedMultiview>` query field, mirroring SSAO C1's
shape from `80f52dd9d`; non-multiview cameras have
`view_count = 1` and exercise the per-layer path against the
existing single-layer prepass + depth textures.

`ViewDepthTexture::new` is upgraded to call
`DepthAttachment::new_multi_layer(...)` so per-eye depth
attachment access works at all. Without this the per-eye
dispatch would panic at first `get_attachment_for_layer` call
per session 16's panic-on-misuse design. `ViewDepthTexture`
also gains a thin `get_attachment_for_layer` delegate so the
attachment field can stay private.

The end-of-pass depth-copy blocks in both nodes stay outside
the eye loop, and each keeps a single
`copy_texture_to_texture` call: post-C2 sub-A, both source and
destination depth textures carry
`depth_or_array_layers = view_count`, so one call copies every
layer. The webgl `clear_texture` block in the deferred node
also moves outside the loop -- it operates on the whole
texture via the default `ImageSubresourceRange`, so one call
suffices.

This is C2 sub-changes D2 + F from session-15's L7b-write
plan. Together with `c315f71a3` (texture growth + per-layer
attachment API), this completes the C2 prepass refactor:
prepass + deferred render-graph nodes now dispatch per layer
of the multiview depth + prepass texture arrays. D2 (per-eye
dispatch) is the design choice over D1 (broadcast) recorded
in session 15; it's a deviation from awtterpip's PR bevyengine#16059
design that needs to surface in the eventual reference-PR
description.

Smoke (all on macOS Metal, `--features bevy_ci_testing`):
* `3d_scene` -- byte-identical (697355 bytes) to
  `/tmp/c2-abc-3d_scene.png` (forward-only, no DepthPrepass;
  strongest read-side no-op witness via the
  `ViewDepthTexture::new` upgrade).
* `deferred_rendering` -- byte-identical (3628497 bytes) to
  `/tmp/c2-abc-deferred.png` (deferred node per-eye dispatch
  no-op-shaped on view_count=1).
* `ssao` -- visually correct, depth+normal prepass active.
  Camera animation drifts so byte-match isn't expected.
* `motion_blur` -- visually correct, motion-vectors prepass
  active.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bigmark222 and others added 16 commits May 24, 2026 00:41
`ColorAttachment` and `DepthAttachment` previously shared a single
`is_first_call: Arc<AtomicBool>` across every per-layer attachment
access in a frame, so under per-eye dispatch (session 18,
`ac409d6a3`) only the first eye's call to
`get_attachment_for_layer` cleared its layer; subsequent eyes saw
`first_call = false` and emitted `LoadOp::Load` on a layer that
had never been written this frame. Result: layers 1..N depth-test
against undefined memory, and the per-layer color textures
(normal, motion_vectors, deferred, deferred_lighting_pass_id)
accumulate frame-over-frame leftover data instead of clearing.

The bug landed silently in `c315f71a3` (session 16) -- which
introduced the per-layer accessors -- but only became reachable
once session 18's per-eye dispatch made
`get_attachment_for_layer` a real per-layer caller. macOS Metal
has no wgpu multiview support so smoke can't reproduce it; this
hardening is forward-correctness for the per-eye dispatch path
on any wgpu backend that actually exercises `view_count > 1`.

Fix: lazily-populated `per_layer_first_call: Arc<OnceLock<Vec<AtomicBool>>>`
mirroring the `per_layer_views` cache shape, one slot per layer
of the underlying texture. Each `get_attachment_for_layer(layer)`
flips its own slot, and also flips the global `is_first_call` to
false so legacy `get_attachment` callers (e.g., main opaque /
transparent pass running after the prepass per-eye loop on
`ViewDepthTexture`) still load the per-layer-cleared depth
instead of re-clearing it. `mark_as_cleared` and `prepare_for_new_frame`
also flip every initialized per-layer slot.

Single-layer textures (`view_count = 1`) initialize a one-slot
vec on first per-layer access and stay bit-identical to the old
behavior: layer 0's slot starts true (matching the old global
init), gets flipped false on first call (matching old fetch_and),
and stays false (matching old single-shared-AtomicBool).
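
The difference between the shared latch and the per-layer latch can be sketched with plain atomics. This is an illustration under assumptions: `LoadOp` here is a simplified hypothetical enum (the real `LoadOp::Clear` carries a clear value), and both functions are made up for the example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Simplified, hypothetical stand-in for the load operation a pass would use.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum LoadOp {
    Clear,
    Load,
}

fn op_from(first_call: bool) -> LoadOp {
    if first_call {
        LoadOp::Clear
    } else {
        LoadOp::Load
    }
}

/// Old shape: one latch shared by every layer. Only the first layer clears.
pub fn ops_with_shared_latch(layer_count: usize) -> Vec<LoadOp> {
    let is_first_call = AtomicBool::new(true);
    (0..layer_count)
        .map(|_| op_from(is_first_call.fetch_and(false, Ordering::SeqCst)))
        .collect()
}

/// Fixed shape: one latch per layer. Every layer clears on its own first use.
pub fn ops_with_per_layer_latch(layer_count: usize) -> Vec<LoadOp> {
    let per_layer: Vec<AtomicBool> =
        (0..layer_count).map(|_| AtomicBool::new(true)).collect();
    per_layer
        .iter()
        .map(|slot| op_from(slot.fetch_and(false, Ordering::SeqCst)))
        .collect()
}
```

With two layers, the shared latch produces a clear followed by a load, while the per-layer latch produces two clears. With one layer both shapes agree, which is the no-op property the smoke tests rely on.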

Smoke (macOS Metal, `--features bevy_ci_testing`):
* `3d_scene` -- byte-identical (697355 bytes) to
  `/tmp/post-rebase-3d_scene.png` -- confirms read-side no-op on
  the depth attachment.
* `deferred_rendering` -- byte-identical (3628497 bytes) to
  `/tmp/c2-abc-deferred.png` -- confirms no-op on the deferred
  node's 4 per-layer color attachments + 1 per-layer depth
  attachment under `view_count = 1`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Audit follow-up A from `.scratch/session17_audit_planning.md` §4.1:
applies pattern (a) (binding-layout swap to D2Array under MULTIVIEW)
to the first `view_depth_texture.view()` consumer pipeline. Same
L7c shape as skybox / tonemapping / OIT-resolve / bg-motion-vectors,
adapted to the existing MULTISAMPLED + DENSITY_TEXTURE bind-group-
layout-key surface.

`volumetric_fog.wgsl`:
- `depth_texture` binding declaration nests `#ifdef MULTIVIEW` inside
  the non-multisampled branch — `texture_depth_2d_array` under
  `MULTIVIEW && !MULTISAMPLED`, `texture_depth_2d` otherwise. Same
  shape as the prepass-texture bindings in `mesh_view_bindings.wgsl`.
  Multisampled branch stays single-layer regardless of MULTIVIEW
  (see "MSAA + multiview" note below).
- Fragment signature gains `@builtin(view_index) view_index: i32`
  as a direct parameter under `#ifdef MULTIVIEW` (mirrors
  `blit.wgsl::fs_main` and other L7c conversions). Body assigns
  `current_view_index = view_index;` at top under the same gate so
  the existing `view()` helper (already used at lines 131/200/341
  per the L6 view-binding conversion in `04ed678bf`) reads the
  correct eye's `View`.
- Single `textureLoad` read site adds the `view_index` layer
  argument under `MULTIVIEW && !MULTISAMPLED`. Both multisampled
  branches and the non-multiview branch keep the existing 3-arg
  `textureLoad(_, frag_xy, 0)` shape.

`volumetric_fog/render.rs`:
- `VolumetricFogBindGroupLayoutKey` grows from 2 bits to 3 with a
  new `MULTIVIEW = 0x4` flag. The `VOLUMETRIC_FOG_BIND_GROUP_LAYOUT_COUNT`
  const (= `all().bits() + 1`) automatically grows to 8 layouts;
  the two MSAA+MULTIVIEW combinations are unreachable per the
  layout-key construction rules but cost nothing.
- Layout-build site (`init_volumetric_fog_pipeline`) gains a third
  branch: `texture_depth_2d_multisampled()` under MULTISAMPLED, then
  `texture_2d_array(TextureSampleType::Depth)` under MULTIVIEW, else
  `texture_depth_2d()`. Matches the helper shape used in
  `prepass_bindings.rs:32` and `mesh_view_bindings.rs:301`.
- `VolumetricFogPipelineKey` gains a `multiview_view_count: u32`
  field (default 1 for non-multiview). The `>1` value gates both
  the layout MULTIVIEW bit and the MULTIVIEW + MAX_VIEW_COUNT
  shader-def push in `specialize`.
- Render-graph node (`volumetric_fog`) ViewQuery gains
  `Option<&ExtractedMultiview>` and the bind-group-layout-key
  construction sets MULTIVIEW based on `view_count > 1 && !is_msaa`
  identically to specialize.
- `prepare_volumetric_fog_pipelines` query gains
  `Option<&ExtractedMultiview>` and threads
  `subviews.len()` (or 1 fallback) into the pipeline key. Source
  matches the L6 mesh + prepass + L7c convention.
- `bind_group_layout_description` adds the "multiview" name to the
  debug-label iter.
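
A rough sketch of the layout-key construction described in this list, written with plain integer constants instead of the real flags type. Only `MULTIVIEW = 0x4` and the `all().bits() + 1` count come from the text; the other bit values and the `layout_key` function are assumptions for illustration.

```rust
/// Assumed bit values; only `MULTIVIEW = 0x4` is stated in the text.
pub const MULTISAMPLED: u8 = 0x1;
pub const DENSITY_TEXTURE: u8 = 0x2;
pub const MULTIVIEW: u8 = 0x4;

/// Union of all flags, standing in for `all().bits()`.
pub const ALL: u8 = MULTISAMPLED | DENSITY_TEXTURE | MULTIVIEW;

/// Number of layout slots needed to index by key bits.
pub const LAYOUT_COUNT: usize = ALL as usize + 1;

/// Hypothetical key builder: MULTIVIEW is only set on non-MSAA views,
/// mirroring the `view_count > 1 && !is_msaa` predicate.
pub fn layout_key(is_msaa: bool, has_density_texture: bool, view_count: u32) -> u8 {
    let mut key = 0;
    if is_msaa {
        key |= MULTISAMPLED;
    }
    if has_density_texture {
        key |= DENSITY_TEXTURE;
    }
    if view_count > 1 && !is_msaa {
        key |= MULTIVIEW;
    }
    key
}
```

Because the node and `specialize` both derive the MULTIVIEW bit from the same predicate, a multiview camera with MSAA falls back to the single-layer key instead of selecting a layout the shader was not compiled for.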

`descriptor.multiview_mask` stays `None` — same L6/L7a/L7b/L7c
deferral. The wgpu render-pass enablement is L7d's job.

MSAA + MULTIVIEW carve-out: WGSL has no
`texture_depth_multisampled_2d_array`, so the host gates the
MULTIVIEW shader def push and the layout-key MULTIVIEW bit on
`!MULTISAMPLED`. Under MSAA + multiview the depth binding stays
single-layer (no shader def, no layout switch), and the post-C2-A
multi-layer multisampled depth texture binding will fail wgpu
validation. This is the same limitation already documented in
`mesh_view_bindings.wgsl:99-106` for the prepass-texture bindings:
the MSAA + multiview combination is left unsupported at the
texture-binding level (rare in practice; VR doesn't pair with MSAA).
No existing camera in tree triggers this combo.

Non-multiview cameras get `multiview_view_count = 1` → no MULTIVIEW
push, no layout-key bit → fallback path → bit-identical behavior to
the pre-conversion single-layer binding.

Smoke verification:
- `volumetric_fog --features bevy_ci_testing` byte-identical
  (3709234 bytes) pre/post (both `/tmp/session19-baseline-volumetric_fog.png`
  and `/tmp/session19-post-volumetric_fog.png`). Strongest possible
  no-op witness for the converted pipeline on the existing
  non-multiview path.
- `3d_scene --features bevy_ci_testing` byte-identical (697355
  bytes), matches the lineage from sessions 16-18
  (`/tmp/post-rebase-3d_scene.png`, `/tmp/c2-abc-3d_scene.png`,
  `/tmp/session18-3d_scene.png`, `/tmp/session18-f1-3d_scene.png`).
  Read-side witness — volumetric fog isn't enabled on the 3d_scene
  camera so the fog pipeline never compiles, but confirms the
  depth-prepass plumbing stays clean.

The multiview branch is unexercised on macOS Metal but the binding /
layout / shader-def shapes line up with `mesh_view_bindings.wgsl`'s
established prepass-texture pattern.

Out of scope:
- Atmosphere render_sky L7-shape conversion (next session per
  `.scratch/session17_audit_planning.md` §5).
- DoF L7-shape conversion (session 21+ per same plan).
- HZB depth pyramid stays deferred (pattern c per audit §4.3) —
  compute pipeline can't take `@builtin(view_index)` and the output
  is single-layer; per-eye HZB is an L8 layer.
- MSAA + MULTIVIEW depth binding (open question per audit §6) —
  resolved here by matching the established `mesh_view_bindings`
  carve-out. A real fix requires a per-layer D2 view + per-eye
  dispatch (pattern b) or a higher-level workaround; not in scope
  for the pattern (a) per-consumer conversion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…MULTIVIEW

The view depth texture under MULTIVIEW grew to a per-eye layered texture
in session 16 (C2 sub-A). On a multiview camera the existing
`texture_depth_2d` binding at `@group(0) @binding(13)` would fail wgpu
validation against the D2Array view; render_sky's fragment also has no
way to address its own eye's depth. Forward-enablement audit follow-up
per `.scratch/session17_audit_planning.md` §4.2.

WGSL (`render_sky.wgsl`): binding 13 nests `#ifdef MULTIVIEW` inside the
non-multisampled branch — `texture_depth_2d_array` under
`MULTIVIEW && !MULTISAMPLED`, `texture_depth_2d` otherwise; the
multisampled branch stays single-layer. Fragment entry gains a separate
`@builtin(view_index) view_index: i32` parameter alongside the
existing `FullscreenVertexOutput`-typed `in` (rather than a struct
rewrite — `in.position`/`in.uv` references stay untouched). The single
`textureLoad` read site picks the 4-arg form with `view_index` under
`MULTIVIEW && !MULTISAMPLED`; the other three branches keep the 3-arg
form. Same nest shape as `volumetric_fog.wgsl` (commit dc391d3) and
the `mesh_view_bindings.wgsl:99-106` prepass-texture pattern.

Host (`resources.rs`): `RenderSkyBindGroupLayouts` grows a third
`render_sky_multiview` field (binding 13 = `texture_2d_array` of depth).
`RenderSkyPipelineKey` gains `multiview_view_count: u32`. Specialize
pushes `MULTIVIEW` + `MAX_VIEW_COUNT` and selects the multiview layout
when `view_count > 1 && msaa_samples == 1`. Both `queue_render_sky_pipelines`
(pipeline key construction) and `prepare_atmosphere_bind_groups`
(bind-group layout pick) gain `Option<&ExtractedMultiview>` and gate on
the same predicate.

MSAA + MULTIVIEW carve-out: WGSL has no `texture_depth_multisampled_2d_array`,
so the MULTIVIEW shader def + multiview-layout pick are gated on
`!MULTISAMPLED`. Under MSAA + multiview the depth binding stays
single-layer, identical to today; the post-C2-A multi-layer multisampled
depth texture would fail wgpu validation against it. No camera in tree
triggers this combo. Same carve-out as
`mesh_view_bindings.wgsl:99-106` and session 19 volumetric fog.

Atmosphere's view binding (`@group(0) @binding(3) var<uniform> view: View`
in `bindings.wgsl`) and the camera-shared sky LUTs (`sky_view_lut`,
`aerial_view_lut`) are NOT touched by this commit. Under MULTIVIEW the
depth read is now eye-correct so sky-by-foreground-geometry occlusion
matches each eye's depth buffer, but the sky LUTs and ray-direction
computations still read element-0 view data — eye-correct view rays
through the sky LUTs are a separate L8 layer (per
`.scratch/session17_audit_planning.md` §6 second bullet). This commit
fixes the binding-layout-vs-texture-view shape mismatch; it does not
make the rendered sky fully per-eye-correct.

Smoke:
- `atmosphere --features bevy_ci_testing`: byte-identical (3542199
  bytes, md5 23c5c7f864035a6d4a58b47aa1acb24c) pre/post conversion on
  a warm shader cache. Newly added to the byte-deterministic example
  registry.
- `3d_scene --features bevy_ci_testing`: byte-identical (697355
  bytes) — depth-prepass-plumbing witness, atmosphere not enabled so
  render_sky pipeline doesn't compile.
- `volumetric_fog --features bevy_ci_testing`: byte-identical
  (3709234 bytes) — fog imports atmosphere transmittance LUT shader;
  confirms the atmosphere shared-module changes don't perturb fog.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WGSL: the `depth_texture` binding (single-input + dual-input bind groups
share the same `@group(0) @binding(1)`) gains a `texture_depth_2d_array`
variant under `MULTIVIEW && !MULTISAMPLED`, nested inside the existing
non-multisampled branch. Each of the four fragment entries
(`gaussian_horizontal`, `gaussian_vertical`, `bokeh_pass_a`,
`bokeh_pass_b`) gains a separate `@builtin(view_index) view_index: i32`
parameter alongside the existing `in: FullscreenVertexOutput`, and
assigns `current_view_index = view_index;` at the top of its body under
MULTIVIEW. The depth read lives in the `calculate_circle_of_confusion`
helper called from all four entries; under `MULTIVIEW && !MULTISAMPLED`
its single `textureLoad` adds `current_view_index` as the layer
argument. The same `current_view_index` global drives the existing
`view()` helper that `depth_ndc_to_view_z` reads via
`view_transformations.wgsl`, so the per-eye projection matrix is also
applied correctly. WGSL has no `texture_depth_multisampled_2d_array`, so
the MSAA + multiview combination keeps the single-layer binding — same
carve-out as the prepass-texture bindings in
`mesh_view_bindings.wgsl:99-106`.

Host: `DepthOfFieldPipelineKey` gains `multiview_view_count: u32`.
`prepare_depth_of_field_view_bind_group_layouts` and
`prepare_depth_of_field_pipelines` both gain
`Option<&ExtractedMultiview>`; the layout-prepare hoists a shared
`depth_binding` expression that picks
`texture_depth_2d_multisampled()` under MSAA,
`texture_2d_array(TextureSampleType::Depth)` under non-MSAA multiview,
or `texture_depth_2d()` otherwise, and reuses it across both the
single-input and dual-input layouts. The pipeline-prepare derives
`multiview_view_count` once per view and threads it into all four
`DepthOfFieldPipelineKey` constructions (gaussian horizontal/vertical,
bokeh pass 0/1). The specialize impl pushes `MULTIVIEW` plus
`MAX_VIEW_COUNT` shader defs when
`multiview_view_count > 1 && !multisample`, matching the layout-side
carve-out.

Out of scope (audit §6 third bullet): the dual-input bind group's
`auxiliary_dof_texture.default_view` is single-layer today and would
need to grow to multi-layer for true per-eye DoF. This commit fixes the
depth-buffer-read shape mismatch under multiview; full per-eye DoF
needs the auxiliary texture grown too. Also out of scope: the view
binding at `@group(0) @binding(0)` still consumes the
`uniform_buffer::<ViewUniform>(true)` layout entry (the WGSL imports
the multi-view-aware `mesh_view_bindings::view` helper, so layout/shader
match under no-MULTIVIEW; the L7c view-binding-side conversion is a
separate session).

Smoke (warm shader cache, frame 100, `bevy_ci_testing`):

* `depth_of_field` — **byte-identical** (3629866 bytes) pre/post
  conversion on a single run. New entry in the byte-deterministic
  example registry.
* `3d_scene` — byte-identical (697355 bytes) to the session 16-20
  lineage. Depth-prepass-plumbing witness.
* `volumetric_fog` — byte-identical (3709234 bytes) to the session
  19-20 baseline. Cross-pipeline regression witness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The per-layer first-call init in `ColorAttachment::first_call_for_layer`
and `DepthAttachment::get_attachment_for_layer` (introduced by the per-
layer attachment accessor API in c315f71 / e186c03) unconditionally
seeded each slot with `true` (or `clear_value.is_some()` for depth) on
first per-layer access. That matches the global `is_first_call`
semantics for a consumer that runs FIRST in the frame — the prepass +
deferred per-eye loops added by ac409d6, which were the only
consumers in tree until now.

A consumer that runs AFTER another pass has already touched the same
attachment via the legacy `get_attachment` API (e.g. a per-eye main-
pass consumer running after `main_opaque_pass_3d` has flipped the
global latch to false) used to see `true` on its first per-layer
access and incorrectly emit `LoadOp::Clear`, wiping the earlier pass's
work. Surfaced by the in-progress transmission per-eye dispatch
(L7b-write C3) on macOS Metal even at view_count=1: a 21% size drop
in the byte-deterministic transmission screenshot baseline (3162953
bytes → 2485935 bytes) made the regression visible without needing
multiview hardware.

Fix: read `is_first_call.fetch_and(...)` first and seed the per-layer
init from its return value. Each per-layer slot then matches what the
global latch would have returned for that consumer at init time, and
the per-slot `fetch_and(false)` subsequently behaves as a per-layer
extension of the same latch.
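
A small sketch of the seeding order, assuming a simplified `Latches` type invented for this example (the real attachments also carry textures and clear values). It shows only the latch logic: the global flag is read first and its return value seeds the per-layer slots.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::OnceLock;

/// Hypothetical container for the global latch plus lazily built per-layer slots.
pub struct Latches {
    is_first_call: AtomicBool,
    per_layer_first_call: OnceLock<Vec<AtomicBool>>,
    layer_count: usize,
}

impl Latches {
    pub fn new(layer_count: usize) -> Self {
        Self {
            is_first_call: AtomicBool::new(true),
            per_layer_first_call: OnceLock::new(),
            layer_count,
        }
    }

    /// Legacy whole-texture accessor: true means "clear", false means "load".
    pub fn legacy_first_call(&self) -> bool {
        self.is_first_call.fetch_and(false, Ordering::SeqCst)
    }

    /// Per-layer accessor: flips the global latch first, then seeds the
    /// per-layer slots from what the global latch returned at init time.
    pub fn first_call_for_layer(&self, layer: usize) -> bool {
        let global_was_first = self.is_first_call.fetch_and(false, Ordering::SeqCst);
        let slots = self.per_layer_first_call.get_or_init(|| {
            (0..self.layer_count)
                .map(|_| AtomicBool::new(global_was_first))
                .collect()
        });
        slots[layer].fetch_and(false, Ordering::SeqCst)
    }
}
```

If a per-layer consumer runs first, every layer clears once and later legacy callers load. If a legacy caller runs first, the slots are seeded with `false`, so the per-layer consumer loads instead of wiping earlier work.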

No call-site changes. Existing consumers (prepass + deferred per-eye
loops at view_count=1) byte-identical on `3d_scene` (697355 bytes)
and `deferred_rendering` (3628497 bytes); pre-C3 transmission path
byte-identical on the transmission example (3162953 bytes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…TIVIEW

L7b-write C3: grow the `view_transmission_texture` to one layer per
eye and dispatch `main_transmissive_pass_3d` per eye into per-layer
color + depth attachments. Final L7b-write consumer per the
`.scratch/session17_audit_planning.md` §7 follow-up plan; the prepass
and deferred render-graph nodes already landed C2 in ac409d6
(session 18) and the read-side (binding 24 in `mesh_view_bindings`)
already nests `texture_2d_array<f32>` under `MULTIVIEW` (b5066e02f,
session 14). The D2Array view that line 905 of
`mesh_view_bindings.rs` creates over the transmission texture used to
be a one-layer view of a one-layer texture; post-C3 it is the
multi-layer view the WGSL declaration always expected.

Part 0 — `ViewTarget::get_color_attachment_for_layer` /
`get_unsampled_color_attachment_for_layer`. Thin delegates picking
`main_textures.a` vs `main_textures.b` via the existing post-process
swap atomic and forwarding to the per-layer accessors that
ColorAttachment grew in c315f71 (session 16). Mirrors session 18's
`ViewDepthTexture::get_attachment_for_layer` delegate. The
`main_textures.{a,b}` textures are already multi-layer under
multiview per the C2 sub-A growth in prepare_view_targets, so the
delegate just exposes that to per-eye consumers.

Part A — `prepare_core_3d_transmission_textures` allocates the
transmission texture with `depth_or_array_layers = view_count`,
matching the multi-layer main texture so the copy_texture_to_texture
in the node can pass `depth_or_array_layers = view_count` in one
call. Cache key extended from `camera.target.clone()` to
`(camera.target.clone(), view_count)` per the session-16 C2-B
precedent (prevents multiview/non-multiview cameras sharing a render
target from colliding on a cached texture of the wrong layer count).
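
The cache-key change can be illustrated with a plain `HashMap`. `TargetId`, `CachedTexture`, and `get_or_create` are hypothetical names for this sketch; the real cache stores GPU textures keyed by the camera target.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a camera render target identifier.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct TargetId(pub u32);

/// Hypothetical stand-in for a cached texture; only the layer count matters here.
#[derive(Debug, PartialEq)]
pub struct CachedTexture {
    pub layers: u32,
}

/// Looks up (or allocates) a texture keyed by both target and view count,
/// so cameras with different layer counts never share an entry.
pub fn get_or_create<'a>(
    cache: &'a mut HashMap<(TargetId, u32), CachedTexture>,
    target: &TargetId,
    view_count: u32,
) -> &'a CachedTexture {
    cache
        .entry((target.clone(), view_count))
        .or_insert_with(|| CachedTexture { layers: view_count })
}
```

Keying on the target alone would let whichever camera ran first decide the layer count for both; including `view_count` in the key gives each combination its own entry.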

Part B — `main_transmissive_pass_3d` adds
`Option<&'static ExtractedMultiview>` to its ViewQuery, computes
`view_count` once, and wraps each `for range in split_range(...)`
iteration's render pass in a `for eye in 0..view_count` loop. Each
eye's descriptor uses `target.get_color_attachment_for_layer(eye)` +
`depth.get_attachment_for_layer(eye, StoreOp::Store)`. The
copy_texture_to_texture stays inside the range loop but outside the
eye loop — one multi-layer copy per range step spans every eye's
layer. Range<usize> is cloned cheaply for each eye's render_range
call. The same restructure applies to the `steps == 0` branch.
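
The loop nesting in Part B can be sketched by recording the operations instead of issuing GPU commands. `Op` and `plan_transmissive_passes` are hypothetical, and placing the copy before the per-eye passes is an assumption about the node's ordering.

```rust
use std::ops::Range;

/// Hypothetical record of what the node would submit.
#[derive(Debug, PartialEq)]
pub enum Op {
    /// One multi-layer copy covering every eye's layer.
    CopyAllLayers,
    /// One render pass targeting a single eye's layer.
    RenderPass { eye: u32, range: Range<usize> },
}

/// Range loop on the outside, eye loop on the inside, one copy per range step.
pub fn plan_transmissive_passes(ranges: &[Range<usize>], view_count: u32) -> Vec<Op> {
    let mut ops = Vec::new();
    for range in ranges {
        // Assumed ordering: the copy runs once per range step, before the passes.
        ops.push(Op::CopyAllLayers);
        for eye in 0..view_count {
            ops.push(Op::RenderPass {
                eye,
                range: range.clone(),
            });
        }
    }
    ops
}
```

With `view_count = 1` the plan degenerates to one copy and one pass per range step, which matches the pre-edit structure.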

Non-multiview path: `view_count = 1`, copy_extents matches the
pre-edit `physical_target_size.to_extents()`, and the eye loop runs
once with eye=0 — byte-identical no-op (verified, see below).

Deviates from PR bevyengine#16059's broadcast design (single render pass with
`multiview_mask`) in favour of the per-eye dispatch shape session
15's L7d planning locked in for class-(b) consumers, consistent with
ac409d6 (prepass + deferred). Surface in the eventual reference
PR's "differences from PR bevyengine#16059" section.

Out-of-scope follow-ups:
- View binding layout still `uniform_buffer::<ViewUniform>(true)`
  singular at the binding site that transmission's PBR shader chain
  reads — under MULTIVIEW the WGSL `view_array: array<View,
  MAX_VIEW_COUNT>` mismatches the singular layout entry. Shared
  staging with the atmosphere/DoF R2 staging from sessions 20/21;
  separate session if pursued.
- The non-MSAA + MULTIVIEW carve-out shape from
  `mesh_view_bindings.wgsl:99-106` doesn't fire here because the
  transmission texture itself is always `sample_count = 1` (the
  source main texture's MSAA is resolved on the
  copy_texture_to_texture's source side).

Smoke verification on macOS Metal (no wgpu multiview, so this
exercises the `view_count = 1` no-op path):
- `transmission --features bevy_ci_testing` — byte-identical
  (3162953 bytes) pre/post C3. New entry in the byte-deterministic
  example registry.
- `3d_scene --features bevy_ci_testing` — byte-identical (697355
  bytes) to the session 16-21 lineage.
- `volumetric_fog --features bevy_ci_testing` — byte-identical
  (3709234 bytes) to the session 19-21 baseline.

This commit depends on F2 (d92e550): C3 is the first consumer of
the per-layer color attachment accessor that runs AFTER another pass
has touched the global is_first_call latch (main_opaque_pass_3d), so
without F2's per-layer first-call seeding fix C3 would emit
LoadOp::Clear on its first per-eye access and wipe the opaque main
pass's work. F2 surfaced through the byte-deterministic smoke on
the non-multiview path even at view_count=1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First L7d flip per the session 15 L7d enablement plan §4 step 5:
the smallest-fragment-surface class-(a) pipeline switches from per-eye
dispatch (session 18's `ac409d6a3` shape) to single-pass broadcast,
establishing the L7d shape on the branch.

Class (a) reaffirmation: the WGSL writes only the motion-vectors
color attachment at @location(1) with depth GreaterEqual early-z,
no shared storage, and reads view data via
`view_array[current_view_index]` keyed off `@builtin(view_index)` —
exactly the safe-broadcast shape the inventory described.

Plan-vs-reality note: §4 step 5 framed L7d as a per-pipeline flip,
but background motion vectors today is dispatched as a `draw(0..3)`
call inside the SAME render pass that runs opaque + alpha-mask
prepass items (pre-dates session 18's per-eye refactor — see
`git show ac409d6~1`). `multiview_mask` is a pass-level property,
so the flip requires extracting the draw into its own render pass.
Done within `run_prepass_system` (Shape A1) to preserve session 18's
per-eye dispatch for the surrounding prepass items.

Host changes:

- `background_motion_vectors.rs`: pipeline descriptor sets
  `multiview_mask = NonZeroU32::new((1 << view_count) - 1)` under
  `multiview_view_count > 1`. Mirrors the same predicate already
  used for the MULTIVIEW + MAX_VIEW_COUNT shader-def push.
- `prepass/node.rs`: the bg motion vectors `if let` block moves
  OUT of the `for eye in 0..view_count` loop into a separate
  render pass after the loop ends. The broadcast pass uses
  multi-layer attachments via legacy
  `ColorAttachment::get_attachment()` (normals + motion vectors)
  and `ViewDepthTexture::get_attachment(StoreOp::Store)`; the
  pass-descriptor `multiview_mask` matches the pipeline
  descriptor (None when view_count==1).

F2 interaction (session 22's `d92e55099`): legacy `get_attachment*`
accessors after the per-eye loop read the global `is_first_call`
latch, which the per-eye loop already flipped to false. Result:
the broadcast pass loads the prior per-eye writes for normal +
motion vectors + depth instead of re-clearing — the case F2 was
designed to handle. First in-tree consumer to exercise the
legacy-after-per-layer ordering for color attachments.

WGSL: zero edits. `background_motion_vectors.wgsl` was already
L7c-converted (declares `view_array: array<View, MAX_VIEW_COUNT>`
under MULTIVIEW, sets `current_view_index = view_index` at the
fragment top).

Smoke verification:

- `motion_blur --features bevy_ci_testing` — **byte-identical**
  (4492372 bytes, md5 `219b899f7f82a5f8d2895e884260f99d`)
  pre/post-edit. Primary witness: the actual modified pipeline.
  Despite §registry noting motion_blur as non-deterministic
  (camera animation drift), this specific pair matched.
- `3d_scene --features bevy_ci_testing` — byte-identical
  (697355 bytes) to session 16-22 lineage. Prepass-node
  structural witness.
- `deferred_rendering --features bevy_ci_testing` —
  byte-identical (3628497 bytes) to session 17-22 lineage.
  Deferred late-prepass witness.
- `transmission --features bevy_ci_testing` — byte-identical
  (3162953 bytes) to session 22 baseline on the second run.
  First run drifted (one-time cold-shader-cache recompile
  per §registry caveat); converged on re-run. Cross-pipeline
  no-regression witness.

Non-multiview behavior (view_count==1): the bg motion vectors
draw now runs in its own pass with `multiview_mask: None` instead
of sharing a pass with the prepass items. Adds one render-pass
boundary on the non-multiview path; motion_blur smoke confirms
no output drift.

L7d flips remaining (per session 15 §1 inventory): PBR mesh
forward, PBR prepass, deferred lighting, SSR, tonemapping,
skybox, blit, FXAA, bloom-first, debug overlay. Pipelines that
own their render pass (tonemapping/skybox/etc) can flip in place;
pipelines that share a pass (PBR mesh forward, PBR prepass) need
the same extraction-or-whole-pass-flip design call this commit
faced. Cluster raster + OIT-flagged PBR stay per-eye-dispatched
per the class (b) carve-out.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L7d (`572af1539`) computed the all-eyes-set broadcast mask as
`(1u32 << view_count) - 1` at two sites:
`background_motion_vectors.rs:215` and `prepass/node.rs:283`.
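When `view_count == 32`, the `MAX_VIEW_COUNT` cap enforced by L2's
`e250678de` extraction, both panic in debug (shift overflow). In
release builds without overflow checks the shift amount wraps, so
the expression evaluates to 0 and `NonZeroU32::new` yields `None`,
silently dropping the mask.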

A 32-view multiview camera is reachable through the public API
(`Multiview` extraction warn-clamps to 32, not 31). Typical
stereo (view_count == 2) is unaffected, but the next L7d flip
would copy the pattern and inherit the same defect.

Fix: replace with `u32::MAX >> (32 - view_count)` at both sites.
For valid inputs in `[1, 32]`, the result is the same all-eyes-set
mask but the shift amount stays in `[0, 31]`. The gate
`view_count > 1` is unchanged.
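
A minimal sketch of the shift-safe mask as a shared helper. The function name `broadcast_mask` and the upper-bound guard are assumptions for this example; the formula and the `view_count > 1` gate come from the text.

```rust
use std::num::NonZeroU32;

/// Returns the all-eyes-set multiview mask, or `None` for a single view.
/// The `<= 32` guard is an addition for this sketch so the subtraction
/// cannot underflow if a caller passes an unclamped count.
pub fn broadcast_mask(view_count: u32) -> Option<NonZeroU32> {
    if view_count > 1 && view_count <= 32 {
        // Shift amount is 32 - view_count, which stays in 0..=30 here.
        NonZeroU32::new(u32::MAX >> (32 - view_count))
    } else {
        None
    }
}
```

For stereo this yields `0b11`, and for 32 views it yields `u32::MAX`. The earlier `(1u32 << view_count) - 1` form cannot express the 32-view case because `1u32.checked_shl(32)` is `None`. Routing both the pipeline descriptor and the pass descriptor through one helper like this would also keep the two masks in agreement.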

Smoke verification:

- `motion_blur --features bevy_ci_testing` — byte-identical
  (4492372 bytes) to the L7d baseline. The non-multiview path
  (`view_count == 1`) skips the branch entirely, so this only
  witnesses no-regression on the broadcast-pass structure.
- `3d_scene --features bevy_ci_testing` — byte-identical
  (697355 bytes). Prepass-node structural witness.

Same shape as session 18's F1 and session 22's F2: small
self-contained hardening on a novel surface, landed in-session
per [[feedback-session-conventions]] §collapse-pre-planned-split.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sets `multiview_mask` on both the FXAA pipeline descriptor (in
`fxaa/mod.rs::specialize`) and its render-pass descriptor (in
`fxaa/node.rs`). Both sites compute the mask with the shift-safe
`u32::MAX >> (32 - view_count)` idiom established by F1 `43920683c`
and gate on `> 1` (pipeline via `key.multiview_view_count`, node via
`target.multiview_count().map_or(1, |n| n.get())`). Wgpu requires
the two descriptors to agree; the matching gate predicate mirrors the
existing convention for the MULTIVIEW + MAX_VIEW_COUNT shader-def
push and bind-group layout pick.

Second L7d flip on the branch. Unlike session 23's bg motion vectors
(which required Shape A1 extract surgery because the dispatch
co-mingles with prepass items in a shared render pass), FXAA owns
its own render pass (`fxaa/node.rs:53`), so the L7d flip is purely
mechanical: no WGSL edits (already L7c-converted with
`@builtin(view_index)` + `current_view_index` plumbing), no
extraction, no per-eye loop. ViewTarget's post-process source/
destination views are already multi-layer D2Array under multiview
per session 16's `prepare_view_targets` growth, so the destination
attachment supports the broadcast directly.

At `view_count == 1` (the only path macOS exercises since wgpu has
no multiview support there) both compute `multiview_mask = None`,
making this a literal no-op on the non-multiview path.

Smoke (`--features bevy_ci_testing`):
- `anti_aliasing` with a temp Camera spawn carrying `Msaa::Off,
  Fxaa::default()` (reverted after capture) — **byte-identical**
  (3588248 bytes, md5 `4d95a4be3b6fbecee0ac83836674f783`) pre/post
  the L7d flip on warm cache. Primary witness directly exercises
  the touched FXAA pipeline.
- `3d_scene` byte-identical (697355) to the session 16-23 lineage.
- `deferred_rendering` byte-identical (3628497) to the session 17-23
  lineage.
- `motion_blur` byte-identical (4492372, md5
  `219b899f7f82a5f8d2895e884260f99d`) to session 23. Confirms the
  in-tree L7d bg-motion-vectors pass is unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sets `multiview_mask` on both the tonemapping pipeline descriptor (in
`tonemapping/mod.rs::specialize`) and its render-pass descriptor (in
`tonemapping/node.rs`). Both sites compute the mask with the shift-safe
`u32::MAX >> (32 - view_count)` idiom established by F1 `43920683c`
and gate on `> 1` (pipeline via `key.multiview_view_count`, node via
`target.multiview_count().map_or(1, |n| n.get())`). Wgpu requires
the two descriptors to agree; the matching gate predicate mirrors the
existing convention for the MULTIVIEW + MAX_VIEW_COUNT shader-def
push and bind-group layout pick.

Third L7d flip on the branch, second purely mechanical own-pass flip
after FXAA `a7fd04a19`. Tonemapping owns its own render pass at
`tonemapping/node.rs` (single `draw(0..3, 0..1)` with no co-mingled
draws), and was already fully L7c-converted (`view_array: array<View,
MAX_VIEW_COUNT>` binding, `@builtin(view_index)` + `current_view_index`
plumbing in the fragment, `view()` helper used by tonemap math). The
L7d flip is purely the two `multiview_mask` field sets: no WGSL edits,
no extraction, no per-eye loop. The HDR source and destination both
come from `target.post_process_write()`, which returns the `default_view`
of `main_textures.{a,b}` — already multi-layer D2Array under multiview
per session 16's `prepare_view_targets` growth — so the broadcast pass
samples and writes per-eye-correct data on both sides. The tonemapping
LUT textures (bindings 3, 4) and sampler are camera-shared and
eye-independent; tonemap output is determined entirely by per-eye HDR
input + global LUT, so broadcast preserves per-eye correctness.

At `view_count == 1` (the only path macOS exercises since wgpu has
no multiview support there) both sites compute `multiview_mask = None`,
making this a literal no-op on the non-multiview path. Tonemapping
also runs in Core2dSystems::PostProcess for 2D cameras; 2D cameras
have no Multiview component, so `key.multiview_view_count == 1` and
`target.multiview_count() == None` at the two sites respectively —
both reach the else branch and 2D stays byte-identical.
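
The agreement between the two sites can be sketched as follows. This is an illustration under assumptions: `pipeline_mask`, `pass_mask`, and `mask_for` are hypothetical names, and the real code reads the count from a pipeline key field and from the view target rather than from plain arguments.

```rust
use std::num::NonZeroU32;

/// Shared shift-safe formula, gated on `> 1` (upper bound added for safety).
fn mask_for(view_count: u32) -> Option<NonZeroU32> {
    if view_count > 1 && view_count <= 32 {
        NonZeroU32::new(u32::MAX >> (32 - view_count))
    } else {
        None
    }
}

/// Pipeline side: the key carries a plain count, where 1 means "no multiview".
pub fn pipeline_mask(key_view_count: u32) -> Option<NonZeroU32> {
    mask_for(key_view_count)
}

/// Pass side: the target reports an optional count; `None` is treated as 1.
pub fn pass_mask(target_view_count: Option<NonZeroU32>) -> Option<NonZeroU32> {
    mask_for(target_view_count.map_or(1, |n| n.get()))
}
```

A 2D camera maps to `pipeline_mask(1)` and `pass_mask(None)`, and both return `None`; a stereo camera maps to `pipeline_mask(2)` and `pass_mask(NonZeroU32::new(2))`, and both return the same two-bit mask.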

Smoke (`--features bevy_ci_testing`, all byte-identical first try on
warm cache):
- `3d_scene` byte-identical (697355) to the session 16-24 lineage.
  Default Camera3d uses `Tonemapping::TonyMcMapface`, so this directly
  exercises the touched pipeline.
- `deferred_rendering` byte-identical (3628497) to the session 17-24
  lineage. Tonemapping witness via the deferred camera setup.
- `motion_blur` byte-identical (4492372, md5
  `219b899f7f82a5f8d2895e884260f99d`) to session 23-24. Confirms the
  in-tree L7d bg-motion-vectors broadcast pass and the L7d FXAA pass
  are unaffected.
- `atmosphere` byte-identical (3542199, md5
  `23c5c7f864035a6d4a58b47aa1acb24c`) to the session 20-24 lineage.
  Tonemapping witness through the atmosphere camera.
- `transmission` byte-identical (3162953) to the session 22-24
  lineage. Tonemapping + per-eye transmissive cross-witness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second Shape A1 L7d on the branch (after session 23's background
motion vectors). The skybox cubemap is shared across eyes; per-eye
view matrices sampled via `view()` from `@builtin(view_index)` give
each eye the correct ray direction, so one broadcast draw fills every
layer of the multi-layer color + depth attachments.

Skybox previously dispatched as a trailing `render_pass.draw(0..3, 0..1)`
inside the shared `main_opaque_pass_3d` (alongside opaque + alpha-mask
phase items). Because `multiview_mask` is a pass-level property (wgpu
requires every pipeline drawn in a pass to agree with the pass's mask, so
it cannot be enabled for a single draw inside a shared pass), this commit
closes the existing main pass after the opaque + alpha-mask draws and
opens a new "skybox_broadcast" pass that re-derives the color + depth
attachments. Re-derivation hits the second-call `is_first_call` latch
on `ColorAttachment` / `DepthAttachment`, returning `LoadOp::Load` to
preserve the opaque + alpha-mask output.
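
The latch behavior relied on here can be modeled in a few lines. This is a simplified stand-in, not the engine's attachment type: `LatchedAttachment` and `get_load_op` are hypothetical names, and the real accessors return full attachment descriptors rather than a bare load op.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoadOp {
    Clear,
    Load,
}

/// Simplified model of an attachment guarded by a first-call latch.
pub struct LatchedAttachment {
    is_first_call: AtomicBool,
}

impl LatchedAttachment {
    pub fn new() -> Self {
        Self { is_first_call: AtomicBool::new(true) }
    }

    /// The first call clears; every later call loads what was already drawn.
    pub fn get_load_op(&self) -> LoadOp {
        // `fetch_and(false)` stores `false` and returns the previous value.
        if self.is_first_call.fetch_and(false, Ordering::SeqCst) {
            LoadOp::Clear
        } else {
            LoadOp::Load
        }
    }
}
```

In this model the opaque pass makes the first call and clears, and the re-derived attachments of a second pass make the second call and load, which is the property the new skybox pass depends on.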

Surrounding main_opaque_pass draws stay in their existing single-pass
shape; the broader question of converting mesh forward + alpha-mask to
per-eye / Shape D belongs to a future session. At view_count=1 (the
only path exercised by the in-tree examples), both descriptors set
multiview_mask=None and the skybox draw lands byte-identical to its
prior position inside the shared pass.

Pipeline-side and pass-side both derive view_count from the same
ExtractedMultiview.subviews.len() chain; the masks cannot diverge.
Mask uses the shift-safe `u32::MAX >> (32 - view_count)` formulation
established in session 23's F1 (avoids `1u32 << 32` overflow at the
MAX_VIEW_COUNT cap of 32).
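
To make the overflow concern concrete, the sketch below contrasts a naive `(1 << n) - 1` mask with the shift-safe form. Both function names are hypothetical, and the naive version uses `checked_shl` only so the failure shows up as `None` instead of a panic or a wrapped value.

```rust
/// Naive formulation `(1 << n) - 1`; `None` when the shift amount overflows.
pub fn naive_mask(view_count: u32) -> Option<u32> {
    1u32.checked_shl(view_count).map(|bit| bit - 1)
}

/// Shift-safe formulation from the commit message; valid for 1..=32.
pub fn shift_safe_mask(view_count: u32) -> u32 {
    assert!((1..=32).contains(&view_count), "view_count out of range");
    u32::MAX >> (32 - view_count)
}
```

The two agree for counts below 32, but at the cap the naive form has no valid result while the shift-safe form yields `u32::MAX`.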

Smoke (byte-identical to registry baselines):
- volumetric_fog: 3709234 bytes (primary skybox witness, first try)
- 3d_scene: 697355 bytes
- deferred_rendering: 3628497 bytes
- motion_blur: 4492372 bytes, md5 219b899f7f82a5f8d2895e884260f99d
  (confirms in-tree L7d bg-motion-vectors + FXAA + tonemapping
  broadcast passes all unaffected)
- atmosphere: 3542199 bytes, md5 23c5c7f864035a6d4a58b47aa1acb24c
  (converged on third run per registry cold-shader-cache caveat;
  atmosphere has no Skybox component so the cold-cache drift was
  unrelated to this edit)
- transmission: 3162953 bytes

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…(L7d Shape D)

Fifth L7d flip on the branch and the first whole-pass-broadcast (Shape D)
conversion. Per the session 27 planning doc (`.scratch/session27_pbr_l7d_planning.md`)
§3.3, every in-tree dispatcher into `Opaque3d` / `AlphaMask3d` flows through PBR
`MeshPipeline` via `DrawMaterial` (verified via
`grep add_render_command::<Opaque3d|AlphaMask3d>` across the workspace), and the
mesh pipeline + every shader prereq is already L7c-converted at the layer where
the MULTIVIEW shader-def push happens. With one dispatcher and full L7c, Shape
A1 (extract per draw) collapses — the dispatcher IS the pass content. Shape D
(whole-pass broadcast) is the natural answer.

Pipeline-side: `MeshPipeline::specialize` sets `multiview_mask = NonZeroU32::new(u32::MAX >> (32 - max_view_count))`
under `max_view_count > 1` on the returned `RenderPipelineDescriptor`. Gate
mirrors the existing MULTIVIEW + MAX_VIEW_COUNT shader-def push at the same
site. The prepass + deferred prepass dispatches (`Opaque3dPrepass` etc.) flow
through the separate `PrepassPipelineSpecializer` type and are NOT covered by
this field-set — their conversion is a separate session.

Pass-side: `main_opaque_pass_3d` flips its descriptor from `multiview_mask: None`
to the shift-safe formula derived from `multiview.subviews.len()`. The compute
is lifted to function-top scope so the skybox broadcast pass (session 26) can
reuse the same value rather than recomputing — small de-duplication. Both gate
predicates ultimately derive from `ExtractedMultiview.subviews.len()`, so wgpu's
required pipeline-vs-pass mask agreement holds.

Shape D's load-bearing prereq (every dispatcher in the pass must be
feature-safe under broadcast) is met for every in-tree Material because they
all share the same L7c-converted `MeshPipeline::specialize` path. The latent
risk is a user `Material` implementor that ships a custom WGSL entry without
threading `@builtin(view_index)` and assigning
`current_view_index = view_index;`. In that case the pipeline and hardware
still broadcast correctly, but `current_view_index` stays at 0 on every layer,
so reads through `view()` / `mesh_view_bindings::*` silently resolve to eye 0's
data, and lighting and camera-relative effects render as if every eye were
eye 0. Geometry survives only if the custom material kept the default
`mesh.wgsl` vertex entry, which threads `view_index` itself.

To make that risk discoverable, two in-tree docs land alongside the Shape D
flip:

- A `# Multiview` section on the `Material` trait docstring
  (`bevy_pbr/src/material.rs`) explaining the requirement for any custom WGSL
  entry (vertex OR fragment), with the asymmetry between
  custom-fragment-only-with-default-vertex (geometry safe via default
  `mesh.wgsl`) and custom-both-without-threading (geometry breaks too).
  Anchors to `pbr.wgsl`, `mesh.wgsl`, and `mesh_view_bindings.wgsl`.
- A comment block in `mesh_view_bindings.wgsl` right after the
  `current_view_index` declaration with a paste-ready fragment-entry snippet
  showing the canonical `#ifdef MULTIVIEW` plumbing.

Both docs frame the issue around `current_view_index` rather than a specific
phase or pipeline, so they age well for any future shared-pass L7d conversion
that uses the same global.

Diff +111/-22 = +89 net across four files. Shape-flip surface itself (node +
MeshPipeline) is +33 net, within the §3.3 ~30-50 estimate; the in-tree docs
add +56 net (Material trait `# Multiview` section +27, `mesh_view_bindings.wgsl`
paste-ready snippet +29) vs the §5 ~15-25 estimate. The doc overshoot is
deliberate: the paste-ready snippet is the canonical migration aid the
eventual PR description references, and the trait-level docstring is the
natural IDE-hover surface for custom-material authors. Total is ~18% over the
upper end of the ~45-75 plan estimate; well under the L7c <120 cap and the
>25% §L7d-bands overshoot trigger.

Smoke verification (6 registry witnesses per §registry: 4 byte-identical, 2
with drift not caused by this commit):

- `3d_scene` --features bevy_ci_testing — byte-identical (697355) to the
  session 16-26 lineage. Primary cross-cutting witness.
- `volumetric_fog` --features bevy_ci_testing — byte-identical (3709234) to
  the session 19-26 lineage. Skybox broadcast + main_opaque interleave on the
  only registry witness carrying a `bevy_light::Skybox` component.
- `deferred_rendering` --features bevy_ci_testing — byte-identical (3628497)
  to the session 17-26 lineage. Deferred + main_opaque interleave.
- `motion_blur` --features bevy_ci_testing — byte-identical (4492372, md5
  `219b899f7f82a5f8d2895e884260f99d`) to the session 23-26 lineage. Fifth
  consecutive determinism confirmation; bg-motion-vectors broadcast +
  main_opaque + L7d FXAA + L7d tonemapping all unaffected.
- `transmission` --features bevy_ci_testing — 3162897 bytes vs registry
  baseline 3162953. Drift is pre-existing in HEAD `1490228f0`, not caused by
  this commit: cmp of post-edit PNG vs the same example run on a fully
  reverted working tree (stashed all four edits, ran transmission on HEAD,
  unstashed) shows the two PNGs are bit-identical at 3162897. The 56-byte
  drift from the registry baseline appears to be environmental (Metal
  pipeline-cache state, system load, or similar) between session 26's capture
  and now. Both PNGs render the transmissive demo correctly.
- `atmosphere` --features bevy_ci_testing — 3542320 bytes / md5
  `97f15a8251d03f49eeb6ac6f40d9cf26` stable across 3 runs with the edits
  applied. Clean HEAD with all edits reverted gave 3542199 bytes but md5
  `cbbda228c220ad4fe44829aac5b00a33` — different from the registry's
  `23c5c7f864035a6d4a58b47aa1acb24c` at the same byte size. Atmosphere is
  fundamentally image-content non-deterministic; the registry's prior
  size-match across sessions was coincidence at one cache-warmth state. PNG
  renders the atmosphere demo correctly. Cross-checked against the
  type-level no-op argument: `RenderPipelineDescriptor.multiview_mask`
  defaults to `None` (`bevy_material/src/descriptor.rs:53`), so at
  view_count == 1 this commit's explicit `multiview_mask: None` is identical
  to the prior `..default()`-filled value at both the pipeline and pass
  descriptors.

Together: 4 byte-identical registry matches (3d_scene, volumetric_fog,
deferred_rendering, motion_blur) + 1 bit-identical-to-clean-HEAD witness
(transmission) + 1 visually-correct-with-environmental-drift witness
(atmosphere) confirm the no-op claim at view_count == 1.

Out of scope (recorded for later sessions per planning doc §7):
- Prepass D2+F → Shape D (session 29; edits `PrepassPipelineSpecializer`).
- Deferred prepass D2+F → Shape D (session 30; node-only, must follow §29).
- Wireframe / deferred lighting / OIT resolve own-pass L7d (sessions 31-33).
- Transparent L7d (gated on a future sort-distance planning session).
- InfiniteGrid L7c (transparent prereq).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Converts the prepass node's per-eye dispatch loop (session 18, ac409d6)
into a single Shape D broadcast pass, mirroring session 28's main-opaque
flip but on the separate `PrepassPipelineSpecializer` host type covering
`Opaque3dPrepass` / `AlphaMask3dPrepass` / `Opaque3dDeferred` /
`AlphaMask3dDeferred` dispatches through `DrawPrepass`. The
background-motion-vectors broadcast pass (session 23) stays as-is — its
existing legacy `get_attachment` / `get_attachment(StoreOp::Store)` calls
remain the second legacy calls in `run_prepass_system`, so it still gets
`LoadOp::Load` and preserves the prepass-items output.

Pipeline-side (`bevy_pbr/src/prepass/mod.rs`, +18/-1): adds `NonZeroU32`
import; `PrepassPipeline::specialize` sets `multiview_mask =
NonZeroU32::new(u32::MAX >> (32 - max_view_count))` under
`max_view_count > 1` on the returned `RenderPipelineDescriptor`, parallel
to session 28's MeshPipeline edit. Comment notes coverage is the prepass
+ deferred prepass dispatches via `DrawPrepass`, distinct from the
forward-pass mask set in MeshPipeline.

Pass-side (`bevy_core_pipeline/src/prepass/node.rs`, +68/-75 = -7 net):
removes the `for eye in 0..view_count` loop, replaces per-layer
`get_attachment_for_layer(eye)` / `get_attachment_for_layer(eye,
StoreOp::Store)` with legacy `get_attachment()` / `get_attachment(
StoreOp::Store)`, and sets the pass descriptor `multiview_mask` to the
shift-safe formula under `view_count > 1`. Lifts `view_count` +
`multiview_mask` compute to function-top scope so the bg-motion-vectors
broadcast pass reuses the same value rather than recomputing.

F2 lifecycle (session 22, d92e550) traced through the new ordering:
the prepass-items pass calls the legacy accessors first (flips global latch
to false on first frame, Clear), bg-motion-vectors calls legacy second
(global false, Load), main opaque calls legacy third (Load), and
transmission's per-eye dispatch calls the per-layer accessors, which seed
their slots from the now-false global via F2 (Load on every eye). This is
the same Load-after-first-Clear pattern the per-eye loop produced, with one
dispatch rather than N.
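
The seeding step in that trace can be sketched with a small model. Everything here is an assumption made for illustration: `LayeredLatch` is a hypothetical type, the method names only echo the accessors named above, and lazy seeding on the first per-layer call is a guess at how the per-layer slots inherit the global state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoadOp {
    Clear,
    Load,
}

/// Model of one texture with a global latch plus lazily seeded per-layer latches.
pub struct LayeredLatch {
    global_first_call: bool,
    per_layer_first_call: Option<Vec<bool>>,
    layer_count: usize,
}

impl LayeredLatch {
    pub fn new(layer_count: usize) -> Self {
        Self { global_first_call: true, per_layer_first_call: None, layer_count }
    }

    /// Legacy whole-texture accessor: Clear once, then Load.
    pub fn get_attachment(&mut self) -> LoadOp {
        let first = std::mem::replace(&mut self.global_first_call, false);
        if first { LoadOp::Clear } else { LoadOp::Load }
    }

    /// Per-layer accessor: slots are seeded from the global latch on first use.
    pub fn get_attachment_for_layer(&mut self, layer: usize) -> LoadOp {
        let seed = self.global_first_call;
        let count = self.layer_count;
        let slots = self
            .per_layer_first_call
            .get_or_insert_with(|| vec![seed; count]);
        let first = std::mem::replace(&mut slots[layer], false);
        if first { LoadOp::Clear } else { LoadOp::Load }
    }
}
```

With three legacy calls followed by per-layer calls, the model yields Clear, Load, Load, and then Load for every eye, matching the ordering traced above.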

Smoke verification (4 byte-identical witnesses + 1 bit-compare-vs-clean-HEAD
+ 1 drifted):
- 3d_scene 697355 — bit-identical to the session 16-28 lineage. Primary
  cross-cutting witness; no Prepass-component cross-check.
- volumetric_fog 3709234 — bit-identical to the session 19-28 lineage.
- deferred_rendering 3628497 — bit-identical to the session 17-28
  lineage. Exercises the deferred prepass node which inherits the
  PrepassPipelineSpecializer multiview_mask field-set automatically
  (session 30 will flip its own per-eye loop into the same Shape D
  broadcast on the node side; this commit's host edit is the prereq).
- motion_blur 4492372 md5 219b899f7f82a5f8d2895e884260f99d — bit-
  identical to the session 23-28 lineage. Load-bearing for this session:
  exercises BOTH the now-broadcast prepass items AND the bg-motion-
  vectors broadcast pass; a regression in either lifecycle would show.
  Sixth consecutive determinism confirmation.
- transmission 3162897 md5 6560d0d1bd00af9936b829e37fac4565 — clean-HEAD
  is now image-content non-deterministic (3 pre-impl runs on HEAD
  109bde0 produced 2 distinct sizes and 3 distinct md5s: 3162956
  ff8e4441, 3162897 f785c7eb, 3162897 6560d0d1). Post-impl md5 6560d0d1
  matches the third pre-impl run exactly — the post-edit output is in
  the set of clean-HEAD outputs, confirming the edit is runtime no-op at
  view_count=1. F2 lifecycle for transmission's per-layer dispatch is
  preserved as cold-read predicted.
- atmosphere 3542276 md5 eaf0bac6a7b82062c730bcd988134554 — drifted per
  the session-28-recorded content non-determinism. Backstopped by the 5
  strong witnesses above and the F2 reasoning chain.
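
The membership check used for the transmission witness can be written down as a short sketch. The `Capture` type and both function names are hypothetical; the sizes and md5 prefixes are the three pre-impl runs quoted in the transmission entry above.

```rust
/// One screenshot capture, identified by byte size and md5 prefix.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Capture {
    pub bytes: u64,
    pub md5_prefix: &'static str,
}

/// The three clean-HEAD transmission runs quoted in the commit message.
pub fn clean_head_runs() -> [Capture; 3] {
    [
        Capture { bytes: 3162956, md5_prefix: "ff8e4441" },
        Capture { bytes: 3162897, md5_prefix: "f785c7eb" },
        Capture { bytes: 3162897, md5_prefix: "6560d0d1" },
    ]
}

/// A non-deterministic witness passes if the post-edit capture equals any
/// capture observed on the unmodified tree.
pub fn matches_any_baseline(post_edit: Capture, baselines: &[Capture]) -> bool {
    baselines.contains(&post_edit)
}
```

A capture of 3162897 bytes with md5 prefix `6560d0d1` passes this check, whereas a capture that matches only on size does not.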

No-op argument: `RenderPipelineDescriptor.multiview_mask` and
`RenderPassDescriptor.multiview_mask` both default to `None` and the
field-set gate is `view_count > 1`. At view_count = 1 the post-edit
descriptors are type-level identical to the pre-edit ones. The per-eye
loop at view_count=1 ran once with `get_attachment_for_layer(0,
StoreOp::Store)`; the synthesized D2 view is bit-identical to
`default_view` per the session-16 docstring, so the legacy and per-
layer paths produce wgpu-equivalent attachments. Latch lifecycle is
preserved (both paths flip the global to false on first call, Load on
second). The 4 byte-identical witnesses confirm the no-op claim
empirically; transmission's bit-match against a clean-HEAD output
covers the non-deterministic case.
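
The type-level part of that argument can be illustrated with a trimmed-down descriptor. `PassDescriptor` here is a stand-in with two fields, not the real wgpu or Bevy type, and `pre_edit` / `post_edit` are hypothetical names for the two ways of building it.

```rust
use std::num::NonZeroU32;

/// Stand-in descriptor: only a label and the field under discussion.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct PassDescriptor {
    pub label: Option<&'static str>,
    pub multiview_mask: Option<NonZeroU32>,
}

/// Pre-edit shape: the mask is left to the `Default` impl (`None`).
pub fn pre_edit() -> PassDescriptor {
    PassDescriptor { label: Some("prepass"), ..Default::default() }
}

/// Post-edit shape: the mask is set explicitly, gated on `view_count > 1`.
pub fn post_edit(view_count: u32) -> PassDescriptor {
    let multiview_mask = if view_count > 1 && view_count <= 32 {
        NonZeroU32::new(u32::MAX >> (32 - view_count))
    } else {
        None
    };
    PassDescriptor { label: Some("prepass"), multiview_mask }
}
```

At a view count of 1 the two constructors produce equal values, so any difference in output at that count would have to come from somewhere other than the descriptor.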

Cold-read review combined in-session per session-conventions §review-
cadence (Shape D twice-established after session 28). Pipeline-vs-pass
agreement verified (both sites derive from ExtractedMultiview.subviews.
len() via the cache-keyed pipeline-key field and the runtime query
respectively); view_count == 32 shift-safe (u32::MAX >> 0 = u32::MAX);
view_count == 1 no-op (verified empirically); F2 seeding still correct
for transmission downstream consumer (init_per_layer_first_call seeds
from global=false → all per-layer slots false → Load on every eye);
bg-motion-vectors lexical ordering preserved (prepass-items first,
bg-motion-vectors second). Zero actionable findings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Converts the deferred prepass node's per-eye dispatch loop (session 18,
ac409d6) into a single Shape D broadcast pass, parallel to session
29's forward-prepass flip. Node-only edit: the
`PrepassPipelineSpecializer` host edit (session 29, 90bb176) already
covers `Opaque3dDeferred` / `AlphaMask3dDeferred` dispatches through
`DrawPrepass` per `prepass/mod.rs:169-170`, so no pipeline-side change
lands this session. Pipeline-vs-pass agreement holds because both gates
derive from `ExtractedMultiview.subviews.len()`.

Pass-side (`bevy_core_pipeline/src/deferred/node.rs`, +116/-99 = +17
net): adds `NonZeroU32` import; removes the `for eye in 0..view_count`
loop; replaces per-layer `get_attachment_for_layer(eye)` /
`get_attachment_for_layer(eye, StoreOp::Store)` with legacy
`get_attachment()` / `get_attachment(StoreOp::Store)` for all four
gbuffer color slots + depth; sets the pass descriptor `multiview_mask`
to `NonZeroU32::new(u32::MAX >> (32 - view_count))` under
`view_count > 1`. The `copy_texture_to_texture` at the end stays
unchanged — already handles multi-layer per its session-16 C2-A
docstring (`depth_or_array_layers = view_count`). The webgl
`clear_texture` block at function-top is unchanged; its stale "stays
outside the per-eye loop" comment is rewritten to "runs before the
broadcast pass" without changing meaning.

F2 lifecycle (session 22, d92e550) traced across three configurations:

- Forward + deferred prepass (deferred_rendering case): forward prepass
  ran first (session 29 broadcast pass) and flipped Normal /
  MotionVectors / depth global latches to false via legacy calls.
  Deferred entry: legacy `get_attachment()` on Normal / MotionVectors
  returns Load (preserves forward output); legacy
  `get_attachment(StoreOp::Store)` on depth returns Load (preserves
  forward depth); deferred-specific Deferred / DeferredLightingPassId
  latches are untouched by forward prepass → legacy returns Clear (first
  write to the gbuffer).
- Deferred-only configuration: all latches untouched on entry; the
  broadcast pass writes a fresh depth + gbuffer pass on first call.
- Early + late (late gated on occlusion_culling): early flips all
  attachment latches to false; late's legacy calls return Load and
  preserve early's output. Identical to the per-eye-loop pre-edit
  behavior at view_count=1 because session-22 F2 seeds per-layer slot
  0 from the now-false global, returning Load on the late per-layer
  call.
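
The first two configurations can be checked against a small model in which each attachment has its own latch. This is a sketch under assumptions: `FrameLatches` and the two pass functions are hypothetical, attachment names are plain strings, and the model tracks only which attachments have already been requested in the frame.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoadOp {
    Clear,
    Load,
}

/// Per-frame record of which attachments have already been requested.
#[derive(Default)]
pub struct FrameLatches {
    requested: HashSet<&'static str>,
}

impl FrameLatches {
    /// First request for a name clears it; later requests load it.
    pub fn get_attachment(&mut self, name: &'static str) -> LoadOp {
        // `insert` returns true only when the name was not present yet.
        if self.requested.insert(name) { LoadOp::Clear } else { LoadOp::Load }
    }
}

/// Forward prepass touches normal, motion vectors, and depth.
pub fn forward_prepass(latches: &mut FrameLatches) -> Vec<(&'static str, LoadOp)> {
    ["normal", "motion_vectors", "depth"]
        .iter()
        .map(|&name| (name, latches.get_attachment(name)))
        .collect()
}

/// Deferred prepass touches those three plus the two gbuffer attachments.
pub fn deferred_prepass(latches: &mut FrameLatches) -> Vec<(&'static str, LoadOp)> {
    ["normal", "motion_vectors", "deferred", "deferred_lighting_pass_id", "depth"]
        .iter()
        .map(|&name| (name, latches.get_attachment(name)))
        .collect()
}
```

Running `forward_prepass` and then `deferred_prepass` on the same `FrameLatches` gives Load for normal, motion vectors, and depth, and Clear for the two gbuffer attachments; running `deferred_prepass` alone gives Clear for all five.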

Smoke verification (4 byte-deterministic witnesses, all identical to
both pre-impl runs on HEAD 90bb176 and registry baseline):

- deferred_rendering 3628497 md5 d04bdbd9d89dd2ed8e8fe2d61ac6ac2b —
  **PRIMARY witness**, exercises the deferred prepass node directly.
- 3d_scene 697355 — cross-cutting; no DeferredPrepass component, so the
  new path never opens — confirms surrounding render graph behavior
  unchanged.
- volumetric_fog 3709234 — cross-validates that the deferred prepass
  edit doesn't perturb forward-prepass + skybox + fog interaction.
- motion_blur 4492372 md5 219b899f7f82a5f8d2895e884260f99d — cross-
  validates that session 29's forward prepass Shape D + this session's
  deferred Shape D both stay correct under the bg-motion-vectors
  broadcast pass lifecycle. Eighth consecutive determinism confirmation.

No-op argument: `RenderPassDescriptor.multiview_mask` defaults to
`None`; the field-set gate is `view_count > 1`. At view_count=1 the
post-edit pass descriptor is type-level identical to the pre-edit
`multiview_mask: None`. The per-eye loop at view_count=1 ran once with
`get_attachment_for_layer(0, ...)`; the synthesized D2 view is
bit-identical to `default_view` per the session-16 docstring, so
legacy and per-layer paths produce wgpu-equivalent attachments. Latch
lifecycle preserved per the F2 trace above. 4 byte-identical witnesses
confirm empirically.

Cold-read review combined in-session per session-conventions
§review-cadence (Shape D thrice-established after sessions 28 + 29).
Pipeline-vs-pass agreement verified (both derive from
ExtractedMultiview.subviews.len() via the cache-keyed pipeline-key
field and the runtime query respectively); view_count == 32 shift-safe
(u32::MAX >> 0 = u32::MAX); view_count == 1 no-op (verified empirically
by all 4 byte-identical witnesses); F2 traced across forward+deferred,
deferred-only, and early+late configurations against all six attachment
surfaces (Normal, MotionVectors, Deferred, DeferredLightingPassId,
depth, span). One in-session cosmetic fix (R1): rewrote the stale
"Stays outside the per-eye loop" comment on the webgl clear_texture
block to "Runs before the broadcast pass". No other actionable findings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The branch was developed in incremental sessions whose internal layer
naming (L7b-write, L7c, L7d, Shape A/A1/D, C2-A, F2 lifecycle, etc.)
seeped into ~21 in-tree comments across 15 files during the layered
implementation. Those labels carry no meaning outside this branch's
working notes; this commit rewrites each comment to describe the
substance in plain rendering vocabulary, without changing any behavior.

Strip categories:

- "L7d" / "L7d (Shape D)" comment prefix on 12 broadcast-mask
  explanations (FXAA, tonemapping, skybox, bg-motion-vectors,
  main-opaque, forward + deferred prepass nodes, plus the matching
  pipeline-side comments in `MeshPipeline` and `PrepassPipelineSpecializer`):
  drop the prefix, rephrase the lead sentence to "Broadcast across
  every eye layer in a single pass." / "Broadcast every <phase> dispatch
  ... under multiview." Substance after the lead sentence is unchanged.
- "F2 lifecycle on entry" → "Attachment lifecycle on entry" in the
  deferred-prepass node's lifecycle paragraph.
- "L7b-write" / "pre-L7b-write" / "post L7b-write" forward-references
  in 4 files (`mesh_view_bindings.rs`, `prepass_bindings.rs`,
  `ssao/mod.rs`, `transmission/node.rs`): rewrite each to describe the
  present runtime behavior rather than a temporal reference to a
  past session phase.

Stale-comment corrections (folded in because the jargon strip
exposed them):

- `mesh_view_bindings.rs:876-879` previously claimed "The SSAO texture
  is single-layer" and "every eye reads layer 0". That description
  predates the per-eye SSAO texture growth; the actual current
  behavior is `view_count` layers under multiview, with the consumer
  at `pbr_fragment.wgsl:660` reading its eye's slice via
  `current_view_index`. Comment rewritten to match.
- `mesh_view_bindings.rs:897-899` previously claimed "single-layer
  transmission texture" — also stale post the per-eye transmission
  texture growth. Comment rewritten.
- `prepass_bindings.rs:80-84` docstring previously said "the
  underlying textures are still single-layer this session" — stale.
  Rewritten to describe the multi-layer-under-multiview shape and
  the consumer reading its eye's slice via `current_view_index`.
  This is the only `///` doc-visible string in the strip.
- `ssao/mod.rs:727-735` previously framed the prepass-layer-count
  gate as "until C2 grows them this auto-upgrades" — stale.
  Rewritten to describe the gate as a runtime check on each
  texture's actual layer count, since SSAO can be configured
  with multi-layer SSAO output even when prepass attachments
  remain single-layer.

Workspace-wide `cargo check` is clean. No behavior changes; comment-
only diff. 15 files, +96/-96 (every change replaces text in place).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The branch was developed in iterative sessions using AI assistance;
this rule keeps the per-session planning and review notes (which live
in .scratch/) out of committed history. The notes themselves are
working artifacts and are not part of the PR contents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

Welcome, new contributor!

Please make sure you've read our contributing guide, as well as our policy regarding AI usage, and we look forward to reviewing your pull request shortly ✨

@github-actions
Contributor

Your PR caused a change in the graphical output of an example or rendering test. This might be intentional, but it could also mean that something broke!
You can review it at https://pixel-eagle.com/project/B04F67C0-C054-4A6F-92EC-F599FEC2FD1D?filter=PR-24422

If it's expected, please add the M-Deliberate-Rendering-Change label.

If this change seems unrelated to your PR, you can consider updating your PR to target the latest main branch, either by rebasing or merging main into it.
