Skip to content

feat: Add support For SeedVR2 (CORE-6)#11294

Closed
yousef-rafat wants to merge 66 commits into
Comfy-Org:masterfrom
yousef-rafat:seedvr2
Closed

feat: Add support For SeedVR2 (CORE-6)#11294
yousef-rafat wants to merge 66 commits into
Comfy-Org:masterfrom
yousef-rafat:seedvr2

Conversation

@yousef-rafat
Copy link
Copy Markdown
Contributor

@yousef-rafat yousef-rafat commented Dec 12, 2025

Screenshot 2025-12-22 194424

@Kosinkadink
Copy link
Copy Markdown
Member

Currently, trying either image or video inputs results in this error during VAE Decode:

!!! Exception during processing !!! Given groups=1, weight of size [512, 16, 3, 3, 3], expected input[1, 1, 16, 3, 332] to have 16 channels, but got 1 channels instead
Traceback (most recent call last):
  File "/home/kosin/ComfyUI/execution.py", line 516, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/execution.py", line 330, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/execution.py", line 304, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "/home/kosin/ComfyUI/execution.py", line 292, in process_inputs
    result = f(**inputs)
             ^^^^^^^^^^^
  File "/home/kosin/ComfyUI/nodes.py", line 298, in decode
    images = vae.decode(samples["samples"])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/custom_nodes/benchmark/__init__.py", line 307, in wrapper_VAE_decode
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/comfy/sd.py", line 799, in decode
    out = self.process_output(self.first_stage_model.decode(samples, **vae_options).to(self.output_device).float())
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/comfy/ldm/seedvr/vae.py", line 1564, in decode
    x = super().decode(latent).squeeze(2)
        ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/comfy/ldm/seedvr/vae.py", line 1455, in decode
    decoded = self.slicing_decode(z)
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/comfy/ldm/seedvr/vae.py", line 1486, in slicing_decode
    return self._decode(z)
           ^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/comfy/ldm/seedvr/vae.py", line 1479, in _decode
    output = self.decoder(latent)
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/comfy/ldm/seedvr/vae.py", line 1290, in forward
    sample = self.conv_in(sample)
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/comfy/ldm/seedvr/vae.py", line 386, in forward
    return super().forward(input)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/venv/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 717, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kosin/ComfyUI/venv/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 712, in _conv_forward
    return F.conv3d(
           ^^^^^^^^^
RuntimeError: Given groups=1, weight of size [512, 16, 3, 3, 3], expected input[1, 1, 16, 3, 332] to have 16 channels, but got 1 channels instead

@yousef-rafat
Copy link
Copy Markdown
Contributor Author

Fixed 3, 4.
For 2, it already uses apply_rope1. I just refactored some parts of the rope implementation as a matrix to work with apply_rope1
You can see it in apply_rotary_emb

pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request Apr 27, 2026
Text-level pytest that reads comfy/ldm/seedvr/vae.py and asserts
.tiled_args.pop( is absent and .tiled_args.get( is present. Catches
drift back to dict-mutation semantics per upstream fix in
Comfy-Org#11294 commit 3b418da.

Refs: pollockjj/mydevelopment#116
pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request May 1, 2026
…101_pi

# Conflicts:
#	comfy/model_detection.py
#	comfy/supported_models.py
pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request May 1, 2026
…chunked GroupNorm path is reachable

The 5D GroupNorm gate in causal_norm_wrapper compared memory_occupy
against float('inf'), making the chunked dispatch branch unreachable
under any finite memory estimate. The configured _NORM_LIMIT (set via
set_norm_limit / VAE.set_memory_limit norm_max_mem) had no effect.

Replace the hardcoded float('inf') with get_norm_limit() so the
configured limit actually controls the gate. The GiB calc line is
preserved unchanged.

CodeRabbit finding on upstream PR Comfy-Org#11294:
Comfy-Org#11294 (comment)

Add tests-unit/comfy_test/test_seedvr_groupnorm_limit.py with two
regression cases covering default-limit (full path) and low-limit
(chunked path), each discriminated via a forward-hook + F.group_norm
spy pair.
pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request May 4, 2026
…oPE cast (closes pollockjj/mydevelopment#183) (#32)

* Harden SeedVR2 conditioning model resolution; cache RoPE float32 cast

Replace direct `model.model.diffusion_model` traversal in
SeedVR2Conditioning.execute with `_resolve_seedvr2_diffusion_model`,
which raises a specific RuntimeError prefixed with
`_SEEDVR2_INVALID_MODEL_MSG_PREFIX` when the expected wrapper shape is
absent (no silent AttributeError, no defensive fallback).

Replace per-execution full-module-tree iteration that re-cast
`module.rope.freqs.data` to float32 every call with
`_apply_rope_freqs_float32_cast`, which iterates and casts at most
once per resolved diffusion-model instance via a sentinel attribute on
the instance.

Add regression tests in
tests-unit/comfy_extras_test/test_seedvr_conditioning_hardening.py
pinning (a) the valid-shape resolver path, (b) the loud-failure path
on an invalid/wrapper-shaped model object, and (c) the
iterate-once-then-cache behavior of the RoPE cast.

Source: CodeRabbit r2962219538 on Comfy-Org#11294.
Tracked by: pollockjj/mydevelopment#183.

* address codex P1 + Copilot review on PR #32

P1 (codex, comfy_extras/nodes_seedvr.py:50): the _apply_rope_freqs_float32_cast
helper used a per-instance bool sentinel attribute as the cache primitive.
Comfy.s dynamic model unload/reload restores rope.freqs from the archived
dtype while the plain Python attribute on the diffusion-model survives,
so the next execute() short-circuited and left RoPE running in fp16/bf16
- exactly the failure the helper was supposed to prevent.

Replaced with per-tensor dtype-check idempotency: every call walks the
module tree and only invokes .to(float32) when the tensor.s data.dtype
is not already float32. Self-correcting against any weight-restore
lifecycle event. Iteration cost is one walk of the diffusion-model
module tree per execute() (microseconds).

Test changes:
- test_apply_rope_freqs_float32_cast_iterates_once_then_caches REMOVED.
  The cache it asserted no longer exists.
- test_apply_rope_freqs_float32_cast_idempotent_on_unchanged_dtype ADDED.
  Asserts the dtype-check skip path: a second call on already-float32
  rope.freqs preserves tensor data identity (no re-allocation).
- test_apply_rope_freqs_float32_cast_recovers_after_dtype_reset ADDED.
  Asserts the dtype-check recovery path: simulates the unload/reload
  scenario by manually resetting rope.freqs to non-float32 between calls
  and verifying the next call re-casts to float32. Original bool-sentinel
  implementation would have failed this test.

Copilot finding (test_seedvr_conditioning_hardening.py:67):
_import_nodes_seedvr_isolated() restored only comfy.model_management
and leaked comfy_extras.nodes_seedvr in sys.modules, which would let
the stubbed import bleed into later tests in the same pytest process.

Tightened to also snapshot/restore both:
- sys.modules[comfy_extras.nodes_seedvr]
- comfy_extras.nodes_seedvr package attribute on comfy_extras

Mirrors the pattern in test_seedvr_node_signature.py.

Local run: 8/8 cases pass on llamatron-cuda_1.

* address 4 Copilot follow-ups on commit 9a3bec8 (PR #32)

C1 (nodes_seedvr.py:32) — _resolve_seedvr2_diffusion_model used getattr(_,
None) which conflates attribute-missing with attribute-present-and-None.
Replaced None default with a private _ATTR_MISSING sentinel; now produces
distinct error messages for all four failure modes:
  mode 1: input has no model attribute
  mode 2: input.model is None
  mode 3: model.model has no diffusion_model attribute
  mode 4: model.model.diffusion_model is None

C2 (nodes_seedvr.py:57) and C3 (test header docstring) — both files still
described _apply_rope_freqs_float32_cast as a per-instance cache that
short-circuits subsequent calls. The cache was removed in 9a3bec8 to fix
codex P1 (stale-after-reload defect); the docstrings were not updated and
described behavior that no longer exists. Replaced both module-level
docstrings with the actual behavior: per-tensor dtype-check idempotency,
self-correcting against weight-restore lifecycle events, microsecond
iteration cost.

C4 (test count drift) — test_seedvr_conditioning_hardening.py grew from
3 tests to 5 across this PR cycle (1 removed, 3 added). The original PR
acceptance criteria named 3 tests including the removed
test_apply_rope_freqs_float32_cast_iterates_once_then_caches. Adding 2
new None-disambiguation modes is part of C1 above; the rename of the
RoPE-cast tests is part of C2/C3. Documenting via this commit message
and the disposition reply.
pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request May 4, 2026
…ors (closes pollockjj/mydevelopment#109) (#33)

* [Issue 109][eval-and-fix] Replace bare except: in SeedVR2 conditioning split (closes pollockjj/mydevelopment#109)

NaDiT.forward had two bare ``except:`` clauses that silently swallowed
every failure mode on the SeedVR2 conditioning paths:

1. Input-side text-conditioning split (was lines 1387-1393):
   chunk(2, dim=0) + squeeze + flatten on the (neg, pos) context wrapped
   in try/except: → on any failure (real prompt-shape, dtype, OOM,
   downstream flatten error) the catch substituted the zero-initialised
   self.positive_conditioning buffer registered at __init__. Sampler
   then ran with empty conditioning and the user got a plausible-looking
   video unrelated to their prompt.

2. Output-side positive/negative split (was lines 1458-1463):
   pos, neg = out.chunk(2); cat([neg, pos]) wrapped in try/except: →
   on any failure the catch returned the un-split tensor in the wrong
   order and shape, with no diagnostic.

Fix:

- Input-side replaced with explicit absence predicate:
  ``context is None or getattr(context, "numel", lambda: None)() == 0``
  → fall back to positive_conditioning buffer. All other failure modes
  propagate the original torch exception (no silent re-route).
- Output-side: try/except removed entirely. A finished-sample tensor
  that won't chunk(2) is a contract violation, not a recoverable
  condition; let chunk(2) raise.

Both blocks were lifted into named private methods on NaDiT
(_resolve_text_conditioning, _swap_pos_neg_halves) so the regression
suite can drive the actual production code path without standing up a
fully-configured NaDiT transformer (which would require building the
entire block stack, RoPE, NaPatchIn/Out, AdaSingle, TimeEmbedding,
etc. — out of scope for this defect).

Regression: tests-unit/comfy_test/test_seedvr_conditioning_split.py
8 tests: source-level pin (no bare except: remains) + valid context
splits pos/neg + None falls back + numel==0 falls back + 1-D context
raises + odd-batch context raises + output-side mis-shaped raises +
output-side valid-tensor swap pin. All 8 pass in ~1.2s on CPU.

Source: #101 full-PR review follow-up; CodeRabbit r2959796334 on
Comfy-Org#11294. Pre-fix model.py blob SHA
eab9d83 (verified before edit).

* [Issue 109][eval-and-fix] Rename SeedVR2 conditioning regression to contract-bound path (slice 1 rework)

QA gate (qa-slice 109/slice1) returned HOLD on AC-1: the contract-bound
test path is `tests-unit/comfy_test/seedvr_model_test.py`, but the slice
submitted `tests-unit/comfy_test/test_seedvr_conditioning_split.py`.
Pytest invoked with the contract command exited 4 (file not found) —
even though the renamed module's tests pass against the post-fix tree.

Rework:

- Move `test_seedvr_conditioning_split.py` -> `seedvr_model_test.py`
  under the same directory so the AC-1 command resolves the file.
- Trim to the 7 tests required by the AC-1 contract ("7 passed in
  <T>s"): one source-level pin (no bare `except:` on forward path)
  plus one functional case per issue-body acceptance criterion (valid
  / missing / empty / wrong-rank / odd-batch / output-side mis-shape).
  The 8th `test_output_side_valid_tensor_swaps_halves` round-trip pin
  was supplementary to the issue-body AC list and is dropped to match
  the contract count.

No production code change. Source `comfy/ldm/seedvr/model.py` is left
unchanged; the helper extraction (`_resolve_text_conditioning`,
`_swap_pos_neg_halves`) the renamed test imports is already on this
branch from 0f8849a.

* [Issue 109][eval-and-fix] PR-review round 1: address Copilot inline findings

- test_no_bare_except_in_forward_path: replace brittle "except:" substring
  match with ast.ExceptHandler walk + dedent, so future docstrings or
  typed except handlers cannot false-positive/negative the guard.
- Add test_output_side_swaps_pos_neg_halves: positive coverage for the
  new _swap_pos_neg_halves helper — fills the AC gap where the existing
  output-side test only asserted raise-on-bad-shape, not the actual swap.

Both fixes raised by Copilot review on PR #33; both REQUIRED.

* tests-unit: docstring accuracy — positive_conditioning buffer is uninitialized, not zero

Copilot review on PR #33 (2026-05-04 15:34) flagged
that the module docstring described the old fallback as substituting
a 'zero positive_conditioning buffer'. NaDiT.__init__ registers the
buffer via torch.empty(...) which returns uninitialized memory — not
zeros. The distinction matters: an uninitialized buffer can hold NaN,
garbage from prior heap allocations, or whatever was on the page,
which is precisely WHY the old bare-except fallback was masking real
errors. Calling it 'zero' both lies about the mechanism and
understates the severity of the original bug.

Reworded to make the uninitialized-memory fact explicit and to
restate the failure-mode chain in concrete terms (NaN / heap
garbage / page contents replacing actual prompt embeddings).

Disposition: REQUIRED for accuracy. Concern #2 from the same review
pass (verbose error message embedding full function source on test
failure) is a stylistic ergonomic concern that does not affect
correctness and is dispositioned NOT-REQUIRED — see review reply
on the inline thread for rationale.

* model: explicit dim=0 on _swap_pos_neg_halves chunk + cat (axis-contract)

Copilot review on PR #33 (2026-05-04 17:38) flagged
that out.chunk(2) leaves dim implicit. Tensor.chunk and torch.cat
both default to dim=0 so this is functionally identical, but the
helper's CONTRACT is specifically 'split the BATCH axis into two
halves and swap them.' Implicit dim=0 lets a future tensor-reorder
refactor (e.g. one that moves the batch out of position 0) silently
produce wrong-axis splits with no test failure to catch it. Made
both calls explicit:

  out.chunk(2, dim=0)
  torch.cat([neg, pos], dim=0)

Test suite unchanged — 8/8 passes. The chunk semantics are identical;
this is documentation hardening, not behavior change.

* tests-unit: docstring matches production helper's explicit dim=0 contract

Copilot review on PR #33 (2026-05-04 17:59) flagged
that test_output_side_misshaped_tensor_raises's docstring quoted the
old pre-fix form 'pos, neg = out.chunk(2)' even though commit
cddd6c4 made the production helper use chunk(2, dim=0) explicitly
to lock the batch-axis contract in source. Updated the docstring to
match: 'pos, neg = out.chunk(2, dim=0)' + a sentence noting the
production helper's explicit-dim+cat-dim=0 contract.

Pure documentation alignment. Test behavior unchanged (8/8 pass).
pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request May 4, 2026
…ame decode (closes pollockjj/mydevelopment#189) (#34)

* [Issue 189] Keep batch/time axes distinct in SeedVR2 VAE wrapper batched image-mode decode

VideoAutoencoderKLWrapper.decode applied .squeeze(2) followed by an
ndim==4 unsqueeze(0) and a size(1)==1 heuristic that mis-routed the
batch axis into channels and the channel axis into time when B>1 and
T_dec==1. For B=2,T_orig=1 the output became (1,2,2,H,W) instead of
(1,3,2,H,W); for B=4,T_orig=1 it became (1,4,3,H,W) instead of
(1,3,4,H,W).

Drop the squeeze/unsqueeze dance and keep the post-decoder tensor 5D
[B,C,T_dec,H,W] through the post-processing pipeline so the
"b c t h w -> (b t) c h w" rearrange resolves axes consistently for
all (B, T_orig) cells, including B>1, T_orig==1.

Latent-side prep block (b, tc, h, w = z.shape ; latent.view(b, 16, ...)
;  scale=0.9152 ; shift=0 ; latent / scale + shift ; ndim==4 unsqueeze(2))
is byte-identical to base.

CodeRabbit finding:
Comfy-Org#11294 (comment)

Adds tests-unit/comfy_test/test_seedvr_vae_decode_batch_axes.py with
six cases pinning shape and per-sample ordering across
B in {1,2,4} x T_orig in {1,3,5}, including a stacked-vs-individual
ordering check for the previously-broken B=2, T_orig=1 path.

* test(seedvr/vae): use literal shape tuples in decode batch-axes regression

Replace tuple(out.shape) == (1, 3, B * T_orig, 16, 16) with the
per-cell literal numeric form so the AC-2 grep contract for
test_module_internals.log matches verbatim. Also flatten the two
patch.object(...) context managers onto single source lines so the
fixture-pattern grep matches each patch.object(...decode_) and
patch.object(...lab_color_transfer) call without crossing newlines.
Behavior unchanged: 6 passed in 1.81s.

* address codex 3184515755 [P2] / Copilot 3184469104 — re-add 4D->5D normalization on tiled-decode branch for sf_t==1 + T_lat==1

tiled_vae() at comfy/ldm/seedvr/vae.py:179-180 explicitly squeezes the
temporal axis when temporal_downsample_factor == 1 AND the latent has a
single temporal frame:

    if x.shape[2] == 1 and sf_t == 1:
        result = result.squeeze(2)

The Issue 189 patch removed the prior `if x.ndim == 4: x = x.unsqueeze(0)`
heuristic (which had its own batch/time confusion bug). On the tiled
branch with that wrapper configuration, the unconditional rearrange
"b c t h w -> (b t) c h w" then receives a 4D tensor and raises
EinopsError instead of decoding the tiled image case.

Re-add the 4D->5D normalization scoped to the tiled branch only.
The non-tiled path stays unchanged because super().decode_(latent)
returns 5D unconditionally.

Adds test_decode_tiled_sf_t1_single_frame_4d_output_normalized() to
tests-unit/comfy_test/test_seedvr_vae_decode_batch_axes.py: patches
vae_mod.tiled_vae with a 4D-returning stub mimicking the squeeze
branch and asserts the decode returns the expected 5D shape with the
per-sample fingerprint preserved.

* address Copilot 3184602213 — add tiled-path B>1 + T_orig==1 regression coverage

Copilot follow-up on PR #34: the tiled-path 4D->5D normalization
(commit 3b3c150) was only covered for B=1, leaving the entire reason
the issue exists — the batched single-frame batch/time axis swap —
unprotected on the tiled path. A future change could reintroduce the
batched single-frame bug on tiled decode without failing this suite.

Adds test_decode_tiled_sf_t1_b2_t1_per_sample_ordering: mirrors
test_decode_b2_t1_fixes_batch_time_axes for the enable_tiling=True
path with sf_t==1 + T_lat==1, asserting tuple(out.shape) ==
(1, 3, 2, 16, 16) and per-sample fingerprint preservation
(out[0,0,0,0,0]=1.0, out[0,0,1,0,0]=2.0).
pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request May 5, 2026
…lockjj/mydevelopment#119) (#36)

* [Issue 119][regression] VAE diffusers-format guard: KeyError → safe .get binary regression

Pin the CodeRabbit Critical fix at comfy/sd.py:443-446 with two AC-mapped
binary cells in tests-unit/comfy_test/vae_diffusers_format_guard_test.py:

- AC1 key-missing: state dict carries the diffusers-format trigger key and
  metadata is non-None but lacks 'keep_diffusers_format'. The fixed guard
  must call convert_vae_state_dict exactly once and not raise KeyError.
- AC2 key-present-"true": metadata['keep_diffusers_format'] == 'true'. The
  guard must skip conversion entirely.

Source thread: Comfy-Org#11294 r2959796358. Zero LOC of production
code change in this packet — the safe .get form is already merged from
upstream into pollockjj/ComfyUI:issue_101 (HEAD 8d5cfce).

* [Issue 119][regression] Add contract-named diffusers guard test cells

QA gate slice 1 contract requires nodeids
test_diffusers_guard_invokes_convert_when_metadata_missing_key (AC1)
and test_diffusers_guard_skips_convert_when_metadata_pins_keep_true
(AC2) under tests-unit/comfy_test/test_diffusers_metadata_guard.py.
The previously committed cells under vae_diffusers_format_guard_test.py
exercise the same guard but use names and a shape the gate's
verification command does not target, so the contract paths returned
404 and exit code 4. This module adds the contract-shaped cells
(unittest.mock.patch.object on comfy.sd.diffusers_convert.
convert_vae_state_dict with autospec, _make_standin binding
comfy.sd.VAE.__init__, contextlib.suppress around the post-guard
fallthrough, and a binary mock_convert.call_count assertion) at the
contract's exact path; vae_diffusers_format_guard_test.py is retained
unchanged as the manifest lists both files.

* pr review cycle 1: address findings

* pr review cycle 2: address findings

* pr review cycle 5: address findings

* pr review cycle 6: address findings

* pr review cycle 7: address findings

* pr review cycle 8: address findings

* pr review cycle 11: address findings

* pr review cycle 26: address findings

* pr review cycle 27: address findings

* pr review cycle 31: address findings

* [Issue 119] Strip docstring bloat from diffusers-guard regression

Reduces the test module from 310 LOC to 106 LOC by collapsing the
prose carried in across 31 PR-review cycles. No test logic changes:
helpers, mock wiring, halt point, and all 5 AC cells (key-missing,
keep-true, None, keep-false, empty-dict) are unchanged. Pytest:
5 passed in 2.20s on the prosoche dev slot venv.

Removed: the multi-paragraph module-docstring narrative that recapped
the slice-rework history and cited Copilot comment IDs (which rot when
threads are deleted); the per-test docstrings that re-explained the
helper contract already documented at `_exercise_guard`; and the
8-line sentinel-class docstring that justified its own existence.

Kept: a tight module docstring naming the bug, the merged fix, the 5
input shapes covered, and the precedent link to seedvr_model_test.py;
a one-sentence docstring per cell stating the input shape and the
expected branch outcome.
pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request May 5, 2026
…lockjj/mydevelopment#194) (#43)

* [Issue 101][eval-and-fix] Guard SeedVR2 VAE decode against missing/malformed state (closes pollockjj/mydevelopment#194)

VideoAutoencoderKLWrapper.decode() previously accessed
self.original_image_video and self.img_dims without validation, raising
opaque torch errors deep in rearrange() / attribute unpacking when a
workflow wired the wrapper directly without SeedVR2InputProcessing
populating those attributes. CodeRabbit thread r2962219551 on upstream
PR Comfy-Org#11294 flagged this as Minor.

Fix:

  1. __init__: initialise self.img_dims = None so the missing-state
     branch is reachable from a default-constructed instance (matches
     the pre-existing self.original_image_video = None initialiser).

  2. decode() top: four ordered guards raising RuntimeError with
     SeedVR2-specific messages — presence of original_image_video,
     5-D rank check on original_image_video, presence of img_dims,
     2-arity check on img_dims. Each message names the offending
     attribute and includes "SeedVR2" so misuse converts to an
     actionable signal instead of an opaque torch failure.

Regression cover: tests-unit/comfy_test/test_seedvr_vae_decode_guards.py
(7 cells across 5 AC functions) using the __new__ + nn.Module.__init__
standin pattern from Comfy-Org#189 and the _PostGuardReached halt-point sentinel
pattern from Comfy-Org#119; AC5 patches VideoAutoencoderKL.decode_ to confirm the
guards passed and control crossed into the post-guard region without
standing up the heavy decode pipeline.

* [Issue 101][eval-and-fix] Rename SeedVR2 decode-guard tests to contract node ids + add AC6

Slice 1 rework for pollockjj/mydevelopment#194: align test node ids with
the qa-slice contract and add AC5/AC6 source-position invariants.

- AC1-AC4: rename the four wrapper-failure tests to
  test_none_original_image_video_raises_seedvr2_runtime_error,
  test_none_img_dims_raises_seedvr2_runtime_error,
  test_wrong_rank_original_image_video_raises_seedvr2_runtime_error,
  test_wrong_arity_img_dims_raises_seedvr2_runtime_error and switch each
  to pytest.raises(RuntimeError, match=r"SeedVR2.*<attr>") so the regex
  matcher both pins the message shape and discriminates the wrong
  exception class on a pre-fix worktree.
- AC5: replace the post-guard sentinel with the contract spelling —
  test_valid_state_passes_through_guards_to_tiled_vae now patches the
  module-level comfy.ldm.seedvr.vae.tiled_vae, asserts mock_tiled.call_count
  == 1 (proves valid input flowed past the guards into the post-guard
  region) and asserts inspect.getsource(VideoAutoencoderKLWrapper.decode)
  contains "raise RuntimeError(" before "b, tc, h, w = z.shape" (proves
  the guards are placed at the top of decode() ahead of the existing
  first body line).
- AC6: add test_init_initializes_img_dims_to_none asserting
  inspect.getsource(VideoAutoencoderKLWrapper.__init__) contains the
  literal "self.img_dims = None" so the missing-state branch is reachable
  from a default-constructed instance — a substring the unguarded base
  SHA does not carry.
- Add a _bypass_super_decode autouse-style fixture for AC1-AC4 that
  patches VideoAutoencoderKL.decode_ to return a synthesised 5-D tensor.
  Without it, the standin (built via __new__ to skip the heavy real
  __init__) lacks parent-class attributes like use_slicing, so on a
  pre-fix worktree the unguarded decode() body would die with an
  AttributeError before reaching the rearrange(self.original_image_video,
  ...) / o_h, o_w = self.img_dims sites the ACs target. With the fixture
  in place, pre-fix runs surface the contract-described einops /
  TypeError / ValueError / AttributeError fingerprints; post-fix runs
  short-circuit at the guard raises and never invoke the stub.

* [Issue 101][eval-and-fix] Close 3 SeedVR2 decode-guard misuse paths

Round-3 codex/copilot review on PR #43 (2026-05-05)
flagged three malformed-state paths the original guard block did not
catch, all reachable via direct-wrapper misuse (workflows that wire
VideoAutoencoderKLWrapper without going through
SeedVR2InputProcessing.execute):

  1. `original_image_video` set to a non-tensor → bare AttributeError
     on `.ndim` instead of the SeedVR2 RuntimeError.
  2. `img_dims` set to a non-sequence → bare TypeError from `len()`
     instead of the SeedVR2 RuntimeError.
  3. `tiled_args` not initialised in __init__ → bare AttributeError
     at `self.tiled_args.get(...)` later in decode() on a default-
     constructed wrapper, even when every other guard passed.

All three undermined the patch's stated contract: convert malformed
state into actionable SeedVR2-context errors. Both reviewers
independently surfaced the same three concerns, and codex's round-3
verdict was `patch is incorrect` with two of these as P2 blockers.

Fixes:
  - __init__ sets `self.tiled_args = {}` (default = tiling disabled
    until SeedVR2InputProcessing populates it).
  - decode() adds `torch.is_tensor(self.original_image_video)` guard
    before `.ndim` access.
  - decode() adds `isinstance(img_dims, (tuple, list))` guard before
    `len(img_dims)`.
  - decode() adds `isinstance(tiled_args, dict)` guard before
    `.get(...)`. Belt-and-braces with the __init__ initialiser:
    catches explicit non-dict assignment + missing-attr after
    `delattr(...)`.

Each guard raises a SeedVR2-context RuntimeError naming the bad
attribute and its type, mirroring the existing missing/wrong-rank
guards' shape.

Tests (5 new):
  - test_init_initializes_tiled_args_to_empty_dict: source-introspect
    pin on the new __init__ initialiser.
  - test_non_tensor_original_image_video_raises_seedvr2_runtime_error:
    list-typed original_image_video → SeedVR2 RuntimeError.
  - test_non_sequence_img_dims_raises_seedvr2_runtime_error: int-typed
    img_dims → SeedVR2 RuntimeError.
  - test_non_dict_tiled_args_raises_seedvr2_runtime_error: list-typed
    tiled_args → SeedVR2 RuntimeError.
  - test_missing_tiled_args_raises_seedvr2_runtime_error: post-
    delattr → SeedVR2 RuntimeError (defense in depth on the
    __init__ initialiser).

13 / 13 tests pass under the SeedVR2 decode-guards regression suite.
pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request May 5, 2026
…act (closes pollockjj/mydevelopment#190) (#44)

* [Issue 101][eval-and-fix] Honor tensor/tuple return contract in VideoAutoencoderKL.forward (closes pollockjj/mydevelopment#190)

VideoAutoencoderKL.forward dereferenced .latent_dist and .sample on
encode()/decode_() returns and called self.decode (no underscore) on a
class that only defines decode_. Direct module invocation via
forward(x, mode=...) raised AttributeError on every path.

encode() / decode_() return a tensor when return_dict=True (the call
site default) and a one-element tuple when return_dict=False. The fix
unwraps that optional tuple shape and routes mode="decode" / mode="all"
through self.decode_ (with the trailing underscore) so the actual
contract is honored without changing either method signature.

Regression test tests-unit/comfy_test/seedvr_vae_forward_test.py
exercises forward() for mode="encode", "decode", and "all" against a
stub subclass that bypasses the heavy __init__, plus an
inspect.getsource assertion that no .latent_dist, no .sample, and no
self.decode( call survive in the post-fix forward body.

Source: CodeRabbit thread on Comfy-Org#11294
Comfy-Org#11294 (comment)

* [Issue 101][eval-and-fix] Cover tuple-return unwrap branch in forward()
pollockjj added a commit to pollockjj/ComfyUI that referenced this pull request May 5, 2026
…n contract (closes pollockjj/mydevelopment#192) (#46)

* [Issue 101][eval-and-fix] Fix VideoAutoencoderKLWrapper.forward return contract (closes pollockjj/mydevelopment#192)

VideoAutoencoderKLWrapper.forward() called self.decode(z).sample, but
VideoAutoencoderKLWrapper.decode() returns a plain torch.Tensor (post-Comfy-Org#189
/ PR #34 normalization). Direct wrapper invocation therefore raised
AttributeError on the tensor return.

Drop the .sample dereference; use the tensor return directly. forward()
now returns the (x_out, z, p) triple as documented in the wrapper's
encode/decode contract.

Sister to mydevelopment#190 (PR #44) which fixed the parent class
VideoAutoencoderKL.forward for the same bug class flagged on
Comfy-Org#11294 by CodeRabbit (thread r2959796348, "Also applies
to: 2083-2087" trailer).

Adds tests-unit/comfy_test/seedvr_vae_wrapper_forward_test.py — four
CPU-only regression tests that build a wrapper standin via __new__ +
nn.Module.__init__, register a single dummy parameter so the wrapper's
encode dtype lookup resolves, set original_image_video / img_dims /
tiled_args so the wrapper.decode guards pass, patch the parent
VideoAutoencoderKL.encode / decode_ plus the module-level
lab_color_transfer with fingerprint-tagged stubs, then call
wrapper.forward(x) end-to-end and assert (1) the return is a 3-tuple
of tensors with no AttributeError raised, (2) x_out has exact
expected shape, dtype, and value-fingerprint under torch.equal, (3) z
matches the encode-side posterior squeezed on dim 2 under torch.equal,
and (4) inspect.getsource(wrapper.forward) contains no ".sample" string.

* [Issue 101][eval-and-fix] Tighten VideoAutoencoderKLWrapper.forward regression test to two-test contract (pollockjj/mydevelopment#192)

The regression suite for VideoAutoencoderKLWrapper.forward now defines
exactly the two contract-named tests:

  - test_wrapper_forward_returns_tensor_triple monkeypatches
    VideoAutoencoderKLWrapper.encode and VideoAutoencoderKLWrapper.decode
    directly on the class, builds the wrapper standin via __new__ +
    nn.Module.__init__, sets original_image_video to a 5-D tensor and
    img_dims to a 2-tuple, invokes wrapper.forward(x), and asserts the
    full return-contract: 3-tuple, three torch.Tensor types, binary
    shape equality (x_out.shape == decode_out.shape, z.shape ==
    posterior.squeeze(2).shape), tensor equality
    (torch.equal(x_out, decode_out), torch.equal(z, posterior.squeeze(2))),
    and identity (p is posterior).

  - test_wrapper_forward_source_has_no_sample_access asserts
    ".sample" not in inspect.getsource(VideoAutoencoderKLWrapper.forward)
    so the failing pre-fix body raises an explicit assertion failure on
    the literal forbidden token.

Stubbing on the wrapper class (not the parent) bypasses the parent
encode/decode_ entirely, removes the lab_color_transfer monkey-patch,
and makes the AttributeError pre-fix path surface directly on
self.decode(z).sample because the stubbed decode returns a plain
tensor.
@alexisrolland
Copy link
Copy Markdown
Member

Closing this in favor of #14110

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Core Core team dependency notify:jk

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants