[Issue 101][eval-and-fix] Honor SeedVR2 GroupNorm memory limit (closes pollockjj/mydevelopment#187)#31
Conversation
…chunked GroupNorm path is reachable
The 5D GroupNorm gate in causal_norm_wrapper compared memory_occupy
against float('inf'), making the chunked dispatch branch unreachable
under any finite memory estimate. The configured _NORM_LIMIT (set via
set_norm_limit / VAE.set_memory_limit norm_max_mem) had no effect.
Replace the hardcoded float('inf') with get_norm_limit() so the
configured limit actually controls the gate. The GiB calc line is
preserved unchanged.
CodeRabbit finding on upstream PR Comfy-Org#11294:
Comfy-Org#11294 (comment)
Add tests-unit/comfy_test/test_seedvr_groupnorm_limit.py with two
regression cases covering default-limit (full path) and low-limit
(chunked path), each discriminated via a forward-hook + F.group_norm
spy pair.
…8,2,4,4)) per AC-2 contract QA gate slice 2 returned NOT MET on AC-2: shared fixture used torch.zeros(*_TENSOR_SHAPE, dtype=torch.float32) instead of the contract-required literal torch.randn(1,8,2,4,4). Inline the call in both test bodies so the literal substring causal_norm_wrapper(gn, torch.randn(1,8,2,4,4)) appears in source. Forward hook, F.group_norm patch.object spy, output-shape assertions, and set_norm_limit(None) restoration in finally are preserved unchanged. Both named tests still pass: python -m pytest -q tests-unit/comfy_test/test_seedvr_groupnorm_limit.py -> 2 passed, exit 0 Ruff still clean: python -m ruff check comfy/ldm/seedvr/vae.py tests-unit/comfy_test/test_seedvr_groupnorm_limit.py -> All checks passed
There was a problem hiding this comment.
Pull request overview
Restores SeedVR2 VAE 5D GroupNorm chunking behavior by honoring the configured norm_max_mem limit (via get_norm_limit()), and adds regression tests to prevent the dispatch gate from regressing to an unreachable chunked path.
Changes:
- Fix
causal_norm_wrapperGroupNorm dispatch gate to compare againstget_norm_limit()instead offloat("inf"). - Add unit tests that assert full-path behavior at the default limit and chunked-path behavior at a low limit.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
comfy/ldm/seedvr/vae.py |
Makes the GroupNorm chunking gate honor the configured norm memory limit. |
tests-unit/comfy_test/test_seedvr_groupnorm_limit.py |
Adds regression coverage to distinguish full vs chunked GroupNorm execution paths. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| handle = gn.register_forward_hook(_hook) | ||
| try: | ||
| with patch.object(vae_mod.F, "group_norm", side_effect=_group_norm_spy): | ||
| out_tensor = causal_norm_wrapper(gn, torch.randn(1,8,2,4,4)) |
There was a problem hiding this comment.
NOT-REQUIRED as a Ruff failure: E231 is in the active rule set (E family selected, not in lint.ignore) but Ruff CI on this PR ran SUCCESS on these exact lines. Ruff's E231 implementation does not flag this construction in current versions.
Adopted as a DRY improvement: the file already defines _TENSOR_SHAPE = (1, 8, 2, 4, 4) on line 54 and uses it on lines 92/94/144/146. Inlining torch.randn(1,8,2,4,4) with a different format on construction was an inconsistency. Switched to torch.randn(*_TENSOR_SHAPE) in commit 2e681fb.
| handle = gn.register_forward_hook(_hook) | ||
| try: | ||
| with patch.object(vae_mod.F, "group_norm", side_effect=_group_norm_spy): | ||
| out_tensor = causal_norm_wrapper(gn, torch.randn(1,8,2,4,4)) |
There was a problem hiding this comment.
Same disposition as the line-83 thread: NOT-REQUIRED as Ruff E231 (CI green), adopted as a DRY improvement using *_TENSOR_SHAPE in commit 2e681fb.
pollockjj
left a comment
There was a problem hiding this comment.
Review — [Issue 101][eval-and-fix] Honor SeedVR2 GroupNorm memory limit
Scope: 1 production line + 158-line regression test. Base issue_101 ← head issue_187. Closes pollockjj/mydevelopment#187.
Overview
Fixes the causal_norm_wrapper chunked-GroupNorm dispatch gate at comfy/ldm/seedvr/vae.py:509 so it reads the configured limit accessor (get_norm_limit()) instead of the placeholder float("inf") literal that made the chunked branch unreachable for any finite limit. Mirrors the canonical pattern already used by the chunked-conv sibling at vae.py:659-660. Adds a regression test pair locking in dispatch behavior under both the default-inf and a deliberately-low 1e-9 GiB limit.
Correctness
- The one-line change matches the diagnosis exactly. Line 508 (the
memory_occupy = x.numel() * x.element_size() / 1024**3GiB calc) is preserved byte-equal; only the gate RHS changes. get_norm_limit()is defined atvae.py:185-186of the same module and theset_norm_limit(...)mutator + theVideoAutoencoderKL.set_memory_limit(...)bridge are already wired correctly — only the gate consumer was missing.- Default
_NORM_LIMIT = float("inf")preserves prior behavior at any call site that does not invokeset_norm_limit(...). The default-limit test verifies this directly. # TODO: this may be set dynamically from the vaecorrectly removed since the TODO is resolved by the fix itself.
Test design — strong
- Two independent observers discriminate the branches:
register_forward_hookongncatches the full-forward path (which invokesnorm_layer(x)and triggersnn.Module.__call__).patch.object(vae_mod.F, "group_norm", side_effect=_group_norm_spy)catches directF.group_normcalls; the chunked branch dispatches viaF.group_normwithnum_groups_per_chunk = norm_layer.num_groups // num_chunks, so any spy call withnum_groups < _NUM_GROUPSis the chunked-branch signature.
- Both tests assert the positive and negative branch fires/does-not-fire — the default-limit test asserts
chunked_calls == 0(not justfull_calls == 1) and vice versa for the low-limit test. Catches a regression where both branches fire (e.g., anorinstead of anif/elsesomewhere downstream). - Output-shape assertion (
out_tensor.shape == _TENSOR_SHAPE) confirms the wrapper's 5D round-trip (b c t h w → (b t) c h w → b c t h w) is preserved on both branches. set_norm_limit(None)reset in bothfinallyblocks isolates module-global state across test runs. Consistent with the established sibling pattern intests-unit/comfy_test/test_seedvr_rope_delegation.py.- CPU-only by construction (
cli_args.cpu = Truebefore thecomfy.ldm.*import;gn.eval(); tiny float32 tensor) — no GPU required.
Suggestions / observations
- Coverage limited to
disable_weight_init.GroupNorm. The production gate at line 509 readsisinstance(norm_layer, ops.GroupNorm)whereopsresolves at module scope. Bothdisable_weight_init.GroupNormandmanual_cast.GroupNormcould appear in production paths depending on the active backend. The tests cover onlydisable_weight_init. If the dispatch is contractually identical for both subclasses (and they share theops.GroupNormMRO confirmed by GTP-3), a parametrized test would close the gap cheaply. - No numerical-equivalence check between branches. The tests verify dispatch (which branch ran), not output correctness (does the chunked branch produce the same numerical output as the full branch within float tolerance?). For a gate fix this scope is reasonable, but a follow-up
torch.allclose(full_output, chunked_output, rtol=..., atol=...)test would catch any future correctness drift in the chunked path. - Spy signature
eps=1e-5defaults — minor._group_norm_spy(input_tensor, num_groups_arg, weight=None, bias=None, eps=1e-5)matches today'sF.group_normsignature, but if a futurenn.functional.group_normever gains a kwarg (e.g.,device=None) the spy wouldTypeError. Replacing the explicit-kwarg signature with(*args, **kwargs)and unpacking would future-proof at minimal cost. patch.object(vae_mod.F, "group_norm", ...)mutates the globaltorch.nn.functional.group_normfor the duration of the context (sincevae_mod.F is torch.nn.functionalper GTP-13). Intentional and documented in the module docstring; flagged for completeness — any concurrent test that callsF.group_normfrom a different code path during the patch window would route through the spy. Not an issue for the current single-threaded pytest invocation.- Base-branch nomenclature. The Affected Repositories table in the PR body lists base branch
issue_101_pi, but the PR itself targetsissue_101. Confirm the rename was intentional (looks like the_pisuffix was the planning-time tag).
Performance
N/A — one-line gate change, no algorithmic delta. The test pair completes in milliseconds with the small-tensor + CPU configuration.
Security
No new external surface. No auth, network, or untrusted-input handling. Production change is purely an internal accessor swap.
Verdict
The fix is minimal, correct, and surgically scoped. The test design is the strongest part — two-observer dispatch discrimination is robust against future refactors of either branch. The suggestions above are non-blocking; the parametrized-ops.GroupNorm coverage and the chunked-vs-full numerical-equivalence test are worth tracking as follow-up but should not gate this merge.
Looks good to land.
…pilot review NOT-REQUIRED-but-DRY: lines 83/135 inlined torch.randn(1,8,2,4,4); aligning with the assertion sites on 92/94/144/146 that reference the constant)
…ignature Codex review findings on PR #31: - F1 (NOT-REQUIRED-non-blocking, applied): parametrize both tests over comfy.ops.disable_weight_init.GroupNorm AND comfy.ops.manual_cast.GroupNorm. Production gate at vae.py:509 reads isinstance(norm_layer, ops.GroupNorm) which matches both via MRO; coverage gap closed. Verified manual_cast exists with MRO (manual_cast.GroupNorm, disable_weight_init.GroupNorm, torch.nn.modules.normalization.GroupNorm). - F3 (NOT-REQUIRED-non-blocking, applied): spy signature widened from explicit-kwargs (input, num_groups, weight=None, bias=None, eps=1e-5) to (input, num_groups, *args, **kwargs) with passthrough. Future torch F.group_norm kwargs no longer TypeError. Findings F2 (numerical-equivalence test) tracked as separate follow-up; F4 (patch.object global mutation) acknowledged informational; F5 (base-branch rename issue_101_pi -> issue_101) confirmed intentional in review reply.
Disposition — codex review of 2026-05-03 21:32Verdict adopted: "looks good to land." Five findings dispositioned and addressed where actionable. F1 — Coverage limited to
|
Plan: Honor SeedVR2 GroupNorm memory limit
Overview
The SeedVR2 native VAE in
comfy/ldm/seedvr/vae.pyexposes a configured GroupNorm memory cap (norm_max_mem→set_norm_limit(...)→ module-level_NORM_LIMIT) but the chunked-GroupNorm dispatch gate atvae.py:509reads the literalfloat("inf")instead ofget_norm_limit(). The configured cap is therefore a no-op and the chunked branch is unreachable for any finite limit. This plan records that unreachability as a baseline decision packet onissue_101_piHEAD, applies the one-line gate fix onpollockjj/ComfyUI#issue_187(preserving the GiB calculation and the parallel structure of the chunked-conv sibling at lines 659–660), adds a regression test pair, runs a hygiene gate (ruff + production-code-shape independence), and emits a post-fix provenance decision packet. PR creation is delegated to the/prskill after slice completion.Diagnosis Summary
comfy/ldm/seedvr/vae.pyline 509 readsif isinstance(norm_layer, ops.GroupNorm) and memory_occupy > float("inf"): # TODO: this may be set dynamically from the vae, and the gate's right-hand operand is the literal positive infinity. No finitememory_occupy(computed on line 508 asx.numel() * x.element_size() / 1024**3GiB) can exceed positive infinity, so the chunked-GroupNormelsebranch at line 521 always runs and the chunked branch at lines 510–520 is unreachable.set_norm_limit(1e-9)followed bycausal_norm_wrapper(ops.GroupNorm(num_channels=8, num_groups=4), torch.randn(1,8,2,4,4))runs the full GroupNorm path (thenn.Moduleforward hook fires once with shape(2,8,4,4)) regardless of the configured limit. Measured live onissue_101_piHEAD4ffef0a4.float("inf")was left in place of the configured limit accessorget_norm_limit()(defined on lines 185–186 of the same file). The author marked the line with# TODO: this may be set dynamically from the vae. The mutatorset_norm_limit(value)(lines 189–193) and the upstream config bridgeVideoAutoencoderKL.set_memory_limit(conv_max_mem, norm_max_mem)(lines 2297–2298) are wired correctly; only the gate's RHS is wrong.x.ndim == 5) routed throughcausal_norm_wrapper(GroupNorm, x)is affected; the boundary is exactly theisinstance(norm_layer, ops.GroupNorm)branch starting at line 502 with the 5D sub-branch starting at line 505. The 4D sub-branch (line 503–504) is unaffected because it never hits the gate.memory_occupy > float("inf")withmemory_occupy > get_norm_limit()at line 509. Preserve line 508 (the GiB calc) verbatim. The fix mirrors the chunked-conv sibling pattern at lines 659–660, whereInflatedCausalConv3d.forwardgates against the per-instanceself.memory_limit. The# TODOcomment may be removed by the slicer (resolved by the fix) but its retention or removal is not load-bearing for the AC contract.set_norm_limit(1e-9)will exercise the gate and emit named-key JSON describing which branch ran. Pre-fix the JSON recordschunked_groupnorm_reachable=false; post-fix the JSON recordschunked_groupnorm_reachable=true. A regression test pair intests-unit/comfy_test/asserts the same dispatch behavior under pytest — the low-limit test exercises the chunked branch via acomfy.ldm.seedvr.vae.F.group_normspy that records every direct call'snum_groupsargument and asserts at least one call hasnum_groups < gn.num_groups(the chunked branch's signature). A hygiene slice runs ruff on the touched files and asserts no observability detuning was introduced. A final decision-packet slice records post-fix provenance.Affected Repositories
pollockjj/mydevelopment/home/johnj/dev_cuda_1/mydevelopmentmainissue_187pollockjj/ComfyUI/home/johnj/dev_cuda_0/ComfyUIissue_101_piissue_187Research and Methodology
Plan Foundations Comment URL: https://github.com/pollockjj/mydevelopment/issues/187#issuecomment-4361175762
Detected Scope:
none— single-line gate fix where the existingvae.pycode IS the spec; the chunked-conv sibling atvae.py:659-660already canonicalizes the configured-limit pattern this defect must restore for the GroupNorm pathway.The linked comment is the load-bearing artifact for every measurement, equivalence rule, canonical-protocol, tool version, and pipeline-stage citation in this plan. Every AC below that names a metric, threshold, equivalence rule, canonical protocol, tool version, or pipeline stage traces by quote or URL to an entry in that comment's
## Research and Methodologysection. qa-plan verifies the comment exists and contains the required subsections for the detected scope.Tools, Pipeline, and Measurements
Plan Foundations Comment URL: https://github.com/pollockjj/mydevelopment/issues/187#issuecomment-4361175762
The linked comment's
## Tools, Pipeline, and Measurementssection enumerates: (1) Existing Tooling Inventory withREUSE/KNOWN-GOOD-REF/NEW/WAIVEDstatus per tool; (2) Pipeline stages from input to output artifact; (3) Measurements table with tool+version, invocation, score range, tolerance; (4) Single-probe-before-sweep / boundary-bracketing requirements (bothN/Afor this plan and explicitly justified). Every AC below that names a tool, version, pipeline stage, or measurement traces by quote or URL to an entry in that section.Ground Truth Probes
For every literal API surface that any AC references — attribute, method, function signature, parameter list, CLI flag, file path, env var, magic constant — captured live on the issue's base branch
pollockjj/ComfyUI:issue_101_pi(HEAD4ffef0a4) on 2026-05-01 from/home/johnj/dev_cuda_0/ComfyUI. Plan-introduced literals (probe JSON keys, test names, paths created by this plan) are anchored separately in## Created Surface Contract, not here.cd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python -c "from comfy.cli_args import args; args.cpu=True; import comfy.ldm.seedvr.vae as v, inspect; print(inspect.signature(v.causal_norm_wrapper))"(norm_layer: torch.nn.modules.module.Module, x: torch.Tensor) -> torch.Tensorcd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python -c "from comfy.cli_args import args; args.cpu=True; import comfy.ldm.seedvr.vae as v, inspect; print(inspect.signature(v.set_norm_limit)); print(inspect.signature(v.get_norm_limit)); print(repr(v._NORM_LIMIT))"(value: Optional[float] = None)/()/infcd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python -c "from comfy.cli_args import args; args.cpu=True; import comfy.ops as o; print(type(o.disable_weight_init.GroupNorm(num_channels=8, num_groups=4)).__mro__[:3])"(<class 'comfy.ops.disable_weight_init.GroupNorm'>, <class 'torch.nn.modules.normalization.GroupNorm'>, <class 'torch.nn.modules.module.Module'>)cd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python -c "from comfy.cli_args import args; args.cpu=True; import torch, comfy.ops as o; gn=o.disable_weight_init.GroupNorm(num_channels=8,num_groups=4); gn.eval(); calls=[]; gn.register_forward_hook(lambda m,i,out: calls.append(tuple(i[0].shape))); _=gn(torch.randn(2,8,4,4)); print(calls)"[(2, 8, 4, 4)]sed -n '508,509p' /home/johnj/dev_cuda_0/ComfyUI/comfy/ldm/seedvr/vae.pymemory_occupy = x.numel() * x.element_size() / 1024**3; line 509:if isinstance(norm_layer, ops.GroupNorm) and memory_occupy > float("inf"): # TODO: this may be set dynamically from the vaels /home/johnj/dev_cuda_0/ComfyUI/tests-unit/comfy_test/folder_path_test.py/model_detection_test.py/__pycache__/test_seedvr_rope_delegation.py/test_seedvr_vae_tiled_args_no_mutate.py(notest_seedvr_groupnorm_limit.pypresent)cd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python -c "print(9.5e-7 > float('inf')); print(9.5e-7 > 1e-9)"False/Truesed -n '184,193p' /home/johnj/dev_cuda_0/ComfyUI/comfy/ldm/seedvr/vae.py_NORM_LIMIT = float("inf")/def get_norm_limit():/return _NORM_LIMIT/ (blank) / (blank) /def set_norm_limit(value: Optional[float] = None):/global _NORM_LIMIT/if value is None:/value = float("inf")/_NORM_LIMIT = valuecd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/pip list 2>/dev/null | grep -E "^(torch|einops)\\b"einops 0.8.2/torch 2.11.0+cu130cd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python -m pytest --version 2>&1 | head -1; ruff --version 2>&1 | head -1pytest 9.0.3/ruff 0.15.9cd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python -c "from comfy.cli_args import args; print(type(args.cpu).__name__, args.cpu); args.cpu=True; print(args.cpu)"bool False/Truegit -C /home/johnj/dev_cuda_0/ComfyUI rev-parse HEAD; git -C /home/johnj/dev_cuda_0/ComfyUI rev-parse --abbrev-ref HEAD4ffef0a410bb0f999ee2579b297b26470fab2a22/issue_101_picd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python -c "from comfy.cli_args import args; args.cpu=True; import comfy.ldm.seedvr.vae as v; import torch.nn.functional as F; print(v.F is F); print(v.F.group_norm is F.group_norm); print(v.F.__name__)"True/True/torch.nn.functionalgrep -nE 'args\\.cpu = True|torch\\.cuda\\.is_available' /home/johnj/dev_cuda_0/ComfyUI/tests-unit/comfy_test/test_seedvr_rope_delegation.py51:if not torch.cuda.is_available():/52: args.cpu = True/136: marks=pytest.mark.skipif(not torch.cuda.is_available(), reason="no cuda"),If a planned literal cannot be probed (slot offline, source absent, tool missing) the AC is rewritten to remove the unverified literal or the slice is rescoped. No probe → no AC.
Created Surface Contract
These literals are introduced by this plan and therefore are NOT subject to Phase 0.5 ground-truth probing. They become probable after the slice that creates them lands.
Probe-script API (Slice 1 introduces; Slice 2 reuses):
--out PATHCLI flag (writes UTF-8 JSON toPATH);--limit VALUECLI flag (callsset_norm_limit(float(VALUE)); default1e-9); script exit code0on success.Probe JSON named keys (Slice 1 introduces baseline; Slice 2 emits fixed):
chunked_groupnorm_reachable(bool),full_groupnorm_calls(int),chunked_groupnorm_calls(int),computed_memory_gib(float, GiB),configured_norm_limit_gib(float, GiB),tensor_shape(list of int),tensor_dtype(string),groupnorm_num_channels(int),groupnorm_num_groups(int),mydevelopment_head_sha(string),comfyui_head_sha(string),comfyui_branch(string),vae_line_509(string verbatim).Comparison JSON named keys (Slice 2 introduces):
branch_reachability_delta(string, value"false->true"),prior_artifact_path(string, relative path),new_artifact_path(string, relative path),prior_chunked_groupnorm_reachable(bool),new_chunked_groupnorm_reachable(bool),prior_full_groupnorm_calls(int),new_full_groupnorm_calls(int),prior_chunked_groupnorm_calls(int),new_chunked_groupnorm_calls(int).Test names (Slice 2 introduces):
test_seedvr_groupnorm_default_limit_uses_full_groupnorm_path,test_seedvr_groupnorm_low_limit_uses_chunked_groupnorm_path. Both live in moduletests_unit.comfy_test.test_seedvr_groupnorm_limit.File paths (Slices 1–4 introduce):
pollockjj/mydevelopment#issue_187:github_issues/187/probe_seedvr_groupnorm_limit.py(S1),github_issues/187/slice1/groupnorm_limit_baseline.json(S1),github_issues/187/slice1/probe_run.log(S1),github_issues/187/slice1/provenance.md(S1),github_issues/187/slice2/groupnorm_limit_fixed.json(S2),github_issues/187/slice2/before_after_comparison.json(S2),github_issues/187/slice2/probe_run.log(S2),github_issues/187/slice2/pytest.log(S2),github_issues/187/slice2/vae_post_fix_line509.txt(S2),github_issues/187/slice2/vae_diff.patch(S2),github_issues/187/slice3/ruff.log(S3),github_issues/187/slice3/no_detuning.log(S3),github_issues/187/slice4/provenance.md(S4).pollockjj/ComfyUI#issue_187:comfy/ldm/seedvr/vae.py(modified at line 509; line 508 preserved verbatim) (S2),tests-unit/comfy_test/test_seedvr_groupnorm_limit.py(S2 introduces).Source-line replacement (Slice 2 introduces): at
comfy/ldm/seedvr/vae.pyline 509 the literalmemory_occupy > float("inf")is replaced withmemory_occupy > get_norm_limit(). The# TODO: this may be set dynamically from the vaetrailing comment is no longer load-bearing; the slicer may retain or remove it without changing AC pass status.Asset Readiness
pollockjj/ComfyUI:issue_101_piHEAD4ffef0a4/home/johnj/dev_cuda_0/ComfyUIcomfy/ldm/seedvr/vae.py(gate at line 509, calc at line 508,_NORM_LIMIT/accessors at 184–193,set_memory_limitbridge at 2297–2301)/home/johnj/dev_cuda_0/ComfyUI/comfy/ldm/seedvr/vae.pyissue_101_piHEADtests-unit/comfy_test//home/johnj/dev_cuda_0/ComfyUI/tests-unit/comfy_test/issue_101_piHEADtests-unit/comfy_test/test_seedvr_rope_delegation.py(args.cpu = Trueblock beforecomfy.ldm.*import)/home/johnj/dev_cuda_0/ComfyUI/tests-unit/comfy_test/test_seedvr_rope_delegation.pyissue_101_piHEADtorch==2.11.0+cu130,einops==0.8.2,pytest==9.0.3)/home/johnj/dev_cuda_0/ComfyUI/.venvruff0.15.9 binary/home/johnj/.local/bin/ruffcomfy.cli_args.args.cpuflag/home/johnj/dev_cuda_0/ComfyUI/comfy/cli_args.pyissue_101_piHEADcomfy.ldm.seedvr.vae.Fbinding ==torch.nn.functionalpollockjj/mydevelopment:mainHEADcde849a6/home/johnj/dev_cuda_1/mydevelopmentIC_kwDOR2e1q88AAAABA_JC0gNo external assets (model weights, video/image datasets, OS packages) are required.
Slices
Slice 1: Pre-fix Baseline Reachability Probe
Kind: decision-packet
Objective: Produce the baseline reachability decision packet showing the chunked-GroupNorm branch is unreachable on
pollockjj/ComfyUI:issue_101_piHEAD4ffef0a4before any production fix is applied.Acceptance Criteria
AC-1: File
github_issues/187/probe_seedvr_groupnorm_limit.pyis committed onpollockjj/mydevelopment:issue_187and is a Python module that exposes a--out PATHCLI flag, a--limit VALUECLI flag (default1e-9), exits 0 on success, and writes a UTF-8 JSON file toPATHcontaining exactly the named-key set{chunked_groupnorm_reachable, full_groupnorm_calls, chunked_groupnorm_calls, computed_memory_gib, configured_norm_limit_gib, tensor_shape, tensor_dtype, groupnorm_num_channels, groupnorm_num_groups, mydevelopment_head_sha, comfyui_head_sha, comfyui_branch, vae_line_509}— verified by committed artifactgithub_issues/187/slice1/probe_run.logcapturing the stdout ofcd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python /home/johnj/dev_cuda_1/mydevelopment/github_issues/187/probe_seedvr_groupnorm_limit.py --helpshowing the two flags and exit-0 behavior, plus the JSON artifact existing at the path declared in AC-2.pollockjj/ComfyUI:issue_101_piHEAD4ffef0a4before any production fix is applied."AC-2: Probe is invoked as
cd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python /home/johnj/dev_cuda_1/mydevelopment/github_issues/187/probe_seedvr_groupnorm_limit.py --out /home/johnj/dev_cuda_1/mydevelopment/github_issues/187/slice1/groupnorm_limit_baseline.json --limit 1e-9againstissue_101_piHEAD4ffef0a4(no production fix applied) and the resultinggithub_issues/187/slice1/groupnorm_limit_baseline.jsonis committed onpollockjj/mydevelopment:issue_187with the named-key valueschunked_groupnorm_reachable=false,full_groupnorm_calls=1,chunked_groupnorm_calls=0,configured_norm_limit_gib=1e-9,computed_memory_gibpopulated to a positive float in(0, 1e-5],tensor_shape=[1,8,2,4,4],tensor_dtype="torch.float32",groupnorm_num_channels=8,groupnorm_num_groups=4,vae_line_509containing the substringmemory_occupy > float("inf")— verified by committed artifactgithub_issues/187/slice1/groupnorm_limit_baseline.jsonparsed and key-checked, plus committed artifactgithub_issues/187/slice1/probe_run.logcapturing the probe's stdout/stderr.pollockjj/ComfyUI:issue_101_piHEAD4ffef0a4before any production fix is applied."AC-3: File
github_issues/187/slice1/provenance.mdis committed onpollockjj/mydevelopment:issue_187containing exactly the named-line keysdecision_summary:,recommended_action:,mydevelopment_head_sha:,comfyui_head_sha:,comfyui_branch:,coderabbit_url:,upstream_pr_url:,probe_command:,probe_output_path:, withdecision_summaryvaluebaseline_groupnorm_chunking_unreachable,recommended_actionvalueapply_gate_fix_on_pollockjj/ComfyUI_issue_187,comfyui_head_shavalue4ffef0a410bb0f999ee2579b297b26470fab2a22,comfyui_branchvalueissue_101_pi,coderabbit_urlvaluehttps://github.com/Comfy-Org/ComfyUI/pull/11294#discussion_r2959796343,upstream_pr_urlvaluehttps://github.com/Comfy-Org/ComfyUI/pull/11294,probe_commandvalue matching the AC-2 command verbatim,probe_output_pathvaluegithub_issues/187/slice1/groupnorm_limit_baseline.json— verified by committed artifactgithub_issues/187/slice1/provenance.mdandgrep '^<key>: 'parseability against the listed key set.pollockjj/ComfyUI:issue_101_piHEAD4ffef0a4before any production fix is applied."Slice 2: Gate Fix, Regression Test Pair, Post-Fix Probe and Comparison
Kind: implementation
Objective: Prove that replacing the
> float("inf")literal atcomfy/ldm/seedvr/vae.pyline 509 with> get_norm_limit()makes the chunked-GroupNorm branch reachable under a configuredset_norm_limit(...)while preserving the default-limit (full-path) behavior, by landing the one-line gate fix onpollockjj/ComfyUI:issue_187, adding a two-test regression module that fails pre-fix on the chunked-path test, re-running the same probe to recordchunked_groupnorm_reachable=true, and emitting a before/after delta artifact.Acceptance Criteria
AC-1: File
comfy/ldm/seedvr/vae.pyonpollockjj/ComfyUI:issue_187HEAD contains the literalmemory_occupy > get_norm_limit()on the same logical line that previously readmemory_occupy > float("inf")(the diff is restricted to this single token replacement on theif isinstance(norm_layer, ops.GroupNorm) and ...line); line 508 (memory_occupy = x.numel() * x.element_size() / 1024**3) is preserved verbatim — verified by committed artifactgithub_issues/187/slice2/vae_post_fix_line509.txtcontaining the output ofcd /home/johnj/dev_cuda_0/ComfyUI && grep -nF 'memory_occupy > get_norm_limit()' comfy/ldm/seedvr/vae.py(one matching line) AND committed artifactgithub_issues/187/slice2/vae_diff.patchcontaining the output ofgit -C /home/johnj/dev_cuda_0/ComfyUI diff issue_101_pi..issue_187 -- comfy/ldm/seedvr/vae.pywhose hunk header references line 509 and whose changed lines contain exactly one-line with the substringmemory_occupy > float("inf")and exactly one+line with the substringmemory_occupy > get_norm_limit().vae.py:509readsif isinstance(norm_layer, ops.GroupNorm) and memory_occupy > float("inf"):(placeholder literalfloat("inf")left in place of the configured-limit accessorget_norm_limit()defined on lines 185-186 per GTP-8); pre-fixgrep -nF 'memory_occupy > get_norm_limit()' comfy/ldm/seedvr/vae.pyreturns no matchandismemory_occupy > get_norm_limit();grep -nF 'memory_occupy > float("inf")' /home/johnj/dev_cuda_0/ComfyUI/comfy/ldm/seedvr/vae.pyreturns no match; line 508 contains the GiB calc verbatim> float("inf")literal atcomfy/ldm/seedvr/vae.pyline 509 with> get_norm_limit()makes the chunked-GroupNorm branch reachable under a configuredset_norm_limit(...)while preserving the default-limit (full-path) behavior, by landing the one-line gate fix onpollockjj/ComfyUI:issue_187, adding a two-test regression module that fails pre-fix on the chunked-path test, re-running the same probe to recordchunked_groupnorm_reachable=true, and emitting a before/after delta artifact."AC-2: File
tests-unit/comfy_test/test_seedvr_groupnorm_limit.pyonpollockjj/ComfyUI:issue_187HEAD defines the two functionstest_seedvr_groupnorm_default_limit_uses_full_groupnorm_pathandtest_seedvr_groupnorm_low_limit_uses_chunked_groupnorm_path. The first setsset_norm_limit(None), constructscomfy.ops.disable_weight_init.GroupNorm(num_channels=8, num_groups=4)with a registerednn.Moduleforward hook that appends to a list, runscausal_norm_wrapper(gn, torch.randn(1,8,2,4,4)), and asserts the hook fired exactly once AND that the output shape equals(1,8,2,4,4)AND restoresset_norm_limit(None)in afinally. The second setsset_norm_limit(1e-9), builds the same fixture, additionally patchescomfy.ldm.seedvr.vae.F.group_normwith aunittest.mock.patch.objectspy whoseside_effectrecords thenum_groupsargument of every direct call AND delegates to the realtorch.nn.functional.group_norm, runs the samecausal_norm_wrapper(gn, torch.randn(1,8,2,4,4)), and asserts the hook fired exactly zero times AND that the spy recorded at least one direct call whosenum_groupsargument is strictly less thangn.num_groups(which is 4) AND that the output shape equals(1,8,2,4,4)AND restoresset_norm_limit(None)in afinally. The module setscomfy.cli_args.args.cpu = Truebefore importing anycomfy.ldm.*symbol whentorch.cuda.is_available()is False (matching the established pattern intests-unit/comfy_test/test_seedvr_rope_delegation.pylines 51-52 per GTP-14). Both tests pass undercd /home/johnj/dev_cuda_0/ComfyUI && python -m pytest -q tests-unit/comfy_test/test_seedvr_groupnorm_limit.pywith overall exit 0 and the summary line containing2 passed— verified by committed artifactgithub_issues/187/slice2/pytest.logcapturing stdout+stderr of that pytest invocation including the2 passedsummary line and test names.issue_101_piHEAD (no fix) causestest_seedvr_groupnorm_low_limit_uses_chunked_groupnorm_pathto fail withAssertionErroron thelen(forward_hook_count) == 0assertion (the gate> float("inf")keeps the full path active so the hook fires once with shape(2,8,4,4)per GTP-4 / GTP-7); pytest summary contains1 failed, 1 passed2 passed; the low-limit test'sforward_hook_countis[](zero entries); the low-limit test'sF.group_normspy recorded list contains at least one entry withnum_groups < 4; the default-limit test'sforward_hook_counthas exactly one entry with shape(2,8,4,4); both tests' return tensors have shape(1,8,2,4,4)> float("inf")literal atcomfy/ldm/seedvr/vae.pyline 509 with> get_norm_limit()makes the chunked-GroupNorm branch reachable under a configuredset_norm_limit(...)while preserving the default-limit (full-path) behavior, by landing the one-line gate fix onpollockjj/ComfyUI:issue_187, adding a two-test regression module that fails pre-fix on the chunked-path test, re-running the same probe to recordchunked_groupnorm_reachable=true, and emitting a before/after delta artifact."AC-3: Probe is invoked as
cd /home/johnj/dev_cuda_0/ComfyUI && .venv/bin/python /home/johnj/dev_cuda_1/mydevelopment/github_issues/187/probe_seedvr_groupnorm_limit.py --out /home/johnj/dev_cuda_1/mydevelopment/github_issues/187/slice2/groupnorm_limit_fixed.json --limit 1e-9againstpollockjj/ComfyUI:issue_187HEAD (post-fix from AC-1) and the resultinggithub_issues/187/slice2/groupnorm_limit_fixed.jsonis committed onpollockjj/mydevelopment:issue_187with the named-key valueschunked_groupnorm_reachable=true,full_groupnorm_calls=0,chunked_groupnorm_calls=2,configured_norm_limit_gib=1e-9,computed_memory_gibpopulated to the same positive float as the Slice 1 baseline (within 1e-12 absolute),tensor_shape=[1,8,2,4,4],tensor_dtype="torch.float32",groupnorm_num_channels=8,groupnorm_num_groups=4,vae_line_509containing the substringmemory_occupy > get_norm_limit()and NOT containingfloat("inf")— verified by committed artifactgithub_issues/187/slice2/groupnorm_limit_fixed.jsonparsed and key-checked, plus committed artifactgithub_issues/187/slice2/probe_run.log.issue_101_piHEAD (Slice 1 AC-2 outcome) recordschunked_groupnorm_reachable=false,full_groupnorm_calls=1,chunked_groupnorm_calls=0,vae_line_509containingfloat("inf")(the gate> float("inf")keeps the full path active per GTP-5 / GTP-7); the existing_NORM_LIMIT/get_norm_limit()accessors at lines 184-193 (GTP-8) are unused by the gatechunked_groupnorm_reachable=true,full_groupnorm_calls=0,chunked_groupnorm_calls=2,vae_line_509containingget_norm_limit()and not containingfloat("inf"); the gate now reads from the sameget_norm_limit()accessor that the configurednorm_max_memflows into viaset_norm_limit(...)(per GTP-8 + GTP-2)> float("inf")literal atcomfy/ldm/seedvr/vae.pyline 509 with> get_norm_limit()makes the chunked-GroupNorm branch reachable under a configuredset_norm_limit(...)while preserving the default-limit (full-path) behavior, by landing the one-line gate fix onpollockjj/ComfyUI:issue_187, adding a two-test regression module that fails pre-fix on the chunked-path test, re-running the same probe to recordchunked_groupnorm_reachable=true, and emitting a before/after delta artifact."AC-4: File
github_issues/187/slice2/before_after_comparison.jsonis committed onpollockjj/mydevelopment:issue_187containing exactly the named-key set{branch_reachability_delta, prior_artifact_path, new_artifact_path, prior_chunked_groupnorm_reachable, new_chunked_groupnorm_reachable, prior_full_groupnorm_calls, new_full_groupnorm_calls, prior_chunked_groupnorm_calls, new_chunked_groupnorm_calls}with valuesbranch_reachability_delta="false->true",prior_artifact_path="github_issues/187/slice1/groupnorm_limit_baseline.json",new_artifact_path="github_issues/187/slice2/groupnorm_limit_fixed.json",prior_chunked_groupnorm_reachable=false,new_chunked_groupnorm_reachable=true,prior_full_groupnorm_calls=1,new_full_groupnorm_calls=0,prior_chunked_groupnorm_calls=0,new_chunked_groupnorm_calls=2— verified by committed artifact at the named path parsed and key-checked.chunked_groupnorm_reachable=false); the comparison artifact does not exist andbranch_reachability_deltacannot be"false->true"because no post-fix run has been performedbranch_reachability_delta="false->true"and the per-key value pairs above match; the comparison JSON exists at the named path on the issue branch> float("inf")literal atcomfy/ldm/seedvr/vae.pyline 509 with> get_norm_limit()makes the chunked-GroupNorm branch reachable under a configuredset_norm_limit(...)while preserving the default-limit (full-path) behavior, by landing the one-line gate fix onpollockjj/ComfyUI:issue_187, adding a two-test regression module that fails pre-fix on the chunked-path test, re-running the same probe to recordchunked_groupnorm_reachable=true, and emitting a before/after delta artifact."Slice 3: Hygiene Validator Gate (Ruff Lint + Production-Code Shape Independence)
Kind: hygiene
Objective: Prove the Slice 2 commits leave the touched files ruff-clean and free of any observability detuning (no added
print/logging.debug/ sentinel comments) that could land as production-code noise alongside the gate fix.Acceptance Criteria
AC-1:
cd /home/johnj/dev_cuda_0/ComfyUI && python -m ruff check comfy/ldm/seedvr/vae.py tests-unit/comfy_test/test_seedvr_groupnorm_limit.pyexits 0 againstpollockjj/ComfyUI:issue_187HEAD — verified by committed validator-output artifactgithub_issues/187/slice3/ruff.logcapturing stdout+stderr of that invocation and ending with theAll checks passed!line, with shell exit code recorded as0in the log's final lineRUFF_EXIT: 0.print/logging.debug/ sentinel comments) that could land as production-code noise alongside the gate fix."AC-2: Slice 2 introduced zero observability detuning (
print,logging.debug,logging.info, sentinel comments, debug markers) incomfy/ldm/seedvr/vae.pyand intests-unit/comfy_test/test_seedvr_groupnorm_limit.py— verified by committed validator-output artifactgithub_issues/187/slice3/no_detuning.logcontaining the combined output of two greps: (a)grep -nE '^[+]\s*(print|logging\.debug|logging\.info|# TODO\s*(?:DEBUG|MARKER|SENTINEL))' /home/johnj/dev_cuda_1/mydevelopment/github_issues/187/slice2/vae_diff.patchreturning no matches, AND (b)grep -nE '\bprint\(|logging\.debug|logging\.info|# TODO\s*(?:DEBUG|MARKER|SENTINEL)' /home/johnj/dev_cuda_0/ComfyUI/tests-unit/comfy_test/test_seedvr_groupnorm_limit.pyreturning no matches, with the log's final line recordingNO_DETUNING_EXIT: 0.print/logging.debug/ sentinel comments) that could land as production-code noise alongside the gate fix."Slice 4: Post-Fix Provenance Decision Packet
Kind: decision-packet
Objective: Produce the post-fix provenance decision packet binding the issue branch HEADs, the canonical-reference URLs, and the verbatim invocation commands used by Slices 2 and 3 into a single architect-readable record.
Acceptance Criteria
github_issues/187/slice4/provenance.mdis committed onpollockjj/mydevelopment:issue_187containing exactly the named-line keysdecision_summary:,recommended_action:,mydevelopment_head_sha:,comfyui_head_sha:,comfyui_branch:,comfyui_base_sha:,comfyui_base_branch:,coderabbit_url:,upstream_pr_url:,probe_command:,probe_output_path:,regression_test_command:,ruff_command:, withdecision_summaryvaluepost_fix_groupnorm_chunking_restored,recommended_actionvalueinvoke_/pr_from_issue_187_to_issue_101_pi,comfyui_base_shavalue4ffef0a410bb0f999ee2579b297b26470fab2a22,comfyui_base_branchvalueissue_101_pi,comfyui_branchvalueissue_187,coderabbit_urlvaluehttps://github.com/Comfy-Org/ComfyUI/pull/11294#discussion_r2959796343,upstream_pr_urlvaluehttps://github.com/Comfy-Org/ComfyUI/pull/11294,probe_commandvalue matching the Slice 2 AC-3 command verbatim,regression_test_commandvalue matching the Slice 2 AC-2 pytest invocation verbatim,ruff_commandvalue matching the Slice 3 AC-1 ruff invocation verbatim — verified by committed artifact at the named path andgrep '^<key>: 'parseability against the listed key set.Constraints
python(resolved by orchestrator/slicer to the deliverable venv at/home/johnj/dev_cuda_0/ComfyUI/.venv/bin/pythonfor ComfyUI-touching commands; no AC hardcodes an absolute interpreter path).debug/run_debug.py(not exercised — this plan is CPU-only static-source plus pytest; no ComfyUI launch).--use-process-isolation --disable-cuda-malloc(not exercised — see above).pkill, norm -rf, nopython main.py.issue_187in bothpollockjj/mydevelopmentandpollockjj/ComfyUI, not on base branchesmain/issue_101_pi.#186: do NOT touchcomfy_extras/nodes_seedvr.py,comfy/sd.py, orcomfy/ldm/seedvr/model.py. The fix is bounded tocomfy/ldm/seedvr/vae.pyline 509.pytest 9.0.3in deliverable venv — REUSE; cited fromtools_register.md#pytest.ruff 0.15.9system binary at/home/johnj/.local/bin/ruff— REUSE; ComfyUI CI invocation pattern ispython -m ruff check .per.github/workflows/ruff.yml.torch 2.11.0+cu130,einops 0.8.2in deliverable venv — REUSE.comfy.ops.disable_weight_init.GroupNormin deliverable source — REUSE; constructable on CPU per GTP-3.nn.Module.register_forward_hook(torch) andunittest.mock.patch.object(stdlib) — REUSE for measurement instruments.comfy.ldm.seedvr.vae_NORM_LIMIT/get_norm_limit/set_norm_limitaccessors — REUSE; the plan only fixes the consumer at line 509, not the accessors.ghCLI — REUSE for issue body update viascripts/run_tdd_post.py update-issue(PR creation is the/prskill's responsibility, not this plan).comfy.ldm.seedvr.vae.F.group_normdirect-call spy together (the two instruments unambiguously identify which branch ran; chunked branch fires zero hook events AND records directF.group_normcalls withnum_groups < gn.num_groups).git diffartifact + regression test that fails pre-fix and passes post-fix (perequivalence_methods.md#code-behavior-equivalence).## Created Surface Contract;grep '^<key>:'parseable for key existence;python -m json.toolparseable for structural validity.git stashand without external decision deferral..gitignorerules for safe generated bulk, or discarded generated debris.git status --shortgates locally and must refuse to submit while any touched repo is dirty.repo_hygiene.md,repo_hygiene.txt, rawgit status --shorttranscripts, orCLEANmarkers as QA evidence; this plan deliberately omits any hygiene-evidence AC. QA verifies submitted commits, branch refs, and artifact presence through GitHub rather than trusting slicer-authored cleanliness transcripts.Out of Scope
#186:#188frame count,#189batch/time axes,#190/#192forward return contracts,#191latent metadata,#193cache clearing,#194decode state guards. Each owns its own scope and its own issue branch.debug/run_debug.pylaunches).numz/ComfyUI-SeedVR2_VideoUpscaler,ComfyUI-SeedVR2-Canonical, etc.). Their forks ofset_memory_limitandcausal_norm_wrapperalready useget_norm_limit()correctly per the upstream SeedVR repo; only the in-treecomfy/ldm/seedvr/vae.pycarries the placeholder gate.debug/run_debug.py,debug/reset_gpu.py,debug/harness_runtime.py, etc.).comfy_extras/nodes_seedvr.py,comfy/sd.py,comfy/ldm/seedvr/model.py(per#186file-family lock)./prand/pr-reviewskills' responsibilities; this plan only lands committed work on issue branches and exits.vae.py:659-660or to the upstream config bridgeset_memory_limit(...)atvae.py:2297-2301. The fix only changes one logical line incausal_norm_wrapper.# TODOcomment is optional; AC pass status does not depend on it._NORM_LIMIT's default value,set_norm_limit'sNone→infsemantics, orget_norm_limit's return shape. The gate's RHS is the only mutation.repo_hygiene.md/repo_hygiene.txt/ rawgit status --shorttranscripts /CLEANmarkers as QA evidence; repository cleanliness is a local pre-submit gate enforced by Melian and tdd-slice, not a committed artifact this plan asks for.Scope: nonejustification.