security: prevent path traversal in dataset nodes and use weights_only for torch.load#14166
Conversation
Adds _sanitize_folder_name() helper to validate that folder names resolve within the expected base directory, preventing path traversal attacks that could: - Write files to arbitrary locations (SaveTrainingDataset) - Create directories anywhere (SaveImageDataSetToFolderNode) - Load malicious pickles from unexpected paths (LoadTrainingDataset) Also adds weights_only=True to torch.load() in LoadTrainingDataset to prevent arbitrary code execution via pickle deserialization.
📝 WalkthroughWalkthroughThis PR hardens dataset operations by adding 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@comfy_extras/nodes_dataset.py`:
- Around line 15-19: The current _sanitize_folder_name uses startswith on
abspaths which is bypassable by sibling names; replace that check with a robust
containment test: compute base_abs = os.path.abspath(base_dir) and resolved =
os.path.abspath(os.path.join(base_abs, folder_name)) and verify
os.path.commonpath([base_abs, resolved]) == base_abs (or compare with a trailing
os.path.sep on base_abs) and raise ValueError if the check fails; update the
function _sanitize_folder_name to use this commonpath (or separator) containment
logic to reliably prevent path traversal.
- Line 1557: The use of torch.load(..., weights_only=True) at the shard load
site can fail when shards contain comfy.hooks.HookGroup objects (saved by
MakeTrainingDataset via clip.encode_from_tokens_scheduled when
clip.apply_hooks_to_conds is enabled by SetClipHooks). Replace the strict
weights_only load with a safe fallback: try loading with weights_only=True
first, and on UnpicklingError (or other deserialization errors) fall back to
torch.load(f) without weights_only so non-weight objects (e.g., HookGroup in
pooled_dict["hooks"]) can be deserialized; alternatively, if you prefer
prevention, strip or avoid saving pooled_dict["hooks"] in
MakeTrainingDataset/encode_from_tokens_scheduled when clip.apply_hooks_to_conds
is enabled. Ensure the change references the shard loading site where shard_data
= torch.load(f, weights_only=True) is called and mentions MakeTrainingDataset /
clip.encode_from_tokens_scheduled / comfy.hooks.HookGroup so maintainers can
find the related code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3af125df-ae43-4f0a-9382-c9a985d642a0
📒 Files selected for processing (1)
comfy_extras/nodes_dataset.py
| def _sanitize_folder_name(base_dir, folder_name): | ||
| resolved = os.path.abspath(os.path.join(base_dir, folder_name)) | ||
| if not resolved.startswith(os.path.abspath(base_dir)): | ||
| raise ValueError("Path traversal detected in folder_name") | ||
| return resolved |
There was a problem hiding this comment.
startswith check is bypassable by sibling directories sharing the base prefix.
resolved.startswith(os.path.abspath(base_dir)) is the classic prefix-match pitfall. With base = /…/output, a folder_name of "../output_evil" resolves to /…/output_evil, which starts with /…/output and so passes the check — yet it lives outside the base directory. Since path-traversal prevention is the whole point of this helper, gate on a trailing separator (or os.path.commonpath) instead.
🛡️ Proposed fix
def _sanitize_folder_name(base_dir, folder_name):
- resolved = os.path.abspath(os.path.join(base_dir, folder_name))
- if not resolved.startswith(os.path.abspath(base_dir)):
+ base = os.path.abspath(base_dir)
+ resolved = os.path.abspath(os.path.join(base, folder_name))
+ if resolved != base and not resolved.startswith(base + os.sep):
raise ValueError("Path traversal detected in folder_name")
return resolved📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _sanitize_folder_name(base_dir, folder_name): | |
| resolved = os.path.abspath(os.path.join(base_dir, folder_name)) | |
| if not resolved.startswith(os.path.abspath(base_dir)): | |
| raise ValueError("Path traversal detected in folder_name") | |
| return resolved | |
| def _sanitize_folder_name(base_dir, folder_name): | |
| base = os.path.abspath(base_dir) | |
| resolved = os.path.abspath(os.path.join(base, folder_name)) | |
| if resolved != base and not resolved.startswith(base + os.sep): | |
| raise ValueError("Path traversal detected in folder_name") | |
| return resolved |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@comfy_extras/nodes_dataset.py` around lines 15 - 19, The current
_sanitize_folder_name uses startswith on abspaths which is bypassable by sibling
names; replace that check with a robust containment test: compute base_abs =
os.path.abspath(base_dir) and resolved = os.path.abspath(os.path.join(base_abs,
folder_name)) and verify os.path.commonpath([base_abs, resolved]) == base_abs
(or compare with a trailing os.path.sep on base_abs) and raise ValueError if the
check fails; update the function _sanitize_folder_name to use this commonpath
(or separator) containment logic to reliably prevent path traversal.
|
|
||
| with open(shard_path, "rb") as f: | ||
| shard_data = torch.load(f) | ||
| shard_data = torch.load(f, weights_only=True) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
torch.load weights_only=True supported types allowlist UnpicklingError add_safe_globals
💡 Result:
In PyTorch, the weights_only=True option for torch.load is a security feature designed to prevent arbitrary code execution by restricting the unpickler to a limited set of safe types [1][2]. As of PyTorch 2.6, this setting is the default [2][3][4]. When loading a checkpoint with weights_only=True, if the file contains unsupported types or classes (often triggered when serializing custom objects, modules, or certain third-party types like those from NumPy), an UnpicklingError will occur, specifically identifying an "Unsupported global" [2][5][6]. Supported Types and Allowlisting By default, the weights_only unpickler allows [1][7]: - Tensors - Primitive Python types (e.g., int, float, str, bool) - Dictionaries - Basic collections (like lists and tuples) To allow additional types—such as custom model classes or specific third-party objects—you must explicitly add them to an allowlist using torch.serialization.add_safe_globals [1][8]. How to use add_safe_globals: 1. Identify the missing type from the error message (e.g., GLOBAL main.MyModel) [2][6]. 2. Add the class or function to the safe allowlist before calling torch.load [2][9]. Example: import torch from my_module import MyCustomClass # Register the class as safe torch.serialization.add_safe_globals([MyCustomClass]) # Now the load should succeed ckpt = torch.load("checkpoint.pth", weights_only=True) Best Practices: - Only allowlist types you explicitly trust, as this bypasses the security protections of weights_only=True [2]. - Avoid saving entire model objects; it is best practice to save and load only the state_dict [10]. - If you frequently encounter many missing globals, ensure your checkpoint was saved using standard PyTorch types rather than complex, custom-serialized structures [10]. If you are certain that a file is from a trusted source and it is impractical to allowlist all necessary globals, you may set weights_only=False, though this exposes your environment to potential security vulnerabilities [2][5].
Citations:
- 1: https://docs.pytorch.org/docs/stable/generated/torch.load.md
- 2: https://stackoverflow.com/questions/79584485/unable-to-torch-load-due-to-pickling-safety-error
- 3: https://dev-discuss.pytorch.org/t/bc-breaking-change-torch-load-is-being-flipped-to-use-weights-only-true-by-default-in-the-nightlies-after-137602/2573
- 4: https://forums.developer.nvidia.com/t/pickle-unpicklingerror-pytorch-2-6-weights-only-during-re-identification-evaluate-in-tao-6-25-11/361337
- 5:
torch.loadwithweights_only=Truedoes not allowbytesorbytearraypytorch/pytorch#133163 - 6: https://stackoverflow.com/questions/79431134/torch-cannot-add-namedtuple-class-to-safe-globals
- 7: https://docs.pytorch.org/docs/2.11/generated/torch.load.html
- 8: https://github.com/pytorch/pytorch/blob/main/torch/serialization.py
- 9: pytorch/pytorch@072d9e8
- 10: torch.load with weights_only=True show error for pickle protocol 4 pytorch/pytorch#118092
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Show context around the load call in question
sed -n '1520,1605p' comfy_extras/nodes_dataset.py | cat -n
# 2) Find where shard files are created/saved (torch.save usage near dataset/shard logic)
rg -n "torch\.save\(" comfy_extras/nodes_dataset.py
rg -n "shard" comfy_extras/nodes_dataset.py | head -n 80
# 3) Find conditioning generation path (encode_from_tokens_scheduled)
rg -n "encode_from_tokens_scheduled" comfy_extras/nodes_dataset.py
# 4) Search for add_safe_globals usage anywhere in repo
rg -n "add_safe_globals" -S .Repository: Comfy-Org/ComfyUI
Length of output: 5830
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '1360,1415p' comfy_extras/nodes_dataset.py | cat -n
echo "----"
sed -n '1415,1505p' comfy_extras/nodes_dataset.py | cat -n
echo "----"
sed -n '1490,1575p' comfy_extras/nodes_dataset.py | cat -n
echo "----"
sed -n '1,120p' comfy/utils.py | cat -nRepository: Comfy-Org/ComfyUI
Length of output: 14925
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Locate the implementation of encode_from_tokens_scheduled
rg -n "def encode_from_tokens_scheduled|encode_from_tokens_scheduled\s*\(" -S .
# 2) Print a focused excerpt around the implementation (first match)
python3 - <<'PY'
import subprocess, re, sys
res = subprocess.check_output(["bash","-lc","rg -n \"def encode_from_tokens_scheduled|encode_from_tokens_scheduled\\s*\\(\" -S . | head -n 20"], text=True)
print(res)
PYRepository: Comfy-Org/ComfyUI
Length of output: 4925
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Read implementation of encode_from_tokens_scheduled
sed -n '250,430p' comfy/sd.py | cat -n
# Also inspect any helper used by encode_from_tokens_scheduled (quick scan nearby)
rg -n "encode_from_tokens_scheduled" -n comfy/sd.pyRepository: Comfy-Org/ComfyUI
Length of output: 9983
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "apply_hooks_to_conds" comfy/sd.py comfy -S
rg -n "use_clip_schedule" comfy/sd.py comfy -S
rg -n "add_hooks_to_dict\(" -S comfy/sd.py comfyRepository: Comfy-Org/ComfyUI
Length of output: 1389
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "apply_hooks_to_conds" comfy/sd.py comfy -S
rg -n "use_clip_schedule" comfy/sd.py comfy -S
rg -n "add_hooks_to_dict\(" -S comfy/sd.py comfyRepository: Comfy-Org/ComfyUI
Length of output: 1389
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find encode_token_weights implementations
rg -n "def encode_token_weights" comfy -S
# Print a small excerpt around the first few matches to see return types/structure
rg -n "def encode_token_weights" comfy -S | head -n 5 | cut -d: -f1 | uniq | head -n 5 | while read -r f; do
echo "---- $f"
# show around the first occurrence
start=$(rg -n "def encode_token_weights" "$f" -S | head -n 1 | cut -d: -f2)
begin=$((start-20))
end=$((start+80))
sed -n "${begin},${end}p" "$f" | cat -n
doneRepository: Comfy-Org/ComfyUI
Length of output: 2256
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect MakeTrainingDataset to see what CLIP object it receives/uses
rg -n "class MakeTrainingDataset|MakeTrainingDataset" comfy_extras/nodes_dataset.py
start=$(rg -n "class MakeTrainingDataset" comfy_extras/nodes_dataset.py | head -n1 | cut -d: -f1)
sed -n "$((start-40)),$((start+220))p" comfy_extras/nodes_dataset.py | cat -n
# 2) Inspect encode_token_weights for common CLIP implementations (SD1 + SDXL)
for f in comfy/sd1_clip.py comfy/sdxl_clip.py comfy/text_encoders/*clip*.py; do
[ -f "$f" ] || continue
lines=$(rg -n "def encode_token_weights" "$f" 2>/dev/null | head -n 3 || true)
if [ -n "$lines" ]; then
echo "==== $f ===="
echo "$lines"
first=$(echo "$lines" | head -n1 | cut -d: -f1)
sed -n "$((first-30)),$((first+120))p" "$f" | cat -n
fi
doneRepository: Comfy-Org/ComfyUI
Length of output: 14376
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Check whether use_clip_schedule is ever enabled elsewhere
rg -n "use_clip_schedule\s*=" comfy -S
# 2) For sd1_clip.py, inspect both encode_token_weights definitions
for line in $(rg -n "def encode_token_weights" comfy/sd1_clip.py | cut -d: -f1); do
begin=$((line-30)); if [ "$begin" -lt 1 ]; then begin=1; fi
end=$((line+160))
echo "==== comfy/sd1_clip.py around $line ===="
sed -n "${begin},${end}p" comfy/sd1_clip.py | cat -n
done
# 3) Inspect sdxl_clip.py encode_token_weights definition
for line in $(rg -n "def encode_token_weights" comfy/sdxl_clip.py | cut -d: -f1); do
begin=$((line-30)); if [ "$begin" -lt 1 ]; then begin=1; fi
end=$((line+160))
echo "==== comfy/sdxl_clip.py around $line ===="
sed -n "${begin},${end}p" comfy/sdxl_clip.py | cat -n
doneRepository: Comfy-Org/ComfyUI
Length of output: 15149
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "use_clip_schedule" . -S
rg -n "forced_hooks" . -S
rg -n "apply_hooks_to_conds" . -SRepository: Comfy-Org/ComfyUI
Length of output: 1674
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '220,290p' comfy_extras/nodes_hooks.py | cat -nRepository: Comfy-Org/ComfyUI
Length of output: 3341
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "add_safe_globals" . -S
rg -n "safe_globals" comfy -S
rg -n "HookGroup" comfy -S | head -n 50Repository: Comfy-Org/ComfyUI
Length of output: 5709
Hardening risk: weights_only=True can break shards saved with CLIP hooks.
MakeTrainingDataset persists whatever clip.encode_from_tokens_scheduled() returns. When clip.apply_hooks_to_conds is set (via SetClipHooks with apply_to_conds=True), encode_from_tokens_scheduled includes pooled_dict["hooks"] holding a comfy.hooks.HookGroup. torch.load(..., weights_only=True) will then likely fail to deserialize that shard with UnpicklingError unless those globals are allowlisted or hooks are stripped/avoided during save/load.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@comfy_extras/nodes_dataset.py` at line 1557, The use of torch.load(...,
weights_only=True) at the shard load site can fail when shards contain
comfy.hooks.HookGroup objects (saved by MakeTrainingDataset via
clip.encode_from_tokens_scheduled when clip.apply_hooks_to_conds is enabled by
SetClipHooks). Replace the strict weights_only load with a safe fallback: try
loading with weights_only=True first, and on UnpicklingError (or other
deserialization errors) fall back to torch.load(f) without weights_only so
non-weight objects (e.g., HookGroup in pooled_dict["hooks"]) can be
deserialized; alternatively, if you prefer prevention, strip or avoid saving
pooled_dict["hooks"] in MakeTrainingDataset/encode_from_tokens_scheduled when
clip.apply_hooks_to_conds is enabled. Ensure the change references the shard
loading site where shard_data = torch.load(f, weights_only=True) is called and
mentions MakeTrainingDataset / clip.encode_from_tokens_scheduled /
comfy.hooks.HookGroup so maintainers can find the related code.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@comfy_extras/nodes_dataset.py`:
- Around line 260-263: mode is extracted into mode = mode[0] but never
forwarded, so the "increment" option is a no-op; change the call to
save_images_to_folder(...) to pass an explicit overwrite argument derived from
mode (e.g., overwrite = True when mode == "overwrite", False when mode ==
"increment") so save_images_to_folder receives the intent from mode; update the
save_images_to_folder(...) invocation near
_sanitize_folder_name(folder_paths.get_output_directory(), folder_name) to
include the overwrite flag (using filename_prefix, images, and output_dir as
before).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 594d24b4-cd2e-452b-8337-ed437c4fcb7c
📒 Files selected for processing (1)
comfy_extras/nodes_dataset.py
| mode = mode[0] | ||
|
|
||
| output_dir = os.path.join(folder_paths.get_output_directory(), folder_name) | ||
| saved_files = save_images_to_folder(images, output_dir, filename_prefix, mode=='overwrite') | ||
| output_dir = _sanitize_folder_name(folder_paths.get_output_directory(), folder_name) | ||
| saved_files = save_images_to_folder(images, output_dir, filename_prefix) |
There was a problem hiding this comment.
mode is extracted but never forwarded — the increment option is a no-op.
mode = mode[0] is computed but save_images_to_folder(...) is called without overwrite, so it always uses the default overwrite=True. Selecting "increment" silently overwrites files instead of incrementing filenames.
🐛 Proposed fix
- saved_files = save_images_to_folder(images, output_dir, filename_prefix)
+ saved_files = save_images_to_folder(images, output_dir, filename_prefix, overwrite=(mode == "overwrite"))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@comfy_extras/nodes_dataset.py` around lines 260 - 263, mode is extracted into
mode = mode[0] but never forwarded, so the "increment" option is a no-op; change
the call to save_images_to_folder(...) to pass an explicit overwrite argument
derived from mode (e.g., overwrite = True when mode == "overwrite", False when
mode == "increment") so save_images_to_folder receives the intent from mode;
update the save_images_to_folder(...) invocation near
_sanitize_folder_name(folder_paths.get_output_directory(), folder_name) to
include the overwrite flag (using filename_prefix, images, and output_dir as
before).
No description provided.