security: prevent path traversal in dataset nodes and use weights_only for torch.load#14167
Conversation
Adds _sanitize_folder_name() helper to validate that folder names resolve within the expected base directory, preventing path traversal attacks that could: - Write files to arbitrary locations (SaveTrainingDataset) - Create directories anywhere (SaveImageDataSetToFolderNode) - Load malicious pickles from unexpected paths (LoadTrainingDataset) Also adds weights_only=True to torch.load() in LoadTrainingDataset to prevent arbitrary code execution via pickle deserialization.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR hardens the dataset module against two security issues: path traversal attacks and unsafe PyTorch deserialization. A new 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@comfy_extras/nodes_dataset.py`:
- Around line 15-19: The _sanitize_folder_name helper's startswith check is
bypassable (e.g., ../output_evil) so replace the brittle prefix test with a
proper path-boundary check: compute absolute base (base_abs =
os.path.abspath(base_dir)) and resolved path, then ensure
os.path.commonpath([base_abs, resolved]) == base_abs (or compare using path
separator boundary by adding os.path.sep) and raise ValueError if not; update
_sanitize_folder_name to use this commonpath-based validation so sibling
directories that merely share a string prefix cannot escape the base.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e075255f-5d8c-4018-a102-2cb6efcf26f7
📒 Files selected for processing (1)
comfy_extras/nodes_dataset.py
| def _sanitize_folder_name(base_dir, folder_name): | ||
| resolved = os.path.abspath(os.path.join(base_dir, folder_name)) | ||
| if not resolved.startswith(os.path.abspath(base_dir)): | ||
| raise ValueError("Path traversal detected in folder_name") | ||
| return resolved |
There was a problem hiding this comment.
startswith boundary check is bypassable — escapes to sibling directories.
The prefix check passes for any sibling path that shares the base name as a string prefix. For base_dir = /home/user/output, a folder_name of ../output_evil resolves to /home/user/output_evil, which satisfies startswith("/home/user/output") yet lives outside the base. That re-opens the very traversal this helper is meant to block (writing files / creating dirs in <base>_*).
Anchor the comparison on a path separator boundary (or use os.path.commonpath).
🛡️ Proposed fix
def _sanitize_folder_name(base_dir, folder_name):
- resolved = os.path.abspath(os.path.join(base_dir, folder_name))
- if not resolved.startswith(os.path.abspath(base_dir)):
- raise ValueError("Path traversal detected in folder_name")
- return resolved
+ base = os.path.abspath(base_dir)
+ resolved = os.path.abspath(os.path.join(base, folder_name))
+ if resolved != base and not resolved.startswith(base + os.sep):
+ raise ValueError("Path traversal detected in folder_name")
+ return resolved🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@comfy_extras/nodes_dataset.py` around lines 15 - 19, The
_sanitize_folder_name helper's startswith check is bypassable (e.g.,
../output_evil) so replace the brittle prefix test with a proper path-boundary
check: compute absolute base (base_abs = os.path.abspath(base_dir)) and resolved
path, then ensure os.path.commonpath([base_abs, resolved]) == base_abs (or
compare using path separator boundary by adding os.path.sep) and raise
ValueError if not; update _sanitize_folder_name to use this commonpath-based
validation so sibling directories that merely share a string prefix cannot
escape the base.
This PR addresses path traversal vulnerabilities in dataset loading nodes and switches
torch.loadto useweights_only=Truefor safer deserialization.