Jessie/video watermarking #259
base: master
Conversation
📒 Files selected for processing (72)
Walkthrough
Adds a new NVENC/FFmpeg GPU benchmarking toolchain plus many new GPU/JAX/ONNX/Torch benchmark suites, CI/tooling/editor config updates, a submodule URL update, and widespread stylistic/input-generator edits across benchmarks.
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant Shell as run_nvenc_bench.sh
participant Bench as gpu_bench.py
participant FFmpeg
participant GPU as NVIDIA
participant FS as FileSystem
User->>Shell: start benchmark
Shell->>GPU: query hardware (nvidia-smi)
Shell->>Bench: invoke gpu_bench.py (probe ffmpeg)
Shell->>FS: ensure sample video exists (create if missing)
Shell->>Bench: pass trial list
loop for each trial
Bench->>FFmpeg: probe capabilities (encoders/filters)
Bench->>Bench: plan filter graph (GPU vs CPU path)
Bench->>FFmpeg: spawn transcode command
FFmpeg->>GPU: use hw accel (if planned)
FFmpeg-->>Bench: stream logs/progress
Bench->>Bench: parse_progress, collect metrics
end
Bench-->>User: write JSON/CSV summary
sequenceDiagram
participant FaaS as Handler
participant Storage
participant CPU
participant GPU
participant Encoder as FFmpeg
FaaS->>Storage: download input video
FaaS->>CPU: decode frames (or request hw decode)
CPU->>GPU: upload frames / convert to tensors
GPU-->>GPU: run conv2d + ReLU compute
GPU->>CPU: download processed frames
FaaS->>Encoder: invoke ffmpeg to re-encode/overlay/scale
Encoder->>Storage: upload output
FaaS-->>FaaS: return measurements (download/compute/upload)
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
added video watermarking benchmark
Actionable comments posted: 2
🧹 Nitpick comments (3)
watermarking_bench/.gitkeep (1)
1-1: .gitkeep may be redundant if the directory contains other files.
Since the PR already adds multiple files to watermarking_bench/ (gpu_bench.py, documentation, etc.), the .gitkeep placeholder is unnecessary—the directory will be preserved by these files alone. You can safely remove this.
watermarking_bench/gpu_bench.py (2)
32-40: Consider splitting statements that share a line.
Lines 37-39 use semicolons to place multiple statements on a single line, which slightly reduces readability.
Apply this diff for improved clarity:
- m = re.search(r"fps=\s*([0-9]+(?:\.[0-9]+)?)", last); fps = float(m.group(1)) if m else None - m = re.search(r"speed=\s*([0-9]+(?:\.[0-9]+)?)x", last); speed = float(m.group(1)) if m else None - m = re.search(r"frame=\s*([0-9]+)", last); frames = int(m.group(1)) if m else None + m = re.search(r"fps=\s*([0-9]+(?:\.[0-9]+)?)", last) + fps = float(m.group(1)) if m else None + m = re.search(r"speed=\s*([0-9]+(?:\.[0-9]+)?)x", last) + speed = float(m.group(1)) if m else None + m = re.search(r"frame=\s*([0-9]+)", last) + frames = int(m.group(1)) if m else None
245-293: Consider splitting the cleanup statements for clarity.
Lines 275-276 compress the temporary file cleanup into a single line. While functional, separating these statements would improve readability.
Apply this diff:
  idx += 1
- try: os.remove(outp)
- except OSError: pass
+ try:
+     os.remove(outp)
+ except OSError:
+     pass
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
- watermarking_bench/results.csv is excluded by !**/*.csv
- watermarking_bench/results_long.csv is excluded by !**/*.csv
- watermarking_bench/sample.mp4 is excluded by !**/*.mp4
- watermarking_bench/watermark.png is excluded by !**/*.png
📒 Files selected for processing (5)
- watermarking_bench/.gitkeep (1 hunks)
- watermarking_bench/gpu_bench.py (1 hunks)
- watermarking_bench/read.me (1 hunks)
- watermarking_bench/run_nvenc_bench.sh (1 hunks)
- watermarking_bench/watermarking_readme.md (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
watermarking_bench/gpu_bench.py (1)
benchmarks/200.multimedia/220.video-processing/python/function.py (1)
watermark (38-47)
🪛 Ruff (0.14.3)
watermarking_bench/gpu_bench.py
1-1: Shebang is present but file is not executable
(EXE001)
14-14: subprocess call: check for execution of untrusted input
(S603)
29-29: Do not catch blind exception: Exception
(BLE001)
37-37: Multiple statements on one line (semicolon)
(E702)
38-38: Multiple statements on one line (semicolon)
(E702)
39-39: Multiple statements on one line (semicolon)
(E702)
149-149: Avoid specifying long messages outside the exception class
(TRY003)
275-275: Multiple statements on one line (colon)
(E701)
276-276: Multiple statements on one line (colon)
(E701)
🔇 Additional comments (10)
watermarking_bench/watermarking_readme.md (1)
1-9: LGTM! Clear and helpful documentation.
The instructions are concise and cover the common use cases effectively.
watermarking_bench/read.me (1)
1-3: LGTM! Concise usage instructions.
The commands are clear and cover the essential use cases.
watermarking_bench/gpu_bench.py (5)
7-22: LGTM! Well-designed helper functions.
The utility functions are clean and appropriate for the task. The subprocess call is safe since command arguments are controlled by the script logic, not untrusted user input.
24-30: LGTM! Appropriate error handling for optional GPU info.
The broad exception catch is acceptable here since nvidia-smi availability is optional, and the function gracefully degrades by returning None values.
44-130: LGTM! Comprehensive filter graph construction.
The function effectively handles multiple scenarios with proper GPU/CPU bridging. The explicit use of hwdownload and hwupload_cuda for format conversion is good practice and aligns with FFmpeg best practices for CUDA filters. A sketch of a command this path produces is shown below.
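For reference, here is a minimal sketch of the argument list that transcode_once() and build_vf_or_complex() assemble for a GPU-decoded, GPU-scaled trial without a watermark. It assumes an NVENC/CUDA-enabled FFmpeg build with scale_npp available; the file names are placeholders, not files shipped with the PR.

```python
# Sketch only: mirrors the GPU decode + scale_npp + h264_nvenc path built by gpu_bench.py.
args = [
    "ffmpeg", "-hide_banner", "-y", "-vsync", "0",
    # decode on the GPU and keep frames in CUDA memory
    "-hwaccel", "cuda", "-hwaccel_output_format", "cuda", "-extra_hw_frames", "16",
    "-init_hw_device", "cuda=cuda", "-filter_hw_device", "cuda",
    "-i", "sample.mp4",                      # placeholder input
    # GPU scaling; the CPU fallback is "hwdownload,format=nv12,scale=...,hwupload_cuda"
    "-vf", "scale_npp=1920:1080",
    # NVENC encode settings used by the default trials
    "-c:v", "h264_nvenc", "-b:v", "5M", "-preset", "p5", "-rc", "vbr",
    "-movflags", "+faststart", "-c:a", "copy",
    "out.mp4",                               # placeholder output
]
```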
134-198: LGTM! Well-structured transcoding function.
The function properly orchestrates the FFmpeg command construction, execution, and metric collection. The detailed error message on line 149 (flagged by static analysis) is actually helpful for debugging, providing both the failure output and the command arguments.
200-244: LGTM! Comprehensive benchmark orchestration.
The main function effectively handles argument parsing, warmup execution, and trial iteration. The default trials include av1_nvenc, which will fail on systems without AV1 support, but this is caught by the encoder availability check in transcode_once(), so the error handling is appropriate.
watermarking_bench/run_nvenc_bench.sh (3)
1-26: LGTM! Robust script setup and validation.
The script properly validates prerequisites and provides helpful diagnostic output. The use of set -euo pipefail ensures failures are caught early.
269-278: LGTM! Helpful sample generation for testing.
The automatic generation of a 4K test video with audio is convenient for users who want to run the benchmark without providing their own input. The encoding parameters are reasonable for a test sample.
280-300: LGTM! Clear benchmark execution and results preview.
The benchmark execution is well-structured with timestamped CSV output and a helpful preview of results. The trial selection excludes av1_nvenc, which improves compatibility with systems that lack AV1 encoder support.
@@ -0,0 +1,293 @@
#!/usr/bin/env python3
Set executable permission on the file.
The shebang indicates this script should be directly executable, but the file permissions aren't set. Ensure chmod +x is applied to match the usage shown in the documentation.
Apply this fix:
chmod +x watermarking_bench/gpu_bench.py
🧰 Tools
🪛 Ruff (0.14.3)
1-1: Shebang is present but file is not executable
(EXE001)
🤖 Prompt for AI Agents
In watermarking_bench/gpu_bench.py at line 1 the file contains a shebang but
lacks executable permission; make the script executable by setting the execute
bit (e.g., run chmod +x watermarking_bench/gpu_bench.py) so it can be run
directly as indicated in the docs and CI.
# --- Make or update a working gpu_bench.py (GPU-first, safe bridges, skip missing encoders) ---
cat > gpu_bench.py <<'PY'
#!/usr/bin/env python3
import argparse, datetime, json, os, re, shutil, subprocess, sys, tempfile, csv
from typing import List, Dict, Any, Optional, Tuple

def which_ffmpeg() -> str:
    p = shutil.which("ffmpeg")
    if not p:
        sys.exit("ffmpeg not found on PATH. Use Docker image with NVENC or install FFmpeg with NVENC.")
    return p

def run(cmd: List[str]) -> subprocess.CompletedProcess:
    return subprocess.run(cmd, stdin=subprocess.DEVNULL, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)

def has_encoder(ffmpeg: str, enc: str) -> bool:
    out = run([ffmpeg, "-hide_banner", "-encoders"]).stdout
    return re.search(rf"\b{re.escape(enc)}\b", out) is not None

def has_filter(ffmpeg: str, name: str) -> bool:
    out = run([ffmpeg, "-hide_banner", "-filters"]).stdout
    return (f" {name} " in out)

def gpu_info() -> Dict[str, Any]:
    try:
        out = run(["nvidia-smi", "--query-gpu=name,memory.total,driver_version", "--format=csv,noheader,nounits"]).stdout.strip()
        name, mem, drv = [x.strip() for x in out.splitlines()[0].split(",")]
        return {"name": name, "memory_total_mb": int(mem), "driver_version": drv}
    except Exception:
        return {"name": None, "memory_total_mb": None, "driver_version": None}

def parse_progress(log: str) -> Dict[str, Any]:
    lines = [ln for ln in log.splitlines() if ("fps=" in ln or "speed=" in ln or "frame=" in ln)]
    fps = speed = frames = None
    if lines:
        last = lines[-1]
        m = re.search(r"fps=\s*([0-9]+(?:\.[0-9]+)?)", last); fps = float(m.group(1)) if m else None
        m = re.search(r"speed=\s*([0-9]+(?:\.[0-9]+)?)x", last); speed = float(m.group(1)) if m else None
        m = re.search(r"frame=\s*([0-9]+)", last); frames = int(m.group(1)) if m else None
    return {"fps": fps, "speed_x": speed, "frames": frames}

def build_vf_or_complex(ffmpeg: str, scale: Optional[str], wm_path: Optional[str], overlay: str, want_gpu_decode: bool) -> Tuple[List[str], str]:
    used = []
    vf_args: List[str] = []
    complex_graph = ""
    have_scale_npp = has_filter(ffmpeg, "scale_npp")
    have_scale_cuda = has_filter(ffmpeg, "scale_cuda")
    have_overlay_cuda = has_filter(ffmpeg, "overlay_cuda")
    if not wm_path:
        if scale:
            if want_gpu_decode and have_scale_npp:
                vf_args = ["-vf", f"scale_npp={scale}"]; used.append("scale_npp")
            elif want_gpu_decode and have_scale_cuda:
                vf_args = ["-vf", f"scale_cuda={scale}"]; used.append("scale_cuda")
            else:
                vf_args = ["-vf", f"hwdownload,format=nv12,scale={scale},hwupload_cuda"]
                used.append("scale(cpu)+hwdownload+hwupload_cuda")
        return (vf_args, "+".join(used))
    # watermark path
    if want_gpu_decode and have_overlay_cuda:
        if scale and have_scale_npp:
            complex_graph = f"[0:v]scale_npp={scale}[v0];[v0][1:v]overlay_cuda={overlay}[vout]"
            used += ["scale_npp", "overlay_cuda"]
        elif scale and have_scale_cuda:
            complex_graph = f"[0:v]scale_cuda={scale}[v0];[v0][1:v]overlay_cuda={overlay}[vout]"
            used += ["scale_cuda", "overlay_cuda"]
        elif scale:
            complex_graph = f"[0:v]hwdownload,format=nv12,scale={scale},hwupload_cuda[v0];[v0][1:v]overlay_cuda={overlay}[vout]"
            used += ["scale(cpu)+hwdownload+hwupload_cuda", "overlay_cuda"]
        else:
            complex_graph = f"[0:v][1:v]overlay_cuda={overlay}[vout]"
            used += ["overlay_cuda"]
        return (["-filter_complex", complex_graph, "-map", "[vout]"], "+".join(used))
    # CPU overlay fallback (explicit bridges)
    if scale and want_gpu_decode and (have_scale_npp or have_scale_cuda):
        scaler = "scale_npp" if have_scale_npp else "scale_cuda"
        complex_graph = (
            f"[0:v]{scaler}={scale}[v0gpu];"
            f"[v0gpu]hwdownload,format=nv12[v0cpu];"
            f"[v0cpu][1:v]overlay={overlay}[mix];"
            f"[mix]hwupload_cuda[vout]"
        )
        used += [scaler, "hwdownload+overlay(cpu)+hwupload_cuda"]
    elif scale:
        complex_graph = (
            f"[0:v]hwdownload,format=nv12,scale={scale}[v0cpu];"
            f"[v0cpu][1:v]overlay={overlay}[mix];"
            f"[mix]hwupload_cuda[vout]"
        )
        used += ["scale(cpu)+overlay(cpu)+hwupload_cuda"]
    else:
        complex_graph = (
            f"[0:v]hwdownload,format=nv12[v0cpu];"
            f"[v0cpu][1:v]overlay={overlay}[mix];"
            f"[mix]hwupload_cuda[vout]"
        )
        used += ["overlay(cpu)+hwupload_cuda"]
    return (["-filter_complex", complex_graph, "-map", "[vout]"], "+".join(used))

def transcode_once(ffmpeg: str, inp: str, outp: str, codec: str, bitrate: str, preset: str,
                   duration: Optional[float], scale: Optional[str], wm_path: Optional[str],
                   overlay_pos: str, decode_mode: str = "gpu") -> Dict[str, Any]:
    if not has_encoder(ffmpeg, codec):
        raise RuntimeError(f"encoder '{codec}' not available; check your ffmpeg build (NVENC/AV1).")
    want_gpu_decode = (decode_mode == "gpu")
    args = [ffmpeg, "-hide_banner", "-y", "-vsync", "0"]
    if want_gpu_decode:
        args += ["-hwaccel", "cuda", "-hwaccel_output_format", "cuda", "-extra_hw_frames", "16",
                 "-init_hw_device", "cuda=cuda", "-filter_hw_device", "cuda"]
    args += ["-i", inp]
    if wm_path:
        args += ["-loop", "1", "-i", wm_path]
    if duration:
        args += ["-t", str(duration)]
    filt_args, filter_used = build_vf_or_complex(ffmpeg, scale, wm_path, overlay_pos, want_gpu_decode)
    args += filt_args
    args += ["-c:v", codec, "-b:v", bitrate, "-preset", preset, "-rc", "vbr", "-movflags", "+faststart"]
    args += ["-c:a", "copy"]
    args += [outp]
    t0 = datetime.datetime.now()
    proc = run(args)
    t1 = datetime.datetime.now()
    if proc.returncode != 0:
        raise RuntimeError("ffmpeg failed:\n" + proc.stdout + f"\n\nARGS:\n{' '.join(args)}")
    parsed = parse_progress(proc.stdout)
    size = os.path.getsize(outp) if os.path.exists(outp) else 0
    return {
        "args": args,
        "filter_used": filter_used,
        "stdout_tail": "\n".join(proc.stdout.splitlines()[-15:]),
        "compute_time_us": (t1 - t0) / datetime.timedelta(microseconds=1),
        "fps": parsed["fps"],
        "speed_x": parsed["speed_x"],
        "frames": parsed["frames"],
        "output_size_bytes": size
    }

def main():
    ap = argparse.ArgumentParser(description="GPU NVENC benchmark.")
    ap.add_argument("--input", required=True)
    ap.add_argument("--duration", type=float, default=None)
    ap.add_argument("--repeat", type=int, default=1)
    ap.add_argument("--warmup", action="store_true")
    ap.add_argument("--csv", default=None)
    ap.add_argument("--watermark", default=None)
    ap.add_argument("--overlay", default="main_w/2-overlay_w/2:main_h/2-overlay_h/2")
    ap.add_argument("--decode", choices=["gpu", "cpu"], default="gpu")
    ap.add_argument("--trials", nargs="+", default=[
        "codec=h264_nvenc,bitrate=5M,preset=p5",
        "codec=h264_nvenc,bitrate=12M,preset=p1,scale=1920:1080",
        "codec=hevc_nvenc,bitrate=6M,preset=p4",
        # "codec=av1_nvenc,bitrate=3M,preset=p5",  # only if available
    ])
    args = ap.parse_args()
    ffmpeg = which_ffmpeg()
    gi = gpu_info()
    def parse_trial(s: str) -> Dict[str, str]:
        d: Dict[str, str] = {}
        for kv in s.split(","):
            k, v = kv.split("=", 1); d[k.strip()] = v.strip()
        return d
    trial_specs = [parse_trial(s) for s in args.trials]
    # Warmup with first available encoder
    if args.warmup:
        warm = next((t for t in trial_specs if has_encoder(ffmpeg, t.get("codec", "h264_nvenc"))), None)
        if warm:
            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=True) as tmp:
                _ = transcode_once(ffmpeg, args.input, tmp.name,
                                   warm.get("codec", "h264_nvenc"),
                                   warm.get("bitrate", "5M"),
                                   warm.get("preset", "p5"),
                                   args.duration,
                                   warm.get("scale"),
                                   args.watermark,
                                   args.overlay,
                                   args.decode)
    results = []; idx = 0
    for spec in trial_specs:
        for _ in range(args.repeat):
            if not has_encoder(ffmpeg, spec.get("codec", "h264_nvenc")):
                results.append({
                    "trial_index": idx, "codec": spec.get("codec"), "bitrate": spec.get("bitrate"),
                    "preset": spec.get("preset"), "scale_filter": "", "fps": None, "speed_x": None,
                    "frames": None, "compute_time_us": 0, "output_size_bytes": 0,
                    "stdout_tail": "SKIPPED: encoder not available", "argv": "", "status": "skipped"
                }); idx += 1; continue
            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as tmp:
                outp = tmp.name
            res = transcode_once(ffmpeg, args.input, outp,
                                 spec.get("codec", "h264_nvenc"),
                                 spec.get("bitrate", "5M"),
                                 spec.get("preset", "p5"),
                                 args.duration,
                                 spec.get("scale"),
                                 args.watermark,
                                 args.overlay,
                                 args.decode)
            results.append({
                "trial_index": idx, "codec": spec.get("codec"), "bitrate": spec.get("bitrate"),
                "preset": spec.get("preset"), "scale_filter": res["filter_used"], "fps": res["fps"],
                "speed_x": res["speed_x"], "frames": res["frames"],
                "compute_time_us": res["compute_time_us"], "output_size_bytes": res["output_size_bytes"],
                "stdout_tail": res["stdout_tail"], "argv": " ".join(res["args"]), "status": "ok"
            })
            idx += 1
            try: os.remove(outp)
            except OSError: pass
    report = {"gpu": gi, "ffmpeg_path": ffmpeg, "trial_count": len(results), "results": results}
    print(json.dumps(report, indent=2))
    if args.csv and results:
        with open(args.csv, "w", newline="") as f:
            w = csv.DictWriter(f, fieldnames=list(results[0].keys()))
            w.writeheader(); w.writerows(results)

if __name__ == "__main__":
    main()
PY
🛠️ Refactor suggestion | 🟠 Major
Eliminate code duplication between embedded and standalone scripts.
The Python benchmarking code is duplicated here (237 lines embedded) and in the standalone gpu_bench.py file. This creates maintenance burden and increases the risk of the two versions diverging—which has already begun (the embedded version has av1_nvenc commented out on line 192 and includes different skipping logic on lines 225-231).
Consider one of these approaches:
Option 1 (recommended): Ship gpu_bench.py as a standalone script and invoke it:
-cat > gpu_bench.py <<'PY'
-#!/usr/bin/env python3
-[... entire script ...]
-PY
-chmod +x gpu_bench.py
+# Ensure gpu_bench.py is in the same directory
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+GPU_BENCH="${SCRIPT_DIR}/gpu_bench.py"
+
+if [[ ! -f "$GPU_BENCH" ]]; then
+ echo "Error: gpu_bench.py not found in ${SCRIPT_DIR}"
+ exit 1
+fi
+chmod +x "$GPU_BENCH"Then update the execution line:
-./gpu_bench.py \
+"$GPU_BENCH" \Option 2: If embedding is required for distribution, generate it from the canonical source during build/release rather than maintaining two copies manually.
Committable suggestion skipped: line range outside the PR's diff.
Merge benchmarks by Russell and Yuxuan into Development
This reverts commit 4fca4aa.
added recommender benchmark
Actionable comments posted: 6
🧹 Nitpick comments (5)
benchmarks/200.multimedia/220.video-processing/input.py (1)
19-43: LGTM! Well-implemented with good error handling.
The function correctly:
- Tracks the last uploaded file with clear variable naming
- Handles the case when no .mp4 files are found
- Provides comprehensive documentation through comments
- Configures the GPU filter operation with appropriate parameters
The unused size and nosql_func parameters are likely required by the benchmark framework interface. Based on learnings, input validation is not required since benchmarks are invoked by maintainers.
Note: This implementation is nearly identical to benchmarks/200.multimedia/220.video-processing-gpu/input.py (which has critical bugs). Consider whether both files are necessary or if they should share common logic to avoid drift.
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (4)
3-12: Tighten imports and ensure external GPU deps are provisioned
subprocess is no longer used anywhere; keeping it is noise and may confuse future readers. numpy, cv2, and torch are hard runtime dependencies of this function. Make sure they are either in this benchmark's requirements / Docker image or clearly documented, otherwise the handler will fail at import time in some environments.
Consider:
-import subprocess  # not strictly needed anymore, but you can remove it
-import stat        # can also be removed if you drop ffmpeg entirely
+import stat        # can also be removed if you drop ffmpeg entirely
…and verify that the environment used by this benchmark explicitly includes numpy, opencv-python, and torch (or their equivalent packages).
42-49: Rename unused loop index for clarity / lint compliance
i is not used in the loop body, which Ruff flags (B007) and mildly hurts readability.
- for i in range(max_frames):
+ for _ in range(max_frames):
This communicates intent and silences the warning without behavior change.
52-57: Fix ambiguous dash in comment and unused shape components
- The 0–255 comment uses an en dash instead of a hyphen (RUF003).
- T and C are never used (RUF059); better to avoid binding them.
-    # Stack into (T, H, W, C)
-    video_np = np.stack(frames, axis=0)  # uint8, 0–255
-    T, H, W, C = video_np.shape
+    # Stack into (T, H, W, C)
+    video_np = np.stack(frames, axis=0)  # uint8, 0-255
+    _, H, W, _ = video_np.shape
This keeps the doc value of the comment while aligning with the linter and avoiding unused variables.
154-160: Consider dropping the ffmpeg chmod block if ffmpeg is no longer used
The handler no longer shells out to an embedded ffmpeg binary; this block now just does a stat/chmod in a try/except, usually on a non-existent file.
If you're confident no path still relies on that embedded binary, simplifying by removing this block will cut a tiny bit of cold-start and per-invocation overhead and reduce cognitive load:
- ffmpeg_binary = os.path.join(SCRIPT_DIR, "ffmpeg", "ffmpeg")
- try:
-     st = os.stat(ffmpeg_binary)
-     os.chmod(ffmpeg_binary, st.st_mode | stat.S_IEXEC)
- except OSError:
-     # Ignore if ffmpeg is not present or filesystem is read-only.
-     pass
If you do still need the binary for other operations, keeping this as-is is fine.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- benchmarks/200.multimedia/220.video-processing-gpu/resources/watermark.png is excluded by !**/*.png
📒 Files selected for processing (7)
- benchmarks/200.multimedia/220.video-processing-gpu/config.json (1 hunks)
- benchmarks/200.multimedia/220.video-processing-gpu/init.sh (1 hunks)
- benchmarks/200.multimedia/220.video-processing-gpu/input.py (1 hunks)
- benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (1 hunks)
- benchmarks/200.multimedia/220.video-processing-gpu/python/requirements.txt (1 hunks)
- benchmarks/200.multimedia/220.video-processing/input.py (1 hunks)
- experiments.json (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- benchmarks/200.multimedia/220.video-processing-gpu/config.json
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-19T16:15:01.811Z
Learnt from: mcopik
Repo: spcl/serverless-benchmarks PR: 99
File: benchmarks/000.microbenchmarks/010.sleep/cpp/main.cpp:11-0
Timestamp: 2025-06-19T16:15:01.811Z
Learning: In the serverless-benchmarks project, input validation for benchmark functions is not required since the benchmarks are invoked directly by the maintainers who control the input values. This applies to functions like the sleep benchmark in benchmarks/000.microbenchmarks/010.sleep/cpp/main.cpp.
Applied to files:
benchmarks/200.multimedia/220.video-processing-gpu/init.sh
🧬 Code graph analysis (2)
benchmarks/200.multimedia/220.video-processing-gpu/input.py (1)
benchmarks/200.multimedia/220.video-processing/input.py (2)
buckets_count (4-6)
generate_input (19-43)
benchmarks/200.multimedia/220.video-processing/input.py (1)
benchmarks/200.multimedia/210.thumbnailer/input.py (1)
generate_input (15-29)
🪛 Gitleaks (8.29.0)
experiments.json
[high] 266-266: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 273-273: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🪛 Ruff (0.14.5)
benchmarks/200.multimedia/220.video-processing-gpu/input.py
15-15: Unused function argument: size
(ARG001)
15-15: Unused function argument: nosql_func
(ARG001)
benchmarks/200.multimedia/220.video-processing/input.py
19-19: Unused function argument: size
(ARG001)
19-19: Unused function argument: nosql_func
(ARG001)
29-29: Avoid specifying long messages outside the exception class
(TRY003)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py
33-33: Avoid specifying long messages outside the exception class
(TRY003)
42-42: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
53-53: Avoid specifying long messages outside the exception class
(TRY003)
56-56: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?
(RUF003)
57-57: Unpacked variable T is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
57-57: Unpacked variable C is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
88-88: Probable insecure usage of temporary file or directory: "/tmp/processed-"
(S108)
93-93: Avoid specifying long messages outside the exception class
(TRY003)
146-148: Avoid specifying long messages outside the exception class
(TRY003)
150-150: Probable insecure usage of temporary file or directory: "/tmp/"
(S108)
🪛 Shellcheck (0.11.0)
benchmarks/200.multimedia/220.video-processing-gpu/init.sh
[warning] 4-4: DIR appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 5-5: VERBOSE appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 6-6: TARGET_ARCHITECTURE appears unused. Verify use (or export if used externally).
(SC2034)
🔇 Additional comments (4)
benchmarks/200.multimedia/220.video-processing-gpu/init.sh (1)
1-10: LGTM!
The script correctly implements a no-op initialization that conforms to the SeBS framework interface. The unused variables are intentional as documented in the comment.
benchmarks/200.multimedia/220.video-processing-gpu/input.py (1)
3-4: LGTM!
The buckets_count() function correctly returns one input and one output bucket.
benchmarks/200.multimedia/220.video-processing/input.py (2)
1-6: LGTM!
The imports and buckets_count() function are correctly implemented with clear documentation.
9-18: LGTM!
The docstring accurately describes the function and its parameters.
def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
    for file in glob.glob(os.path.join(data_dir, '*.mp4')):
        img = os.path.relpath(file, data_dir)
        upload_func(0, img, file)
    #TODO: multiple datasets
    input_config = {'object': {}, 'bucket': {}}
    input_config['object']['key'] = img
    # The function supports only "gpu-filter" at the moment.
    input_config['object']['op'] = 'gpu-filter'
    input_config['object']['duration'] = 1
    input_config['bucket']['bucket'] = benchmarks_bucket
    input_config['bucket']['input'] = input_paths[0]
    input_config['bucket']['output'] = output_paths[0]
    return input_config
Critical: Add error handling for missing files and improve variable naming.
This function has several issues:
- NameError if no .mp4 files exist: If the glob finds no files, the loop never executes and
img is never defined, causing a NameError on line 21.
- Misleading variable name: img suggests an image file, but this processes video files.
- Missing parameter: The num_iters field present in the similar function at benchmarks/200.multimedia/220.video-processing/input.py (line 37) is missing here.
Apply this diff to fix these issues (matching the implementation in 220.video-processing/input.py):
def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
+ last_key = None
+
+ # Upload all .mp4 files from data_dir to bucket 0
for file in glob.glob(os.path.join(data_dir, '*.mp4')):
- img = os.path.relpath(file, data_dir)
- upload_func(0, img, file)
+ key = os.path.relpath(file, data_dir)
+ upload_func(0, key, file)
+ last_key = key
+
+ if last_key is None:
+ raise RuntimeError(f"No .mp4 files found in {data_dir}")
+
#TODO: multiple datasets
input_config = {'object': {}, 'bucket': {}}
- input_config['object']['key'] = img
+ # Use the last uploaded file as the input object
+ input_config['object']['key'] = last_key
# The function supports only "gpu-filter" at the moment.
- input_config['object']['op'] = 'gpu-filter'
- input_config['object']['duration'] = 1
+ input_config['object']['op'] = 'gpu-filter' # must match your handler's operations dict
+ input_config['object']['duration'] = 1 # seconds of video to process
+ input_config['object']['num_iters'] = 10 # extra param for GPU workload intensity
+
 input_config['bucket']['bucket'] = benchmarks_bucket
🧰 Tools
🪛 Ruff (0.14.5)
15-15: Unused function argument: size
(ARG001)
15-15: Unused function argument: nosql_func
(ARG001)
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/input.py around lines 15
to 28, handle the case where no .mp4 files are found (avoid NameError) by
checking glob results and raising/logging an error or returning None, rename the
variable from img to a clearer video_relpath or video_key, ensure you call
upload_func for the selected video(s) with the correct args, and add the missing
'num_iters' field (set to 1 like the non-GPU implementation) into input_config;
mirror the control flow and field names used in
benchmarks/200.multimedia/220.video-processing/input.py so input_config always
has object.key, object.op ('gpu-filter'), object.duration, object.num_iters, and
bucket.* set, and fail fast if no video files are available.
fps = cap.get(cv2.CAP_PROP_FPS)
if not fps or fps <= 0:
    fps = 25.0  # fallback

max_frames = int(fps * duration)
frames = []
Guard against NaN/invalid FPS values before computing max_frames
If cap.get(cv2.CAP_PROP_FPS) returns NaN (which some containers/drivers do), the condition if not fps or fps <= 0 will not trigger, and int(fps * duration) will raise at runtime.
You can defensively handle this:
+import math
...
- fps = cap.get(cv2.CAP_PROP_FPS)
- if not fps or fps <= 0:
+ fps = cap.get(cv2.CAP_PROP_FPS)
+ if not fps or fps <= 0 or math.isnan(fps):
      fps = 25.0  # fallback
This keeps the simple fallback while avoiding a crash on malformed metadata.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
fps = cap.get(cv2.CAP_PROP_FPS)
if not fps or fps <= 0:
    fps = 25.0  # fallback
max_frames = int(fps * duration)
frames = []

fps = cap.get(cv2.CAP_PROP_FPS)
if not fps or fps <= 0 or math.isnan(fps):
    fps = 25.0  # fallback
max_frames = int(fps * duration)
frames = []
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/function.py around
lines 35 to 40, fps obtained from cap.get(cv2.CAP_PROP_FPS) can be NaN or
non-finite so the current check (if not fps or fps <= 0) can miss it and cause
int(fps * duration) to crash; defend by validating fps with math.isfinite (or
math.isnan) and the existing <= 0 check, e.g. import math and set fps = fallback
when not math.isfinite(fps) or fps <= 0, then compute max_frames from the
validated fps.
# Encode processed video with OpenCV (CPU)
base = os.path.splitext(os.path.basename(video_path))[0]
out_path = f"/tmp/processed-{base}.mp4"

fourcc = cv2.VideoWriter_fourcc(*"mp4v")
Harden /tmp usage and sanitize user-controlled key
Two related issues here:
download_path = f"/tmp/{key}":- If
keycontains path separators or.., this can escape/tmpor attempt to create nested directories that don’t exist, leading to failures or even overwriting unintended files.
- If
- Static analysis (S108) is also concerned about the predictable paths
/tmp/processed-{base}.mp4and/tmp/{key}.
A minimal hardening that preserves current naming:
- base = os.path.splitext(os.path.basename(video_path))[0]
- out_path = f"/tmp/processed-{base}.mp4"
+ base = os.path.splitext(os.path.basename(video_path))[0]
+ out_path = os.path.join("/tmp", f"processed-{base}.mp4")
...
- download_path = f"/tmp/{key}"
+ # Restrict local path to a basename to avoid directory traversal and
+ # missing parent directories under /tmp.
+ local_name = os.path.basename(key)
+ download_path = os.path.join("/tmp", local_name)If you want to fully satisfy stricter security tooling, consider using tempfile.mkstemp / NamedTemporaryFile for one or both of these paths, but the basename sanitization above already removes the traversal risk while staying simple.
Also applies to: 150-177
🧰 Tools
🪛 Ruff (0.14.5)
88-88: Probable insecure usage of temporary file or directory: "/tmp/processed-"
(S108)
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/function.py around
lines 86 to 90 (and similarly lines 150-177), the code constructs predictable
/tmp paths using user-controlled values; sanitize the inputs and avoid path
traversal by taking only the basename of user-controlled key (e.g., key_base =
os.path.basename(key)) and validating/stripping any remaining suspicious
segments like '..' or os.sep, then build paths with os.path.join("/tmp",
key_base); for the processed output prefer creating a temporary file via
tempfile.NamedTemporaryFile or tempfile.mkstemp (or at minimum sanitize base =
os.path.basename(os.path.splitext(os.path.basename(video_path))[0]) and join
with /tmp) to eliminate predictable filenames and ensure the directory is
correct and exists before writing.
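As a minimal sketch of the tempfile-based alternative mentioned above (a hypothetical helper, not code from the PR; it assumes the same key and video_path values the handler already receives):

```python
# Sketch only: derive both local paths from a private temporary directory instead of
# predictable /tmp names, and strip any path components from the user-controlled key.
import os
import tempfile

def safe_local_paths(key, video_path):
    workdir = tempfile.mkdtemp(prefix="video-bench-")              # unpredictable, per-invocation directory
    download_path = os.path.join(workdir, os.path.basename(key))   # basename removes traversal risk
    base = os.path.splitext(os.path.basename(video_path))[0]
    out_path = os.path.join(workdir, f"processed-{base}.mp4")
    return download_path, out_path
```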
torch
opencv-python-headless
numpy
🧩 Analysis chain
Pin dependency versions for reproducibility and security.
The requirements file lacks version constraints, which poses reproducibility and security risks. Unpinned versions can produce non-deterministic benchmark results and may introduce breaking changes across CI/CD runs.
Additionally, torch is specified without explicit GPU support configuration—confirm whether CUDA/cuDNN dependencies are documented elsewhere in the benchmark setup, or consider pinning a CUDA-specific torch variant if GPU acceleration is required.
Consider applying version pins like this:
-torch
-opencv-python-headless
-numpy
+torch==2.0.1
+opencv-python-headless==4.8.1.78
+numpy==1.24.3
Once you've determined the appropriate versions, please verify:
- The versions are compatible with your target GPU/CUDA environment.
- Any external CUDA/cuDNN dependencies are documented in a setup guide for the benchmark.
- Run a test build to confirm all dependencies resolve without conflicts.
---
🏁 Script executed:
#!/bin/bash
# Search for all requirements files to understand version pinning patterns in codebase
echo "=== Finding all requirements.txt files ==="
find . -name "requirements*.txt" -type f | head -20
echo -e "\n=== Checking versions in other requirements files ==="
find . -name "requirements*.txt" -type f -exec sh -c 'echo "File: $1"; head -10 "$1"' _ {} \; | head -60
Length of output: 2594
🏁 Script executed:
#!/bin/bash
# Search for GPU, CUDA, cuDNN references in benchmark directory and related docs
echo "=== Searching for GPU/CUDA references in benchmark context ==="
rg -i "cuda|gpu|cudnn" benchmarks/200.multimedia/220.video-processing-gpu/ --type-list | head -5
rg -i "cuda|gpu|cudnn" benchmarks/200.multimedia/220.video-processing-gpu/ -A2 -B2 2>/dev/null | head -40
echo -e "\n=== Looking for setup documentation ==="
find benchmarks/200.multimedia/220.video-processing-gpu/ -type f \( -name "*.md" -o -name "*.txt" -o -name "setup*" -o -name "README*" \)
Length of output: 4364
🏁 Script executed:
#!/bin/bash
# Check structure of the benchmark directory and look for related docs
echo "=== Contents of video-processing-gpu benchmark directory ==="
ls -la benchmarks/200.multimedia/220.video-processing-gpu/ 2>/dev/null || echo "Directory check"
find benchmarks/200.multimedia/220.video-processing-gpu/ -type f | head -20
echo -e "\n=== Parent directory context ==="
ls -la benchmarks/200.multimedia/
Length of output: 1214
🏁 Script executed:
#!/bin/bash
# Search for any documentation about dependencies or torch in the benchmark context
echo "=== Searching for torch, dependencies, or installation docs ==="
rg -i "torch|dependencies|install|setup" benchmarks/200.multimedia/ --type md --type txt -A2 | head -50
echo -e "\n=== Checking if there's a main setup guide ==="
find benchmarks/ -maxdepth 2 -name "README*" -o -name "setup*" -o -name "INSTALL*" | head -10
cat benchmarks/README.md 2>/dev/null | head -30
Length of output: 371
🌐 Web query:
torch 2.0.1 opencv-python-headless 4.8.1 numpy 1.24 compatibility
💡 Result:
Short answer: Yes — torch 2.0.1 works with NumPy 1.24, and opencv-python-headless 4.8.1 also supports NumPy 1.24. The main incompatibility to avoid is NumPy 2.x (PyTorch wheels built against NumPy 1.x can fail under NumPy 2.x). Test by importing torch and cv2 after installing.
Recommended pip pins:
- pip install "numpy==1.24.*"
- pip install "torch==2.0.1"
- pip install "opencv-python-headless==4.8.1"
References:
- PyTorch issues describing failures when NumPy 2.x is used with PyTorch built against NumPy 1.x. [1][2]
- opencv-python-headless PyPI (wheels follow NumPy/Python compatibility). [3]
Sources:
[1] PyTorch issue: Numpy compatibility / DataLoader RuntimeError.
[2] PyTorch issue: modules compiled with NumPy 1.x incompatible with NumPy 2.x.
[3] opencv-python-headless PyPI project page.
🌐 Web query:
torch GPU setup requirements CUDA environment variables
💡 Result:
Summary — what you need for PyTorch GPU + relevant CUDA environment variables
- Requirements (high level)
- Install a PyTorch binary built for the CUDA/ROCm version you intend to use, or build PyTorch from source against your local CUDA. The pip/conda wheel's CUDA version must match the binary you select (or you must build). Verify with torch.cuda.is_available() after install. [1][3]
- NVIDIA GPU + matching NVIDIA driver installed on host (driver must support the CUDA runtime version used by the PyTorch binary). The driver + GPU must be functional (check nvidia-smi). [3]
- If building extensions or PyTorch from source you need a local CUDA toolkit (nvcc), cuDNN, NVML, etc., and to point environment variables to the toolkit and libs. [2][4]
- Important environment variables you’ll commonly set/check
- CUDA_HOME (or CUDA_PATH on Windows): root of CUDA installation (e.g. /usr/local/cuda or C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1). Used by build scripts and some tools. [2][4]
- CUDACXX: path to nvcc (e.g. ${CUDA_HOME}/bin/nvcc) when building from source. [4]
- PATH: include ${CUDA_HOME}/bin so nvcc and tools are found. [2][4]
- LD_LIBRARY_PATH (Linux) / PATH (Windows) / DYLD_LIBRARY_PATH (macOS older): include CUDA lib directories (e.g. ${CUDA_HOME}/lib64) so shared libs (libcudart, cuDNN) are found at runtime. [2][4]
- CUDA_VISIBLE_DEVICES: control which GPUs are visible to CUDA (comma-separated indices, or -1 to hide all). Useful for multi-GPU hosts. [2]
- CUDA_LAUNCH_BLOCKING=1: make CUDA calls synchronous (debugging). [2]
- PYTORCH_CUDA_ALLOC_CONF: control PyTorch CUDA allocator behavior (memory fragmentation, max split size, etc.). [2]
- PYTORCH_NO_CUDA_MEMORY_CACHING: disable PyTorch caching allocator (useful for debugging). [2]
- TORCH_CUDNN_V8_API_DISABLED / TORCH_CUDNN_V8_API_LRU_CACHE_LIMIT etc.: cuDNN-specific controls. [2]
- NVML-related (NVML_LIB_PATH) when building packages that query NVML. [4]
- Typical quick checks & steps (Linux)
- Ensure driver/GPU visible: nvidia-smi
- Ensure CUDA bin/libs on PATH/LD_LIBRARY_PATH:
export CUDA_HOME=/usr/local/cuda
export CUDACXX="${CUDA_HOME}/bin/nvcc"
export PATH="${CUDA_HOME}/bin:${PATH}"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}"
  source ~/.bashrc
- Install PyTorch GPU wheel matching desired CUDA version (or conda package with cudatoolkit). Then in Python:
import torch; print(torch.cuda.is_available(), torch.version.cuda, torch.cuda.device_count())
- Notes / gotchas
- Prebuilt PyTorch wheels include CUDA runtimes for the CUDA versions they were built for; setting CUDA_HOME / local CUDA does not make a CPU-only wheel suddenly use your system CUDA — you must install a GPU-enabled PyTorch build or build from source. [3]
- Mismatch between PyTorch’s bundled CUDA version and your GPU driver can cause torch.cuda.is_available() to be False; updating the NVIDIA driver or installing a matching PyTorch binary usually resolves this. [3]
Sources
- PyTorch CUDA environment variables and PyTorch-specific vars. [1]
- PyTorch Get Started / installation and verification notes (matching binaries, building from source). [3]
- Example CUDA env var usage and build-time vars (CUDA_HOME, CUDACXX, LD_LIBRARY_PATH, NVML paths). [4]
References
[1] PyTorch — CUDA Environment Variables.
[2] PyTorch/XLA GPU environment setup (examples of PATH/LD_LIBRARY_PATH).
[3] PyTorch Get Started / Installation & verification.
(If you want, tell me your OS, PyTorch version and how you installed it and I’ll give the exact install / env var commands and a short checklist to get torch.cuda.is_available() → True.)
Pin dependency versions to prevent NumPy 2.x incompatibility and ensure reproducibility.
Version pinning is missing, which creates reproducibility risks. Most critically, NumPy 2.x is incompatible with PyTorch wheels built against NumPy 1.x, leading to runtime failures if versions drift across CI/CD runs.
Regarding GPU support: the code correctly uses torch.cuda.is_available() to detect and enable GPU at runtime, which is the proper pattern. GPU availability depends on the environment (CUDA toolkit, driver, PyTorch build selection) rather than the requirements.txt file itself. The standard torch package handles both CPU and GPU cases gracefully.
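For reference, a minimal sketch of the detection pattern being described (placeholder tensor shape; not code copied from the benchmark):

```python
# Sketch only: choose the device at runtime; works on both CPU-only and CUDA-enabled installs.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
frames_tensor = torch.zeros(1, 3, 224, 224, device=device)  # placeholder tensor on the chosen device
```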
Consider pinning versions with NumPy <2.0 constraint:
-torch
-opencv-python-headless
-numpy
+torch==2.0.1
+opencv-python-headless==4.8.1.78
+numpy<2.0
Verify compatibility with your target environment after applying pins.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
torch
opencv-python-headless
numpy

torch==2.0.1
opencv-python-headless==4.8.1.78
numpy<2.0
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/requirements.txt
lines 1-3, dependency versions are not pinned which risks pulling NumPy 2.x
(incompatible with many PyTorch wheels) and breaks reproducibility; update the
file to pin specific, tested versions (at minimum constrain NumPy to <2.0), pin
opencv-python-headless to a known working release, and pin torch to the version
you intend to run (or document the exact wheel URL/extra-index for CUDA-enabled
builds); after changing, run CI or a local environment install to verify
compatibility with your target CUDA/toolchain and adjust pins accordingly.
| "version": "6.0" | ||
| }, | ||
| "storage": { | ||
| "access_key": "cStL2xQ1C1c2wpB_25M3QnnNO4aizEjhHEH203q_a7s", |
Remove hardcoded credentials from version control.
Lines 266 and 273 contain hardcoded MinIO storage credentials (access_key and secret_key). Storing secrets in committed code is a critical security vulnerability. These must be removed, replaced with placeholder values, or sanitized before merging.
If this file contains actual test data with development credentials, replace the values with non-sensitive placeholders:
"storage": {
- "access_key": "cStL2xQ1C1c2wpB_25M3QnnNO4aizEjhHEH203q_a7s",
+ "access_key": "PLACEHOLDER_ACCESS_KEY",
"address": "host.docker.internal:9011",
"data_volume": "minio-volume",
"input_buckets": [],
"instance_id": "0571b3121a5e06773187dff7ffc30b4c2a9f63ae7c05ff592e3c45b9b3a6d1e9",
"mapped_port": 9011,
"output_buckets": [],
- "secret_key": "22a1207b6c97d3669271e9896097e3d8fddd6160e3c75444fb0eabf8b9ef8681",
+ "secret_key": "PLACEHOLDER_SECRET_KEY",
"type": "minio",
"version": "RELEASE.2024-07-16T23-46-41Z"
}
If credentials are needed for runtime, inject them via environment variables or a secure credential store instead (a sketch follows the AI-agent prompt below).
Also applies to: 273-273
🧰 Tools
🪛 Gitleaks (8.29.0)
[high] 266-266: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🤖 Prompt for AI Agents
experiments.json lines 266 and 273: the file contains hardcoded MinIO
credentials (access_key and secret_key); remove these literal values and replace
them with non-sensitive placeholders or reference environment/config variables
(e.g., "${MINIO_ACCESS_KEY}" and "${MINIO_SECRET_KEY}") so the runtime reads
credentials from env vars or a secrets manager; update any README or deployment
docs to describe setting the required env vars and ensure the file is not
carrying real secrets (if these were real, revoke/rotate them immediately and
sanitize the repository history if necessary).
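A minimal sketch of the environment-variable approach (MINIO_ACCESS_KEY / MINIO_SECRET_KEY are illustrative names, not variables the repository currently defines):

```python
# Sketch only: read MinIO credentials from the environment at runtime instead of
# committing them to experiments.json.
import os

def minio_credentials():
    return {
        "access_key": os.environ["MINIO_ACCESS_KEY"],  # raises KeyError if unset, which is desirable
        "secret_key": os.environ["MINIO_SECRET_KEY"],
    }
```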
Actionable comments posted: 11
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
benchmarks/000.microbenchmarks/040.server-reply/python/function.py (1)
8-12: Fix socket resource leak and add error handling.
The socket is never closed, causing a resource leak. Additionally, socket operations (connect, recv) can fail but are not wrapped in error handling.
Apply this diff to fix the resource leak and add error handling:
  socket.setdefaulttimeout(20)
  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
- s.connect(addr)
- msg = s.recv(1024).decode()
- return {"result": msg}
+ try:
+     s.connect(addr)
+     msg = s.recv(1024).decode()
+     return {"result": msg}
+ except (socket.error, socket.timeout) as e:
+     return {"error": f"Socket error: {str(e)}"}
+ finally:
+     s.close()
♻️ Duplicate comments (3)
♻️ Duplicate comments (3)
benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/requirements.txt (1)
1-1: Consider pinning the JAX version for reproducibility.Same unpinned dependency issue as benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/requirements.txt.
benchmarks/600.linearalgebra/601.matmul/python/requirements.txt (1)
1-1: Critical security vulnerabilities in torch 2.4.1.This file uses the same vulnerable torch version (2.4.1) as benchmarks/600.linearalgebra/606.spmv/python/requirements.txt, exposing the benchmark to the same critical security issues including RCE and DoS vulnerabilities.
benchmarks/400.inference/413.image-classification/python/requirements.txt (1)
3-3: Critical security vulnerabilities in torch 2.4.1.This file uses the same vulnerable torch version (2.4.1) with critical security issues. This is particularly concerning for an inference benchmark that may use
torch.loadto load models.
🟠 Major comments (17)
benchmarks/000.microbenchmarks/040.server-reply/python/function.py-7-7 (1)
7-7: Add input validation for required event keys.The
event.get()calls will returnNoneif keys are missing, causingsocket.connect()to fail with a less-clear error. Validate that"ip-address"and"port"exist and are valid before attempting the connection.Apply this diff to add validation:
def handler(event): # start timing + ip_address = event.get("ip-address") + port = event.get("port") + + if not ip_address or not port: + return {"error": "Missing required fields: ip-address and port"} + - addr = (event.get("ip-address"), event.get("port")) + addr = (ip_address, port) socket.setdefaulttimeout(20)Committable suggestion skipped: line range outside the PR's diff.
benchmarks/200.multimedia/220.video-processing/input.py-24-26 (1)
24-26: Fix loop logic: all .mp4 files are uploaded, but only the last filename is stored in the configuration.Lines 24-26 iterate over all matching files, but the
imgvariable gets overwritten each iteration. When line 29 setsinput_config["object"]["key"] = img, only the last file's path is used, while all files were uploaded.Additionally, if no
.mp4files match,imgis undefined on line 29, causing a runtime error.Decide the intended behavior: (1) process only one specific file (add break or filter), or (2) support multiple files (modify config structure). The TODO comment suggests incomplete multi-dataset support.
benchmarks/200.multimedia/220.video-processing/input.py-10-18 (1)
10-18: Move docstring inside thegenerate_inputfunction.The docstring is currently placed at module level, but it documents the
generate_inputfunction parameters. Python convention requires function docstrings to be the first statement inside the function body.Apply this diff to move the docstring to the correct location:
- -""" - Generate test, small and large workload for thumbnailer. - - :param data_dir: directory where benchmark data is placed - :param size: workload size - :param input_buckets: input storage containers for this benchmark - :param output_buckets: - :param upload_func: upload function taking three params(bucket_idx, key, filepath) -""" - - def generate_input( data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func ): + """ + Generate test, small and large workload for video processing. + + :param data_dir: directory where benchmark data is placed + :param size: workload size + :param benchmarks_bucket: benchmark storage bucket + :param input_paths: input storage paths for this benchmark + :param output_paths: output storage paths for this benchmark + :param upload_func: upload function taking three params(bucket_idx, key, filepath) + :param nosql_func: nosql function for database operations + """ for file in glob.glob(os.path.join(data_dir, "*.mp4")):benchmarks/000.microbenchmarks/030.clock-synchronization/python/function.py-62-73 (1)
62-73: Inconsistent thresholds confirmed – review comment is accurate
The full function context confirms the review comment's analysis. The thresholds are indeed misaligned:
- Loop break condition (line 44-45): breaks when consecutive_failures == 7 with "Can't setup the connection"
consecutive_failures == 5This inverted logic causes a failed connection attempt (7 consecutive timeouts) to still upload a CSV file, while a run exiting with exactly 5 consecutive failures suppresses upload. The suggestion to align thresholds or use a named constant (e.g.,
MAX_CONSECUTIVE_FAILURES = 7) is appropriate.benchmarks/400.inference/413.image-classification/python/package.sh-16-17 (1)
16-17: Handle filenames with spaces or special characters.Using
xargswithout-print0can break on filenames containing spaces, newlines, or special characters.Apply this diff:
# cleaning # stripping some of the numpy libs - libgfortran-2e0d59d6.so.5.0.0 - causes issues on Azure -find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip -find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip +find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 strip +find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 stripbenchmarks/400.inference/413.image-classification/python/package.sh-25-25 (1)
25-25: Add error handling for directory change.This
cdcommand needs error handling.Apply this diff:
find . -name \*.pyc -delete -cd ${CUR_DIR} +cd "${CUR_DIR}" || exit 1 echo "Stripped size $(du -sh $1 | cut -f1)"benchmarks/400.inference/413.image-classification/python/package.sh-7-7 (1)
7-7: Add error handling for directory changes.The
cdcommand should include error handling to prevent subsequent commands from executing in an unexpected directory if the change fails.Apply this diff:
CUR_DIR=$(pwd) -cd $1 +cd "$1" || exit 1 # cleaning libsbenchmarks/400.inference/412.language-bert/python/package.sh-25-25 (1)
25-25: Add error handling for directory change.Similar to Line 7, this
cdcommand needs error handling.Apply this diff:
find . -name \*.pyc -delete -cd ${CUR_DIR} +cd "${CUR_DIR}" || exit 1 echo "Stripped size $(du -sh $1 | cut -f1)"benchmarks/400.inference/412.language-bert/python/package.sh-16-17 (1)
16-17: Handle filenames with spaces or special characters.Using
xargswithout-print0can break on filenames containing spaces, newlines, or special characters.Apply this diff:
# cleaning # stripping some of the numpy libs - libgfortran-2e0d59d6.so.5.0.0 - causes issues on Azure -find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip -find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip +find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 strip +find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 stripbenchmarks/400.inference/412.language-bert/python/package.sh-7-7 (1)
7-7: Add error handling for directory changes.The
cdcommand should include error handling to prevent subsequent commands from executing in an unexpected directory if the change fails.Apply this diff:
CUR_DIR=$(pwd) -cd $1 +cd "$1" || exit 1 # cleaning libsbenchmarks/400.inference/412.language-bert/python/package.sh-30-33 (1)
30-33: Add error handling for directory changes.Both
cdcommands in the torch-zipping block need error handling.Apply this diff:
if [ -d "$1/${TORCH_DIR}" ]; then - cd $1 + cd "$1" || exit 1 zip -qr torch.zip ${TORCH_DIR} rm -rf ${TORCH_DIR} - cd ${CUR_DIR} + cd "${CUR_DIR}" || exit 1 echo "Torch-zipped size $(du -sh $1 | cut -f1)" fibenchmarks/600.linearalgebra/603.jacobi2d/python/requirements.txt-1-1 (1)
1-1: I need to verify the PYSEC vulnerabilities mentioned in the original review before finalizing the rewritten comment.Update torch 2.4.1 to address security vulnerabilities; upgrade to 2.6.0 minimum.
Torch 2.4.1 does contain vulnerabilities, but the original characterization overstates severity. PYSEC-2024-259 involves RemoteModule deserialization RCE affecting PyTorch <=2.4.1, but is marked withdrawn as disputed.
Verified vulnerabilities:
- PYSEC-2025-41 (CVE-2025-32434): CRITICAL RCE via
torch.loadwithweights_only=True; fixed in 2.6.0- GHSA-3749-ghw9-m3mg: Low severity DoS, requires upgrade to 2.7.1+
- GHSA-887c-mr87-cxwp: Medium severity resource management, requires upgrade to 2.8.0+
Upgrade to version 2.6.0 or above to address the CRITICAL issue. Latest available is 2.9.1.
benchmarks/400.inference/413.recommendation/python/requirements.txt.3.8-1-1 (1)
1-1: Upgrade torch from 2.2.2 to 2.4.1 for Python 3.8 projects.Verification confirms torch 2.2.2 contains two active CRITICAL vulnerabilities including an RCE via torch.load with weights_only=True. PyTorch 2.5+ requires Python 3.9+, so 2.4.1 is the latest version supporting Python 3.8 compatibility. While PyTorch 2.4.1 has a known deserialization RCE (CVE-2024-48063) affecting RemoteModule, it is still preferable to 2.2.2 due to fewer vulnerabilities.
Repository currently uses: 5× torch==2.2.2, 10× torch==2.4.1, 5× torch==2.5.1. Upgrade this file to torch==2.4.1 for consistency and security.
Alternatively, if Python 3.8 support can be dropped (EOL October 2024), upgrade to torch==2.5.1+ for better security posture across all benchmarks.
benchmarks/400.inference/413.image-classification/python/init.sh-10-10 (1)
10-10: Quote variables to handle paths with spaces.The
cpcommand uses unquoted variables which will fail if paths contain spaces or special characters.Apply this diff:
-cp ${path} ${DIR} +cp "${path}" "${DIR}"benchmarks/600.linearalgebra/601.matmul/python/function.py-48-62 (1)
48-62: Include gpu_time in the measurement result.The
gpu_msvariable is computed fromkernel_gemm(line 48) but is not included in the returned measurement dictionary. All other linear algebra benchmarks (602.axpy, 603.jacobi2d, 604.cholesky, 605.lu) includegpu_timein their measurements.Apply this diff to include GPU timing:
return { # "result": result[0], "measurement": { "generating_time": matrix_generating_time, "compute_time": matmul_time, + "gpu_time": gpu_ms, }, }Also, consider prefixing
C_outwith an underscore if it's intentionally unused:- C_out, gpu_ms = kernel_gemm(alpha, beta, C, A, B, reps=1) + _C_out, gpu_ms = kernel_gemm(alpha, beta, C, A, B, reps=1)benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/function.py-1-8 (1)
1-8: Resolve the license TODO before mergingYou’re explicitly copying/deriving from Barba & Forsyth’s CFD Python work and note CC‑BY 4.0 and BSD‑3 in the header, but there’s a
# TODO: Licensecomment.Before merging, ensure:
- The appropriate BSD‑3 license text is present in the repo (e.g., LICENSE file plus per‑file header, as required).
- Attribution requirements from CC‑BY are satisfied.
Leaving this as a TODO is a compliance risk.
benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/function.py-39-45 (1)
39-45: Batch norm is usingstdinstead ofvarin the denominatorCanonical BN uses
(x - mean) / sqrt(var + eps)wherevaris the variance ofx. Here you computestd = jnp.std(...)and then divide bysqrt(std + eps), i.e.sqrt(sigma + eps)rather thansqrt(sigma² + eps), which changes the scaling, especially for large variances.If you intend standard BN semantics:
-@jax.jit -def batchnorm2d(x, eps=1e-5): - mean = jnp.mean(x, axis=0, keepdims=True) - std = jnp.std(x, axis=0, keepdims=True) - return (x - mean) / jnp.sqrt(std + eps) +@jax.jit +def batchnorm2d(x, eps=1e-5): + mean = jnp.mean(x, axis=0, keepdims=True) + var = jnp.var(x, axis=0, keepdims=True) + return (x - mean) / jnp.sqrt(var + eps)If this is an intentional simplification for benchmarking, a short comment would help avoid future “fixes.”
🟡 Minor comments (3)
benchmarks/000.microbenchmarks/030.clock-synchronization/python/function.py-42-51 (1)
42-51: Verify counter name semantics and logging overheadThe latency loop is functional, but two valid concerns merit addressing:
Counter semantics mismatch:
measurements_not_smallersuggests ≥ behavior, but the condition uses>(line 48). Whencur_time == cur_min, the counter resets. If strictly-greater is intentional, the name should reflect that; if "not smaller" (≥) is the goal, change the condition to>=.Per-iteration logging: The print statement (lines 45–47) runs inside the while loop and can emit up to ~1000 lines per invocation. Consider guarding it behind a debug flag or environment variable to prevent timing skew in benchmarking.
benchmarks/400.inference/413.recommendation/python/package.sh-3-4 (1)
3-4: Address unused variable and quote path arguments.The script has two issues:
PACKAGE_DIRis assigned but never used (line 4 uses$1directly).- Unquoted
$1will fail on paths with spaces.Apply this diff:
-PACKAGE_DIR=$1 -echo "DLRM GPU package size $(du -sh $1 | cut -f1)" +PACKAGE_DIR="$1" +echo "DLRM GPU package size $(du -sh "$PACKAGE_DIR" | cut -f1)"benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/function.py-82-123 (1)
82-123: Acknowledge broad exception handling is safer than narrowing to undocumentedTypeErrorThe web search confirms that
jax.device_getdoes not document a specific set of exception types—it only describes operation and inputs. Narrowing toTypeErrorbased on implementation details is risky. The current broadExceptioncatch, while hiding potential bugs, is actually more robust than the suggested fix.Instead of narrowing the exception, add logging/metrics to understand when this occurs without losing visibility:
try: results = jax.device_get(results) - except Exception: - pass + except Exception as e: + # Not a JAX DeviceArray or already on host; leave as-is. + # Consider logging e or emitting a metric here for observability. + pass
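If visibility is the goal here, a minimal sketch of that logging-based variant could look like the following (the logger setup and the stand-in results dict are illustrative, not taken from the benchmark):

import logging

import jax
import jax.numpy as jnp

logger = logging.getLogger(__name__)

results = {"u": jnp.zeros((4, 4))}  # stand-in for the benchmark's result pytree

try:
    results = jax.device_get(results)
except Exception:
    # Keep the device arrays if the transfer fails, but record why so the
    # failure shows up in benchmark logs instead of being silently swallowed.
    logger.exception("jax.device_get failed; returning device arrays as-is")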
🧹 Nitpick comments (39)
benchmarks/200.multimedia/210.thumbnailer/input.py (1)
20-22: Consider removing unused function parameters.Function arguments
sizeandnosql_funcare not used within the function body. If these parameters are no longer needed, removing them would clean up the API surface and reduce confusion. If they are retained for backward compatibility, consider adding a comment to clarify intent.benchmarks/100.webapps/120.uploader/input.py (1)
19-20: Unused function parameters.The parameters
data_dir,input_buckets,upload_func, andnosql_funcare unused in the function body. While this may be acceptable (the benchmark may not require file uploads), consider whether these parameters should be removed to simplify the interface, or if they should be retained for API consistency with other benchmark modules.benchmarks/200.multimedia/220.video-processing/input.py (2)
27-27: TODO: Multiple datasets support is not implemented.The TODO comment indicates that multiple dataset support is planned but not yet implemented. This aligns with the loop logic issue identified above.
Do you want me to help design and implement support for multiple datasets, or open a new issue to track this enhancement?
21-23: Unused function parameters.The
sizeandnosql_funcparameters are not used in the function body. If these are required by the interface contract for consistency with other benchmark input generators, consider adding a comment explaining why they're unused. Otherwise, they could be removed.benchmarks/000.microbenchmarks/030.clock-synchronization/python/function.py (2)
11-16: Event field extraction looks correct; consider hardening bucket accessThe extracted fields match the expected AWS wrapper contract and are straightforward. The nested
event.get("bucket").get("bucket")/.get("output")will raise if"bucket"is missing orNone, though, so if this benchmark might be invoked without a bucket configuration, consider something more defensive like:bucket_cfg = event.get("bucket") or {} output_bucket = bucket_cfg.get("bucket") output_prefix = bucket_cfg.get("output", "")or enforce presence early with a clear error.
63-65: Static/tmp/data.csvpath: consider a safer temp‑file patternRuff’s S108 warning about
/tmp/data.csvis valid in a general Unix context: using a fixed filename in/tmpcan open you up to symlink/race issues and, in non‑serverless environments, cross‑user interference.In typical AWS Lambda‑style deployments the risk is lower (per‑container
/tmp, single active invocation), but if this code might ever be reused elsewhere, it’s safer to use a unique temp file:import tempfile with tempfile.NamedTemporaryFile(mode="w", newline="", suffix=".csv", delete=False) as csvfile: tmp_path = csvfile.name writer = csv.writer(csvfile, delimiter=",") writer.writerow(["id", "client_send", "client_rcv"]) for row in times: writer.writerow(row) key = client.upload( output_bucket, os.path.join(output_prefix, filename), tmp_path, )This also avoids accidental reuse of stale data if anything goes wrong between writes and uploads.
Also applies to: 70-71
benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/requirements.txt (1)
1-1: Consider pinning the JAX version for reproducibility.The dependency is unpinned, which may lead to non-reproducible benchmark results across different environments as JAX updates. Consider pinning to a specific version if reproducibility is important for this benchmark suite. Note that all three JAX benchmarks in the project currently follow the same unpinned pattern, reflecting the project's mixed approach to dependency management across benchmarks.
benchmarks/400.inference/412.language-bert/python/package.sh (1)
1-1: Add shebang for portability.The script lacks a shebang line, which can lead to unexpected behavior depending on how it's invoked.
Apply this diff:
+#!/bin/bash # Stripping package code is based on https://github.com/ryfeus/lambda-packs repobenchmarks/400.inference/413.image-classification/python/package.sh (1)
1-1: Add shebang for Bash-specific syntax.The script uses
[[which is Bash-specific, but lacks a shebang line.Apply this diff:
+#!/bin/bash # Stripping package code is based on https://github.com/ryfeus/lambda-packs repobenchmarks/200.multimedia/220.video-processing/python/function.py (3)
17-23: Consider using iterable unpacking instead of list concatenation.The list concatenation on line 18 can be simplified using iterable unpacking for better readability.
Apply this diff:
- ret = subprocess.run( - [os.path.join(SCRIPT_DIR, "ffmpeg", "ffmpeg"), "-y"] + args, + ret = subprocess.run( + [os.path.join(SCRIPT_DIR, "ffmpeg", "ffmpeg"), "-y", *args], # subprocess might inherit Lambda's input for some reason stdin=subprocess.DEVNULL, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, )
31-46: Unused parameter can be removed or prefixed.The
eventparameter is not used in this function. If it's required by the operations interface, consider prefixing it with an underscore (_event) to indicate it's intentionally unused.Apply this diff if the parameter is not needed:
-def to_gif(video, duration, event): +def to_gif(video, duration, _event): output = "/tmp/processed-{}.gif".format(os.path.basename(video)) call_ffmpeg( [ "-i", video, "-t", "{0}".format(duration), "-vf", "fps=10,scale=320:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse", "-loop", "0", output, ] ) return output
50-66: Unused parameter can be removed or prefixed.The
eventparameter is not used in this function. If it's required by the operations interface, consider prefixing it with an underscore (_event) to indicate it's intentionally unused.Apply this diff if the parameter is not needed:
-def watermark(video, duration, event): +def watermark(video, duration, _event): output = "/tmp/processed-{}".format(os.path.basename(video)) watermark_file = os.path.dirname(os.path.realpath(__file__)) call_ffmpeg( [ "-i", video, "-i", os.path.join(watermark_file, os.path.join("resources", "watermark.png")), "-t", "{0}".format(duration), "-filter_complex", "overlay=main_w/2-overlay_w/2:main_h/2-overlay_h/2", output, ] ) return outputbenchmarks/500.scientific/5xx.channel_flow_jax_npbench/input.py (1)
1-17: generate_input contract and size presets look consistent; address Ruff ARG001 via config, not signature changesThe
size_generatorsmapping andgenerate_inputimplementation match the pattern used in other JAX NPBench inputs (e.g.,5xx.compute_jax_npbench/input.py), so behavior is consistent. Ruff correctly notes thatdata_dir,benchmarks_bucket,input_paths,output_paths,upload_func, andnosql_funcare unused, but these parameters are part of the commongenerate_inputinterface used across benchmarks. If this file is linted, consider suppressingARG001at the function or file level (e.g.,# ruff: noqa: ARG001) rather than altering the signature.benchmarks/500.scientific/503.graph-bfs/input.py (1)
1-7: Deterministic size/seed configuration is correct; handle unused-arg linting consistently
size_generatorsplus a constant"seed": 42matches the pattern in the pagerank and MST inputs, so graph generation for BFS will be deterministic and comparable across the graph benchmarks. As elsewhere, thegenerate_inputparameters are unused but required by the shared interface; if Ruff runs on benchmarks, it’s better to ignoreARG001for this function/file (or this directory) than to change the signature and risk diverging from the orchestrator’s expectations.benchmarks/600.linearalgebra/602.axpy/input.py (1)
1-7: Input generator logic is fine; consider minimal handling of unused-arg lint.The
size_generatorsmapping andgenerate_inputreturn shape are consistent with the other linear-algebra benchmarks and look correct. The unused parameters ingenerate_inputare only there to satisfy the common benchmark interface, but they trigger RuffARG001warnings.If Ruff is enforced in CI, you could cheaply acknowledge these without changing the public API, for example:
def generate_input( data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func ): - return {"size": size_generators[size], "seed": 42} + # Parameters are part of the common benchmark interface but not used here. + _ = (data_dir, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func) + return {"size": size_generators[size], "seed": 42}benchmarks/500.scientific/5xx.compute_jax_npbench/input.py (1)
1-17: Consistent JAX size mapping; same note about unused-arg warnings.The
size_generatorsstructure andgenerate_inputbehavior are aligned with the other npbench-style inputs and look good.As with the other generators, Ruff flags the unused parameters in
generate_input. If lint is enforced, you can keep the interface and quiet the warnings:def generate_input( @@ -): - return {"size": size_generators[size]} +): + # Unused parameters kept to match the common benchmark interface. + _ = (data_dir, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func) + return {"size": size_generators[size]}benchmarks/600.linearalgebra/604.cholesky/input.py (1)
1-7: Matches the linear-algebra input pattern; optional lint-silencing for unused args.This mirrors the other 600.linearalgebra input modules and is functionally fine.
If Ruff
ARG001is a concern, you can acknowledge the interface-only parameters:def generate_input( data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func ): - return {"size": size_generators[size], "seed": 42} + # Unused, required by the shared benchmark interface. + _ = (data_dir, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func) + return {"size": size_generators[size], "seed": 42}benchmarks/400.inference/411.image-recognition/input.py (1)
8-15: Minor polish: unused loop variable, unused args, and a small edge-case ingenerate_input.
- Line 10:
dirsis never used; to satisfy Ruff B007 you can rename it to_dirs(or_) to make intent explicit:for root, _dirs, files in os.walk(data_dir):- Lines 29–31:
size,output_paths, andnosql_funcare unused but required by the shared interface. If Ruff is enforced, consider the same_ = (...)pattern used in other generators to quietARG001without changing behavior.- Lines 37–48:
input_images[0][0]assumes thatval_map.txtis non-empty. If there’s any chance of an empty file in some runs, a simple guard would prevent an IndexError, e.g.:if not input_images: raise ValueError("val_map.txt did not contain any images")Also applies to: 29-51
benchmarks/600.linearalgebra/603.jacobi2d/input.py (1)
1-7: Linear-algebra input looks good; same optional unused-arg handling as siblings.This module is consistent with the other 600.linearalgebra inputs (
601.matmul,602.axpy,604.cholesky) and the return structure is fine.If you decide to quiet Ruff
ARG001in one of these files, consider applying the same_ = (...)pattern here for the unused parameters to keep the suite consistent.benchmarks/400.inference/413.image-classification/python/function.py (2)
17-21: Inference preprocessing and label mapping look correct; consider minor robustness tweaks.Overall, this follows a standard ResNet-50 ImageNet pipeline and looks sound:
imagenet_class_index.jsonis loaded once andidx2labelis built as expected._prepare_tensordoes resize→center-crop→normalize→NCHW with ImageNet mean/std; that’s correct for typical ResNet-50.A few low-priority cleanups you might consider:
- Line 18: wrap the JSON load in a context manager so the file handle is explicitly closed:
with open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r") as f: class_idx = json.load(f)- Lines 92, 95, 102, 103: the outer
int()aroundround(...)is redundant (Ruff RUF046) and can be dropped without changing behavior._prepare_tensor: if you expect occasional bad image paths, you might want to let theImage.openerror bubble up with a clearer message or catch/log it, but for a controlled benchmark dataset the current behavior is acceptable.Also applies to: 30-31, 109-117
34-85: Secure tar extraction and temp handling — well-established patterns exist; consider applying them.Yes, well-established patterns do exist for this. Tar path-traversal vulnerabilities are real (documented in CVEs) and Python's tarfile module has known risks with
extractall()on untrusted archives.Recommended hardening approach:
Replace hardcoded
/tmpwithtempfile.TemporaryDirectory()(context manager for automatic cleanup) ortempfile.mkstemp()for per-file temps. This avoids TOCTOU issues and makes intent clearer.Validate tar members before extraction instead of calling
extractall()directly:
- Compute resolved target path:
resolved = (Path(extract_dir) / member.name).resolve()- Verify it stays within
extract_dirusingPath.is_relative_to()or prefix comparison- Reject absolute paths,
..segments, symlinks (member.issym()), device nodes (member.isdev()), and other special members- Use
tar.extractfile()and write validated content manually rather thanextractall()Example pattern (lines 46–60):
with tempfile.TemporaryDirectory() as temp_dir: archive_path = os.path.join(temp_dir, archive_basename) # ... download to archive_path ... with tarfile.open(archive_path, "r:gz") as tar: for member in tar.getmembers(): resolved = (Path(MODEL_DIRECTORY) / member.name).resolve() if not resolved.is_relative_to(Path(MODEL_DIRECTORY).resolve()): raise ValueError(f"Unsafe member path: {member.name}") if member.issym() or member.isdev() or member.islnk(): raise ValueError(f"Unsafe member type: {member.name}") tar.extract(member, MODEL_DIRECTORY) # or extractfile + manual writeFor a controlled benchmark, the current code is functionally safe, but applying these patterns would make it safer to reuse and demonstrates defense-in-depth.
benchmarks/600.linearalgebra/603.jacobi2d/python/function.py (2)
44-50: Remove unused seed handling code.The seed handling logic extracts a
seedvariable (lines 44-50), but it's never used. Theinitialize_torchfunction at line 53 doesn't accept a seed parameter, and the seed isn't referenced elsewhere in the handler. This is dead code.Apply this diff to remove the unused code:
def handler(event): size = event.get("size") - if "seed" in event: - import random - - random.seed(event["seed"]) - - seed = event.get("seed", 42) - seed = int(seed) matrix_generating_begin = datetime.datetime.now()
57-57: Prefix unused variables with underscore.The variables
A_outandB_outare unpacked but never used. Apply the underscore prefix to indicate they are intentionally unused.Apply this diff:
- A_out, B_out, gpu_ms = kernel_jacobi2d(A, B, iters=50) + _, _, gpu_ms = kernel_jacobi2d(A, B, iters=50)benchmarks/600.linearalgebra/601.matmul/python/function.py (1)
35-41: Remove unused seed handling code.The seed handling logic (lines 35-41) extracts a
seedvariable, but it's never used. Theinitialize_torchfunction doesn't accept a seed parameter.Apply this diff to remove the unused code:
def handler(event): size = event.get("size") - if "seed" in event: - import random - - random.seed(event["seed"]) - - seed = event.get("seed", 42) - seed = int(seed) matrix_generating_begin = datetime.datetime.now()benchmarks/600.linearalgebra/605.lu/python/function.py (2)
53-59: Remove unused seed handling code.The seed handling logic (lines 53-59) extracts a
seedvariable, but it's never used sinceinitialize_torchdoesn't accept a seed parameter.Apply this diff to remove the unused code:
def handler(event): size = event.get("size") - if "seed" in event: - import random - - random.seed(event["seed"]) - - seed = event.get("seed", 42) - seed = int(seed) gen_begin = datetime.datetime.now()
66-66: Prefix unused variable with underscore.The variable
Bis unpacked but never used. Apply the underscore prefix to indicate it's intentionally unused.Apply this diff:
- B, gpu_ms = kernel(A) + _, gpu_ms = kernel(A)benchmarks/600.linearalgebra/604.cholesky/python/function.py (2)
36-42: Remove unused seed handling code.The seed handling logic (lines 36-42) extracts a
seedvariable, but it's never used sinceinitialize_torchdoesn't accept a seed parameter.Apply this diff to remove the unused code:
def handler(event): size = event.get("size") - if "seed" in event: - import random - - random.seed(event["seed"]) - - seed = event.get("seed", 42) - seed = int(seed) gen_begin = datetime.datetime.now()
49-49: Prefix unused variable with underscore.The variable
Lis unpacked but never used. Apply the underscore prefix to indicate it's intentionally unused.Apply this diff:
- L, gpu_ms = kernel_cholesky(A) + _, gpu_ms = kernel_cholesky(A)benchmarks/600.linearalgebra/606.spmv/python/function.py (1)
58-60: Clarify that the SpMV output is intentionally unused
y_outis never used afterkernel_spmvreturns, which triggers the Ruff warning and may confuse readers.You can explicitly mark the result as unused:
- y_out, gpu_ms = kernel_spmv(A, x, reps=100) + _, gpu_ms = kernel_spmv(A, x, reps=100)This keeps behavior identical while silencing the unused-variable warning.
benchmarks/000.microbenchmarks/010.sleep/input.py (1)
1-11: Input generator matches project conventions; unused args are expected
size_generatorsandgenerate_inputfollow the common input.py pattern (fixed sizes, onlysizeused; other parameters kept for a uniform interface). Unused-argument warnings here can be safely ignored since the benchmark framework supplies these arguments but this benchmark only needssize. Based on learnings, inputs are maintainer-controlled, so additional validation isn’t required.benchmarks/400.inference/413.image-classification/input.py (1)
18-26: Update the outdated docstring to match image‑classification behaviorThe docstring still mentions “compression test”, which no longer matches this benchmark’s purpose and can mislead future readers.
Consider updating it along these lines:
-""" - Generate test, small and large workload for compression test. +""" + Generate test, small and large workloads for the image classification benchmark.The rest of the logic (model/image uploads and input_config layout) looks consistent with other inference benchmarks.
benchmarks/400.inference/413.recommendation/python/function.py (1)
54-63: UseMODEL_CACHEconsistently for temporary model files
MODEL_CACHEis created but the downloaded checkpoint is written directly under"/tmp"instead of that directory. Using the cache directory fortmp_pathmakes the intent clearer and keeps related files grouped.- os.makedirs(MODEL_CACHE, exist_ok=True) - tmp_path = os.path.join("/tmp", f"{uuid.uuid4()}-{MODEL_FILE}") + os.makedirs(MODEL_CACHE, exist_ok=True) + tmp_path = os.path.join(MODEL_CACHE, f"{uuid.uuid4()}-{MODEL_FILE}")Behavior (download, load, remove) stays the same, but the cache directory is now actually used.
benchmarks/400.inference/412.language-bert/python/function.py (4)
31-45: Harden model archive handling and temp file usageCurrent model download/extraction uses hard-coded
/tmppaths andtarfile.extractalldirectly on an archive that comes from external storage. That opens you up to classic temp-file and tar path‑traversal issues (e.g., a malicious archive writing outsideMODEL_DIRECTORY).Consider:
- Using
tempfile.NamedTemporaryFile/TemporaryDirectoryinstead of manual/tmppaths.- Validating archive members before extraction and refusing entries that would escape
MODEL_DIRECTORY.For example:
-import tarfile +import tarfile +import tempfile ... - archive_path = os.path.join("/tmp", f"{uuid.uuid4()}-{MODEL_ARCHIVE}") + fd, archive_path = tempfile.mkstemp(suffix=f"-{MODEL_ARCHIVE}") + os.close(fd) ... - with tarfile.open(archive_path, "r:gz") as tar: - tar.extractall(MODEL_DIRECTORY) + with tarfile.open(archive_path, "r:gz") as tar: + base = os.path.realpath(MODEL_DIRECTORY) + for member in tar.getmembers(): + dest = os.path.realpath(os.path.join(MODEL_DIRECTORY, member.name)) + if not dest.startswith(base + os.sep): + raise RuntimeError(f"Refusing to extract outside {MODEL_DIRECTORY}: {member.name}") + tar.extractall(MODEL_DIRECTORY)
72-79: Clarify model timing semantics
model_download_timeandmodel_process_timeare measured in microseconds and can legitimately be zero when the model is already initialized. That’s fine, but for consistency with other benchmarks (which generally report milliseconds), consider normalizing units or documenting that these are µs so downstream tooling doesn’t misinterpret values.
107-118: Validate required event fields before composing storage paths
bucket,model_prefix,text_prefix, andtext_keyare assumed present; if any are missing/None,os.path.joinorclient.downloadwill raise confusingTypeErrors.Consider an explicit validation up front:
def handler(event): - bucket = event.get("bucket", {}).get("bucket") - model_prefix = event.get("bucket", {}).get("model") - text_prefix = event.get("bucket", {}).get("text") - text_key = event.get("object", {}).get("input") + bucket_info = event.get("bucket", {}) + object_info = event.get("object", {}) + bucket = bucket_info.get("bucket") + model_prefix = bucket_info.get("model") + text_prefix = bucket_info.get("text") + text_key = object_info.get("input") + + if not all([bucket, model_prefix, text_prefix, text_key]): + raise ValueError(f"Missing required bucket/text fields in event: {event}")This keeps failures deterministic and easier to debug when wiring new benchmarks.
133-144: Usezip(..., strict=True)to catch length mismatches earlyStatic analysis is right that an explicit
stricthelps detect subtle bugs ifsentencesandprobabilitiesever diverge in length.- for sentence, probs in zip(sentences, probabilities): + for sentence, probs in zip(sentences, probabilities, strict=True):That will raise immediately instead of silently dropping or duplicating elements.
benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/function.py (2)
13-36: Convolution logic is correct but could uselax.conv_general_dilatedfor performanceThe hand‑rolled
conv2dusing nestedlax.scananddynamic_slice/dynamic_update_sliceis functionally correct for stride‑1 valid convolution, but it will be significantly slower than usinglax.conv_general_dilatedon GPUs/TPUs.If you care about maximizing throughput for this benchmark, consider switching to:
def conv2d(x, w): return lax.conv_general_dilated( x, w, window_strides=(1, 1), padding="VALID", dimension_numbers=("NHWC", "HWIO", "NHWC"), )
68-79: Prefer NumPy/JAX dtypes explicitly in RNG initializationYou’re passing
dtype=jnp.float32intonumpy.random.default_rng().random. That’s usually convertible, but being explicit avoids surprises:- input = rng.random((N, H, W, C1), dtype=jnp.float32) - conv1 = rng.random((1, 1, C1, C2), dtype=jnp.float32) - conv2 = rng.random((3, 3, C2, C2), dtype=jnp.float32) - conv3 = rng.random((1, 1, C2, C1), dtype=jnp.float32) + import numpy as np + input = rng.random((N, H, W, C1), dtype=np.float32) + conv1 = rng.random((1, 1, C1, C2), dtype=np.float32) + conv2 = rng.random((3, 3, C2, C2), dtype=np.float32) + conv3 = rng.random((1, 1, C2, C1), dtype=np.float32)JAX will still happily accept these arrays and move them to the device on first use.
benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/function.py (1)
237-279: Catch specific transfer exceptions instead of broadexcept ExceptionThe core concern is valid: swallowing all exceptions hides real issues. However, based on
jax.device_getbehavior,TypeErroris not the primary exception you'd encounter. Transfer guard violations raiseRuntimeError, and backend/XLA failures surface as runtime errors likeXlaRuntimeError. Catching onlyTypeErrorwould miss these cases. Consider:- try: - results = jax.device_get(results) - except Exception: - pass + try: + results = jax.device_get(results) + except (RuntimeError, TypeError): + # Transfer failed or __array__ not available; keep device array. + passOr remove the catch entirely if device/host transfer failure should propagate.
| - name: Install Python dependencies (via install.py) | ||
| run: | | ||
| python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local |
Activate virtualenv before installing dependencies.
The python3 install.py command runs without activating the virtualenv first. This causes dependencies to be installed in the system Python instead of the cached virtualenv, making the cache ineffective and causing subsequent steps to fail if they expect tools in the venv.
Activate the virtualenv before running install.py:
- name: Install Python dependencies (via install.py)
run: |
+ . python-venv/bin/activate
python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-localAlternatively, use the Python binary from the venv directly:
- name: Install Python dependencies (via install.py)
run: |
- python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local
+ python-venv/bin/python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Install Python dependencies (via install.py) | |
| run: | | |
| python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local | |
| - name: Install Python dependencies (via install.py) | |
| run: | | |
| . python-venv/bin/activate | |
| python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local |
| - name: Install Python dependencies (via install.py) | |
| run: | | |
| python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local | |
| - name: Install Python dependencies (via install.py) | |
| run: | | |
| python-venv/bin/python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local |
🤖 Prompt for AI Agents
In .github/workflows/lint.yml around lines 35-37, the workflow runs python3
install.py without activating the cached virtualenv so packages are installed
into the system Python; before running install.py either activate the virtualenv
created/checked-out by the job (e.g. source the venv activation script for
Linux/macOS or the Scripts\Activate script on Windows) and then run install.py,
or invoke the Python binary from the virtualenv directly (e.g. use the venv's
python executable to run install.py) so dependencies are installed into and
cached with the venv.
| data_dir, size, benchmarks_bucket, input_buckets, output_buckets, upload_func, nosql_func | ||
| ): |
🧩 Analysis chain
Verify: Breaking function signature change across call sites.
The function signature changed parameter names from input_paths, output_paths to input_buckets, output_buckets. This breaks any code calling this function with keyword arguments that use the old names, and it silently diverges from what the shared interface declares even for positional calls. Existing call sites throughout the codebase must be updated to match.
Additionally, this signature diverges from the standard pattern used in other benchmarks (e.g., 110.dynamic-html, 030.clock-synchronization, 411.image-recognition), which use input_paths and output_paths. Confirm this is intentional and not an incomplete refactor.
🏁 Script executed:
#!/bin/bash
# Description: Find and verify all call sites to generate_input in the uploader benchmark module
# Search for calls to generate_input that might still use old parameter names
rg -n 'generate_input' --type py -- benchmarks/100.webapps/120.uploader/ -A 3 -B 3
# Search for imports or references to this module's generate_input
rg -n 'from.*120.uploader.*import\|import.*120.uploader' --type pyLength of output: 237
🏁 Script executed:
#!/bin/bash
# Corrected: Find all call sites to generate_input with proper rg syntax
# Search for all calls to generate_input in the entire codebase
rg -n 'generate_input\s*\(' --type py -A 5 -B 2
# Search for imports from 120.uploader module
rg -n 'from.*120\.uploader|import.*120\.uploader' --type pyLength of output: 18273
🏁 Script executed:
#!/bin/bash
# Check if 130.crud-api is part of the same change
git log -1 --oneline -- benchmarks/100.webapps/130.crud-api/input.py
# Check whether abstract method signature matches the two changed implementations
grep -A 7 "def generate_input" sebs/benchmark.py | head -20Length of output: 335
Critical: Abstract method signature mismatch with implementation.
The generate_input function signature in benchmarks/100.webapps/120.uploader/input.py:18-20 uses parameter names input_buckets, output_buckets, but the abstract method definition at sebs/benchmark.py:804-809 still declares input_paths, output_paths. This same inconsistency also exists in benchmarks/100.webapps/130.crud-api/input.py:8-10, while ~30 other benchmark implementations use the standard input_paths, output_paths signature.
Required fix: Update the abstract method signature in sebs/benchmark.py:804-809 to use input_buckets, output_buckets to match these implementations, or revert both 120.uploader and 130.crud-api to use the standard parameter names. The interface contract must be consistent across all implementations.
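For illustration only, a consistent contract could be sketched like this if the repository standardizes on the existing input_paths/output_paths names (the type hints and body here are assumptions, not the actual sebs/benchmark.py declaration):

from typing import Callable, Dict, List


def generate_input(
    data_dir: str,
    size: str,
    benchmarks_bucket: str,
    input_paths: List[str],
    output_paths: List[str],
    upload_func: Callable[[int, str, str], None],
    nosql_func: Callable,
) -> Dict:
    # Whichever naming is chosen, the abstract declaration and every benchmark
    # implementation should spell these parameters identically, so both
    # positional and keyword invocations stay valid across the suite.
    return {"size": size}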
🧰 Tools
🪛 Ruff (0.14.5)
19-19: Unused function argument: data_dir
(ARG001)
19-19: Unused function argument: input_buckets
(ARG001)
19-19: Unused function argument: upload_func
(ARG001)
19-19: Unused function argument: nosql_func
(ARG001)
| class_idx = json.load(open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r")) | ||
| idx2label = [class_idx[str(k)][1] for k in range(len(class_idx))] | ||
| model = None |
Fix Tensor‑index bug when mapping prediction to label (and tighten JSON loading)
torch.max returns a Tensor index; using it directly to index a Python list (idx2label[index]) will raise a TypeError. Convert the index to a plain integer first. While here, it’s also cleaner to load imagenet_class_index.json via a context manager.
-SCRIPT_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__)))
-class_idx = json.load(open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r"))
+SCRIPT_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__)))
+with open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r") as f:
+ class_idx = json.load(f)
@@
- input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model
- output = model(input_batch)
- _, index = torch.max(output, 1)
- ret = idx2label[index]
+ input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model
+ output = model(input_batch)
+ _, index = torch.max(output, 1)
+ idx = index.item()
+ ret = idx2label[idx]This removes the runtime error and keeps behavior otherwise unchanged.
Also applies to: 78-79
🤖 Prompt for AI Agents
In benchmarks/400.inference/411.image-recognition/python/function.py around
lines 29-31 (and similarly lines 78-79) the code loads imagenet_class_index.json
without a context manager and later uses a torch tensor index directly to index
a Python list; open the JSON with a with-statement (using open(..., "r") as f
and json.load(f)) and when mapping a prediction to a label convert the tensor
index to an int (e.g., index = int(index_tensor) or index_tensor.item()) before
using idx2label[index]; keep all other logic unchanged.
| torch==2.5.1 | ||
| torchvision==0.20.1 |
🧩 Analysis chain
Verify torch 2.5.1 is the latest secure version.
torch 2.5.1 is newer than the vulnerable 2.4.1 used elsewhere in this PR, which is good. However, please verify:
- If 2.5.1 is the latest version or if a newer patch release is available
- Whether 2.5.1 addresses all the security vulnerabilities found in 2.4.1
- If torchvision 0.20.1 is compatible and secure
Run the following script to verify:
🏁 Script executed:
#!/bin/bash
# Check latest torch and torchvision versions
echo "=== Latest torch version ==="
curl -s https://pypi.org/pypi/torch/json | jq '.info.version'
echo "=== Latest torchvision version ==="
curl -s https://pypi.org/pypi/torchvision/json | jq '.info.version'
echo "=== Security advisories for torch 2.5.x ==="
gh api graphql -f query='
{
securityVulnerabilities(first: 5, ecosystem: PIP, package: "torch") {
nodes {
advisory {
summary
severity
publishedAt
}
vulnerableVersionRange
firstPatchedVersion {
identifier
}
}
}
}' | jq '.data.securityVulnerabilities.nodes[] | select(.vulnerableVersionRange | contains("2.5"))'Length of output: 124
I need to gather additional information to complete the verification. Let me check security advisories for torch 2.4.1 vs 2.5.1 and version compatibility.
Upgrade torch and torchvision to versions that include the critical security patch.
torch 2.5.1 contains a critical Remote Code Execution vulnerability (CVE-2025-32434) in the torch.load() function when used with weights_only=True. Update to PyTorch 2.6.0 or higher, and correspondingly upgrade torchvision to 0.21.0 or later. torch 2.5.1 is compatible with torchvision 0.20.1, so current compatibility is not the issue, but the security vulnerability is critical and requires immediate patching.
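For instance, the pins in this file could read as follows (the numpy and pillow lines are unchanged; the exact torch/torchvision floor is a maintainer choice, as long as it sits at or above the versions carrying the CVE-2025-32434 fix):

numpy>=2.0
pillow>=10.0
torch>=2.6.0
torchvision>=0.21.0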
| @@ -0,0 +1,4 @@ | |||
| numpy>=2.0 | |||
| pillow>=10.0 | |||
| torch==2.5.1 | |||
🧩 Analysis chain
Verify torch 2.5.1 is the latest secure version and standardize across files.
This file uses torch==2.5.1, which differs from torch==2.4.1 used in other requirements files in this PR. Ensure that 2.5.1 is the latest patched version and update all other requirements files to match this version for consistency.
🌐 Web query:
What is the latest stable version of PyTorch and does version 2.5.1 have any known security vulnerabilities?
💡 Result:
-
Latest stable PyTorch (as of Nov 23, 2025): PyTorch 2.9 (released Oct 2025). [1]
-
Does PyTorch 2.5.1 have known security vulnerabilities? Yes — PyTorch ≤ 2.5.1 is affected by a critical RCE vulnerability (CVE-2025-32434) in torch.load(...) when using weights_only=True; the issue was fixed in PyTorch 2.6.0. Update to ≥2.6.0 (or the current stable release) to mitigate. [2][3]
Sources:
- PyTorch 2.9 release announcement. [PyTorch blog]. [1]
- NVD / CVE-2025-32434 details (lists affected versions and fix in 2.6.0). [2]
- Additional vendor advisories summarizing the issue (e.g., INCIBE/IBM). [3]
Update torch to 2.6.0 or later immediately—torch 2.5.1 contains a critical RCE vulnerability.
PyTorch 2.5.1 is affected by a critical RCE vulnerability (CVE-2025-32434) in torch.load() when using weights_only=True, which was fixed in PyTorch 2.6.0. The latest stable version is PyTorch 2.9 (released Oct 2025). Update this file and all other requirements files using torch 2.4.1 to a secure version (≥2.6.0, preferably 2.9).
🤖 Prompt for AI Agents
In
benchmarks/400.inference/413.image-classification/python/requirements.txt.3.12
around line 3, the file pins torch to 2.5.1 which is vulnerable
(CVE-2025-32434); update the pin to a secure release (at minimum >=2.6.0,
preferably the latest stable such as 2.9.x) across this file and any other
requirements that reference torch 2.4.x/2.5.x, run dependency resolution and
tests to ensure compatibility, and if needed adjust any torch-specific code or
extra pins (CUDA builds, torchvision) to match the chosen secure torch version.
| @partial(jax.jit, static_argnums=(0,)) | ||
| def pressure_poisson_periodic(nit, p, dx, dy, b): | ||
| def body_func(p, q): | ||
| pn = p.copy() | ||
| p = p.at[1:-1, 1:-1].set( | ||
| ((pn[1:-1, 2:] + pn[1:-1, 0:-2]) * dy**2 + (pn[2:, 1:-1] + pn[0:-2, 1:-1]) * dx**2) | ||
| / (2 * (dx**2 + dy**2)) | ||
| - dx**2 * dy**2 / (2 * (dx**2 + dy**2)) * b[1:-1, 1:-1] | ||
| ) | ||
|
|
||
| # Periodic BC Pressure @ x = 2 | ||
| p = p.at[1:-1, -1].set( | ||
| ((pn[1:-1, 0] + pn[1:-1, -2]) * dy**2 + (pn[2:, -1] + pn[0:-2, -1]) * dx**2) | ||
| / (2 * (dx**2 + dy**2)) | ||
| - dx**2 * dy**2 / (2 * (dx**2 + dy**2)) * b[1:-1, -1] | ||
| ) | ||
|
|
||
| # Periodic BC Pressure @ x = 0 | ||
| p = p.at[1:-1, 0].set( | ||
| ( | ||
| ((pn[1:-1, 1] + pn[1:-1, -1]) * dy**2 + (pn[2:, 0] + pn[0:-2, 0]) * dx**2) | ||
| / (2 * (dx**2 + dy**2)) | ||
| - dx**2 * dy**2 / (2 * (dx**2 + dy**2)) * b[1:-1, 0] | ||
| ) | ||
| ) | ||
|
|
||
| # Wall boundary conditions, pressure | ||
| p = p.at[-1, :].set(p[-2, :]) # dp/dy = 0 at y = 2 | ||
| p = p.at[0, :].set(p[1, :]) # dp/dy = 0 at y = 0 | ||
|
|
||
| return p, None | ||
|
|
||
| p, _ = lax.scan(body_func, p, jnp.arange(nit)) | ||
|
|
pressure_poisson_periodic never returns the updated pressure field
pressure_poisson_periodic runs a lax.scan over nit iterations and computes an updated p, but the function doesn’t return it. As written, it returns None, and channel_flow ignores the result:
b = build_up_b(...)
pressure_poisson_periodic(nit, p, dx, dy, b) # result unused
# p is not mutated in-place; JAX arrays are immutableSo p in channel_flow is never updated by the Poisson solve, which is a correctness bug.
You likely want:
@partial(jax.jit, static_argnums=(0,))
def pressure_poisson_periodic(nit, p, dx, dy, b):
- def body_func(p, q):
+ def body_func(p, _):
pn = p.copy()
...
- return p, None
+ return p, None
@@
- p, _ = lax.scan(body_func, p, jnp.arange(nit))
-
+ p, _ = lax.scan(body_func, p, jnp.arange(nit))
+ return pAnd in channel_flow:
- b = build_up_b(rho, dt, dx, dy, u, v)
- pressure_poisson_periodic(nit, p, dx, dy, b)
+ b = build_up_b(rho, dt, dx, dy, u, v)
+ p = pressure_poisson_periodic(nit, p, dx, dy, b)This makes the pressure updates actually feed into the momentum equations.
🧰 Tools
🪛 Ruff (0.14.5)
79-79: Unused function argument: q
(ARG001)
🤖 Prompt for AI Agents
In benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/function.py
around lines 77-110, pressure_poisson_periodic runs lax.scan to compute the
updated pressure but never returns it, so callers keep the old p; change the
function to return the final p from the scan (e.g. capture the output p and
return it), and update channel_flow to assign its p to the returned value (p =
pressure_poisson_periodic(nit, p, dx, dy, b)) so the updated pressure feeds into
the momentum solve.
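As a standalone illustration of the lax.scan point behind this fix (toy field and function names, not the benchmark's real kernel): the scan hands back the final carry, and because JAX arrays are immutable the caller only sees the update if that carry is returned and reassigned.

import jax
import jax.numpy as jnp
from jax import lax


@jax.jit
def relax(p, iters=10):
    def body(p, _):
        # One smoothing sweep over the interior; p.at[...] builds a new array.
        p = p.at[1:-1].set(0.5 * (p[:-2] + p[2:]))
        return p, None

    p, _ = lax.scan(body, p, jnp.arange(iters))
    return p  # omit this return and callers keep the original, unsmoothed p


p0 = jnp.sin(jnp.linspace(0.0, 3.0, 8))
p1 = relax(p0)  # the smoothed field only exists in the returned value; p0 is untouched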
| def handler(event): | ||
| size = event.get("size") | ||
| if "seed" in event: | ||
| import random | ||
|
|
||
| random.seed(event["seed"]) | ||
|
|
||
| seed = event.get("seed", 42) | ||
| seed = int(seed) | ||
|
|
||
| gen_begin = datetime.datetime.now() | ||
| alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed) | ||
| gen_end = datetime.datetime.now() | ||
|
|
||
| comp_begin = datetime.datetime.now() | ||
| y_out, gpu_ms = kernel_axpy(alpha, x, y, reps=100) | ||
| comp_end = datetime.datetime.now() | ||
|
|
||
| gen_us = (gen_end - gen_begin) / datetime.timedelta(microseconds=1) | ||
| comp_us = (comp_end - comp_begin) / datetime.timedelta(microseconds=1) | ||
|
|
||
| return { | ||
| "measurement": { | ||
| "generating_time": gen_us, | ||
| "compute_time": comp_us, | ||
| "gpu_time": gpu_ms, | ||
| } | ||
| } |
Fix undefined variable when seed is not provided.
Line 43 references seed, but this variable is only defined inside the if "seed" in event block (lines 34-40). If the event does not contain a "seed" key, seed will be undefined, causing a NameError at runtime.
Apply this diff to fix the issue:
def handler(event):
size = event.get("size")
+ seed = 42 # default seed
if "seed" in event:
import random
random.seed(event["seed"])
- seed = event.get("seed", 42)
- seed = int(seed)
+ seed = int(event["seed"])
gen_begin = datetime.datetime.now()
alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed)Note: The random.seed() call on line 37 appears redundant since PyTorch's RNG is seeded within initialize_torch.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def handler(event): | |
| size = event.get("size") | |
| if "seed" in event: | |
| import random | |
| random.seed(event["seed"]) | |
| seed = event.get("seed", 42) | |
| seed = int(seed) | |
| gen_begin = datetime.datetime.now() | |
| alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed) | |
| gen_end = datetime.datetime.now() | |
| comp_begin = datetime.datetime.now() | |
| y_out, gpu_ms = kernel_axpy(alpha, x, y, reps=100) | |
| comp_end = datetime.datetime.now() | |
| gen_us = (gen_end - gen_begin) / datetime.timedelta(microseconds=1) | |
| comp_us = (comp_end - comp_begin) / datetime.timedelta(microseconds=1) | |
| return { | |
| "measurement": { | |
| "generating_time": gen_us, | |
| "compute_time": comp_us, | |
| "gpu_time": gpu_ms, | |
| } | |
| } | |
| def handler(event): | |
| size = event.get("size") | |
| seed = 42 # default seed | |
| if "seed" in event: | |
| import random | |
| random.seed(event["seed"]) | |
| seed = int(event["seed"]) | |
| gen_begin = datetime.datetime.now() | |
| alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed) | |
| gen_end = datetime.datetime.now() | |
| comp_begin = datetime.datetime.now() | |
| y_out, gpu_ms = kernel_axpy(alpha, x, y, reps=100) | |
| comp_end = datetime.datetime.now() | |
| gen_us = (gen_end - gen_begin) / datetime.timedelta(microseconds=1) | |
| comp_us = (comp_end - comp_begin) / datetime.timedelta(microseconds=1) | |
| return { | |
| "measurement": { | |
| "generating_time": gen_us, | |
| "compute_time": comp_us, | |
| "gpu_time": gpu_ms, | |
| } | |
| } |
🧰 Tools
🪛 Ruff (0.14.5)
47-47: Unpacked variable y_out is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/602.axpy/python/function.py around lines 32 to
59, the variable `seed` can be undefined when the incoming event lacks a "seed"
key; to fix, initialize `seed` unconditionally at the top (e.g., seed =
int(event.get("seed", 42))) before any conditional, remove the redundant
import/random.seed usage inside the if-block (or only keep the import if needed
elsewhere), and ensure all references use this initialized `seed` so NameError
cannot occur when "seed" is omitted.
| @@ -0,0 +1 @@ | |||
| torch==2.4.1 | |||
🧩 Analysis chain
Critical security vulnerabilities in torch 2.4.1.
Static analysis has flagged 5 critical security vulnerabilities in torch 2.4.1, including remote code execution risks (GHSA-53q9-r3pm-6pq6), denial of service (GHSA-3749-ghw9-m3mg), and resource management issues (GHSA-887c-mr87-cxwp).
Consider upgrading to torch 2.5.1 or later, which appears to be used elsewhere in this PR (see benchmarks/400.inference/413.image-classification/python/requirements.txt.3.12). If torch 2.4.1 is required for specific compatibility reasons, document the rationale and ensure the vulnerable functionality is not exposed in the benchmark.
🌐 Web query:
What are the security vulnerabilities in PyTorch 2.4.1 and what is the latest patched version?
💡 Result:
Short answer
-
Known security issues affecting PyTorch 2.4.1:
- CVE-2024-48063 — deserialization (RemoteModule / torch.distributed.rpc) can lead to remote code execution (disputed as “intended” for distributed RPC, but tracked as a security issue). [1][2][3]
- Earlier heap buffer-overflow bug (CVE-2024-31580) affected versions before v2.2.0 and is not present in 2.4.1 (it was fixed in earlier 2.x releases). [4]
-
More recent related RCE (reported Apr 2025) — GHSA/CVE-2025-32434: torch.load(..., weights_only=True) RCE — affects <=2.5.1 and was patched in 2.6.0. That issue therefore also implies 2.4.1 is vulnerable to that finding. [5]
What to upgrade to
- Minimum patched release for the GHSA/CVE-2025-32434 fix: 2.6.0. [5]
- PyTorch’s current “stable” release (as shown on the PyTorch site) is 2.7.0 — upgrade to the latest stable (or at least ≥2.6.0) to obtain the patches and additional fixes. [6]
Sources
- NVD / CVE-2024-48063 (deserialization RCE, affects <=2.4.1). [1]
- Multiple vulnerability trackers / advisories summarizing CVE-2024-48063 (Wiz, Snyk, Rapid7, ResolvedSecurity). [2][3][7]
- CVE-2024-31580 (heap buffer overflow — fixed before 2.4.1). [4]
- PyTorch GitHub security advisory GHSA-53q9-r3pm-6pq6 / CVE-2025-32434 (torch.load weights_only=True RCE) — patched in 2.6.0. [5]
- PyTorch official site (current stable release listing). [6]
If you want, I can:
- give the exact pip/conda upgrade command for your environment, or
- fetch the specific upstream commits/PRs that backport each fix.
I need to verify the specific CVEs mentioned in the review and check if torch 2.5.1 is actually used elsewhere in the codebase.
Based on the search results, I can now identify critical inaccuracies in the original review comment:
Key Findings:
-
Incorrect CVE severity and count: The review claims "5 critical security vulnerabilities" but GHSA-3749-ghw9-m3mg and GHSA-887c-mr87-cxwp are both classified with LOW severity (CVSS 3.3), not critical.
-
Wrong affected versions: GHSA-3749-ghw9-m3mg (CVE-2025-2953) is described as found in PyTorch 2.6.0+cu124 and GHSA-887c-mr87-cxwp (CVE-2025-3730) is found in PyTorch 2.6.0, suggesting they may not affect 2.4.1 at all.
-
Critical security issue acknowledged: CVE-2024-48063 (the RemoteModule deserialization RCE) affects PyTorch ≤2.4.1 with CVSS 9.8 Critical. However, this is disputed by multiple parties because this is intended behavior in PyTorch distributed computing.
-
Incorrect upgrade recommendation: CVE-2025-32434 (torch.load RCE) affects PyTorch versions 2.5.1 and earlier, with users advised to update to version 2.6.0 or later. Upgrading to 2.5.1 as suggested in the review would not resolve this vulnerability.
Upgrade to torch 2.6.0 or later to address the confirmed RCE vulnerabilities affecting 2.4.1 and 2.5.1. Note that CVE-2024-48063 (the primary RCE in 2.4.1) is disputed as intended behavior for distributed RPC—verify whether RemoteModule deserialization is actually used in this benchmark.
🧰 Tools
🪛 OSV Scanner (2.2.4)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2024-259)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2025-41)
[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service
[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution
[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability
🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/604.cholesky/python/requirements.txt lines 1-1:
the review's upgrade advice and CVE counts are incorrect and the project may
still be vulnerable; verify CVE applicability and then either upgrade or
document mitigations. Search the repo for any occurrences of "torch==2.5.1" or
other pinned torch versions and for uses of torch.load, RemoteModule, RPC or
RemoteModule deserialization in this benchmark and related code; if any code
uses remote deserialization or torch.load on untrusted data, change the
requirement to torch>=2.6.0 (or unpinned >=2.6.0) to include fixes, run tests to
confirm compatibility, and add a comment in requirements or README explaining
the CVE mapping and why the chosen version was selected if you choose not to
upgrade.
| @@ -0,0 +1 @@ | |||
| torch==2.4.1 | |||
Critical security vulnerabilities in torch 2.4.1.
This file uses torch==2.4.1, which has the same critical security vulnerabilities flagged in other requirements files in this PR. Upgrade to torch 2.5.1 or the latest patched version to address these security concerns and maintain consistency with requirements.txt.3.12.
🧰 Tools
🪛 OSV Scanner (2.2.4)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2024-259)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2025-41)
[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service
[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution
[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability
🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/605.lu/python/requirements.txt around lines 1 to
1, the file pins torch==2.4.1 which contains critical security vulnerabilities;
update the requirement to torch==2.5.1 (or the latest patched release used
elsewhere in the repo, e.g., requirements.txt.3.12) to resolve the
vulnerabilities, ensure consistency across requirements files, and run the test
suite/CI to verify no compatibility regressions.
| @@ -0,0 +1 @@ | |||
| torch==2.4.1 | |||
🧩 Analysis chain
Critical security vulnerabilities in torch 2.4.1.
The static analysis tool has flagged five critical security vulnerabilities in torch 2.4.1, including remote code execution via torch.load and denial of service issues. These are legitimate security concerns that should be addressed before release.
Please verify the latest patched version of PyTorch and update this dependency:
🌐 Web query:
What is the latest stable version of PyTorch and are there security patches for torch 2.4.1 vulnerabilities GHSA-53q9-r3pm-6pq6, GHSA-3749-ghw9-m3mg, GHSA-887c-mr87-cxwp, PYSEC-2024-259, and PYSEC-2025-41?
💡 Result:
Short answer
- Latest stable PyTorch (as of Nov 23, 2025): PyTorch 2.9.0. [1]
Status of the listed vulnerabilities (impacting torch 2.4.1)
- GHSA-53q9-r3pm-6pq6 / PYSEC-2025-41 (CVE-2025-32434): RCE via torch.load(..., weights_only=True). Patched in 2.6.0 (i.e., fixed in releases after 2.4.1). [2][3]
- PYSEC-2024-259: Deserialization RCE in RemoteModule affecting torch ≤ 2.4.1. Reported fixed in later releases (fixed in <2.5.0 / addressed in 2.5.x+). [4]
- GHSA-3749-ghw9-m3mg (CVE-2025-2953): local DoS in torch.mkldnn_max_pool2d. Patched in the 2.7.1 release candidate / fixed in 2.7.1+. [5]
- GHSA-887c-mr87-cxwp (CVE-2025-3730): improper resource shutdown (ctc_loss) — fixed in 2.8.0 (patched in the upstream commits and released in 2.8+). [6]
Recommendation
- Upgrade to the current stable release (2.9.0) or at minimum to the first fixed release that addresses the specific advisory relevant to you (2.6.0+ for the RCE advisories; 2.7.1+ and 2.8.0+ for the DoS/resource-release issues). Verify your deployment’s exact wheel/CUDA variant and check PyTorch release notes/SECURITY.md for any platform-specific patching instructions. [1][2][5][6]
Sources
[1] PyTorch 2.9 release blog (Oct 15, 2025). (pytorch.org)
[2] OSV / PYSEC-2025-41 / GHSA-53q9-r3pm-6pq6 (shows fixed version 2.6.0). (osv.dev)
[3] Debian / security tracker for CVE-2025-32434 (notes fix in 2.6.0). (security-tracker.debian.org)
[4] PYSEC-2024-259 advisory / resolved-security summary (affects <=2.4.1; fixed in later releases). (resolvedsecurity.com)
[5] GitHub Advisory GHSA-3749-ghw9-m3mg (CVE-2025-2953) — patched in 2.7.1-rc1 / fixed in 2.7.1+. (github.com)
[6] GHSA-887c-mr87-cxwp / CVE-2025-3730 (ctc_loss) — fixed in 2.8.0 per advisory/patch. (vulnerability.circl.lu)
If you want, I can (a) give the exact pip install commands to upgrade to 2.9.0 for your OS/CUDA, or (b) scan a requirements.txt to show whether your environment includes any of the vulnerable versions.
Citations:
- 1: https://pytorch.org/blog/pytorch-2-9/?utm_source=openai
- 2: https://osv.dev/PYSEC-2025-41?utm_source=openai
- 3: https://security-tracker.debian.org/tracker/CVE-2025-32434?utm_source=openai
- 4: https://www.resolvedsecurity.com/vulnerability-catalog/PYSEC-2024-259?utm_source=openai
- 5: https://github.com/advisories/GHSA-3749-ghw9-m3mg?utm_source=openai
- 6: https://vulnerability.circl.lu/vuln/ghsa-887c-mr87-cxwp?utm_source=openai
Update torch from 2.4.1 to address five critical security vulnerabilities.
The five flagged vulnerabilities are confirmed as critical and affecting torch 2.4.1: GHSA-53q9-r3pm-6pq6 / PYSEC-2025-41 (RCE via torch.load; fixed in 2.6.0), PYSEC-2024-259 (deserialization RCE; fixed in 2.5.x+), GHSA-3749-ghw9-m3mg (local DoS; fixed in 2.7.1+), and GHSA-887c-mr87-cxwp (resource shutdown issue; fixed in 2.8.0+).
Update benchmarks/600.linearalgebra/606.spmv/python/requirements.txt to PyTorch 2.9.0 (the latest stable version) or at minimum 2.8.0 to address all five vulnerabilities in a single upgrade.
🧰 Tools
🪛 OSV Scanner (2.2.4)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2024-259)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2025-41)
[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service
[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution
[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability
🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/606.spmv/python/requirements.txt lines 1-1, the
pinned torch version 2.4.1 contains multiple critical security vulnerabilities;
update the torch entry to a secure release (preferably torch==2.9.0, or at
minimum torch==2.8.0) to obtain the fixes for the listed CVEs, then run
pip-compile / pip install -r requirements.txt or update your lockfile and CI
dependency checks accordingly to ensure the new version is used in builds.
…xuan Feature/matmul microbenchmark/yuxuan
add vector-add & host-device-copy microbenchmark written in pytorch
Actionable comments posted: 6
🧹 Nitpick comments (14)
benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.8 (1)
1-3: Consider pinning torch/torchvision for reproducible vector-add results on Python 3.8.
You already pin numpy to 1.24.0; leaving torch and torchvision floating can make benchmark behavior depend on whatever PyPI exposes at install time, and may break if future releases change CUDA/ABI expectations. It's worth deciding and documenting specific torch/torchvision versions for this env (even if only in a comment) and pinning accordingly.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.10 (1)
1-3: Align dependency strategy across Python versions and consider pinning torch/torchvision.
Here numpy is pinned to 1.26.4 (vs 1.24.0 in the 3.7–3.9 manifests), while torch/torchvision are left unpinned. Please confirm:
- That the numpy 1.24 vs 1.26 split across Python versions is intentional for host-device-copy benchmarks.
- Whether you want fixed torch/torchvision versions for 3.10 to keep results comparable over time.
benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.9 (1)
1-3: Reproducibility and cross-version numpy choices for vector-add (Python 3.9).
As with the 3.8 manifest, numpy is pinned (1.24.0) while torch/torchvision are not. Please:
- Consider pinning torch/torchvision to explicit versions used in your benchmark environment.
- Double-check that keeping numpy at 1.24.0 for 3.9 but 1.26.x for newer Pythons is an intentional trade-off for this microbenchmark.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.11 (1)
1-3: Confirm host-device-copy dependency policy for Python 3.11.
This mirrors the 3.10 setup (numpy==1.26.4, unpinned torch/torchvision). To avoid surprises:
- Verify that 1.26.4 is the desired baseline numpy for 3.11 benchmarks.
- Consider explicitly pinning torch/torchvision to the versions you actually benchmark, so future PyPI updates don’t silently change performance characteristics.
benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt (1)
1-2: Clarify role of this base requirements file vs the Python-versioned manifests.
Since the 3.x-specific requirement files pin numpy and (presumably) target specific environments, it would help to:
- Document whether this base file is only for ad-hoc/local runs where floating torch/torchvision versions are acceptable.
- Or, if it’s part of CI/benchmarking as well, consider pinning torch/torchvision here too for consistency.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt (1)
1-2: Document intended usage of unversioned host-device-copy requirements.
Given that the Python-versioned files introduce numpy pins and target specific runtimes, please clarify whether this unversioned requirements.txt is:
- Just a convenience for local experimentation (where floating torch/torchvision is fine), or
- Also used for official runs, in which case pinning torch/torchvision would improve reproducibility.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.7 (1)
1-3: Validate 3.7 support and pinning choices for host-device-copy.
For a Python 3.7-specific env, please double-check:
- That numpy==1.24.0, torch, and torchvision still have supported/available wheels for 3.7 in your target platforms.
- Whether you want to keep 3.7 as a supported benchmark target at all, or drop it if the stack is no longer easily installable.
As elsewhere, consider pinning torch/torchvision to concrete versions once you settle on a known-good combo.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.9 (1)
1-3: Host-device-copy on Python 3.9: confirm numpy split and pin torch/torchvision if needed.
This matches the 3.7/3.8 manifests (numpy==1.24.0, unpinned torch/torchvision). Please confirm:
- That keeping 3.9 on numpy 1.24.x while newer Pythons use 1.26.x is intentional for your comparisons.
- Whether you want to fix torch/torchvision versions here to avoid future PyPI changes silently affecting benchmark numbers.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.8 (1)
1-3: Consider pinning torch and torchvision versions for reproducible benchmarks
numpy==1.24.0 is fine for a Python 3.8 environment, but leaving torch and torchvision unpinned means behavior and performance characteristics of this microbenchmark can change over time as upstream releases evolve (and may introduce silent incompatibilities). For a benchmarking suite, it's usually worth:
- Pinning specific torch/torchvision versions known to work well with numpy==1.24.0 and your CUDA stack.
- Keeping those pins aligned across the various versioned requirements.txt.* files where feasible.
Please verify compatible version triples (Python 3.8, NumPy 1.24.0, torch, torchvision) against the current library docs/release notes before pinning.
benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.11 (1)
1-3: Consider pinning torch and torchvision versions for reproducibility.
While numpy is pinned, torch and torchvision are unpinned. For benchmark reproducibility across runs and environments, consider pinning specific versions.
benchmarks/000.microbenchmarks/050.peak-performance/python/function.py (1)
17-24: Simplify redundant seed retrieval.
Line 21 uses event.get("seed") when we already know "seed" in event is true. Also, the retrieved value is immediately overwritten by converting to int on line 22.
 if "seed" in event:
     import random
-    random.seed(event["seed"])
-    seed = event.get("seed")
-    seed = int(seed)
+    seed = int(event["seed"])
+    random.seed(seed)
     torch.manual_seed(seed)
     torch.cuda.manual_seed_all(seed)
55-59: Unusedsyncvariable.The
syncfunction fromget_device_and_syncis never used. Prefix with underscore to indicate intentional discard.def initialize(size, device_str="cuda"): - device, sync = get_device_and_sync(device_str) + device, _ = get_device_and_sync(device_str) x = torch.randn(size, device=device, dtype=torch.float32) y = torch.randn(size, device=device, dtype=torch.float32) return x, y
8-19: Duplicatedget_device_and_syncfunction.This function is identical to the one in
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py. Consider extracting to a shared utility module to reduce duplication.
119-120: Use English for code comments.The comment "回退到 CPU" should be in English for consistency with the rest of the codebase.
- # 回退到 CPU + # Fall back to CPU
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
- benchmarks/000.microbenchmarks/050.peak-performance/config.json (1 hunks)
- benchmarks/000.microbenchmarks/050.peak-performance/input.py (1 hunks)
- benchmarks/000.microbenchmarks/050.peak-performance/python/function.py (1 hunks)
- benchmarks/000.microbenchmarks/050.peak-performance/python/requirements.txt (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/config.json (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/input.py (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/init.sh (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/package.sh (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.10 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.11 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.6 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.7 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.8 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.9 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/config.json (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/input.py (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/function.py (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/init.sh (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/package.sh (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.10 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.11 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.6 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.7 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.8 (1 hunks)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.9 (1 hunks)
✅ Files skipped from review due to trivial changes (4)
- benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.6
- benchmarks/000.microbenchmarks/0xx.vector-add/config.json
- benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.10
- benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.6
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-19T16:15:01.811Z
Learnt from: mcopik
Repo: spcl/serverless-benchmarks PR: 99
File: benchmarks/000.microbenchmarks/010.sleep/cpp/main.cpp:11-0
Timestamp: 2025-06-19T16:15:01.811Z
Learning: In the serverless-benchmarks project, input validation for benchmark functions is not required since the benchmarks are invoked directly by the maintainers who control the input values. This applies to functions like the sleep benchmark in benchmarks/000.microbenchmarks/010.sleep/cpp/main.cpp.
Applied to files:
benchmarks/000.microbenchmarks/050.peak-performance/input.py
🧬 Code graph analysis (2)
benchmarks/000.microbenchmarks/0xx.vector-add/input.py (1)
benchmarks/000.microbenchmarks/0xx.host-device-copy/input.py (2)
buckets_count (4-5), generate_input (8-11)
benchmarks/000.microbenchmarks/0xx.vector-add/python/function.py (1)
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py (5)
get_device_and_sync (5-16), sync (13-14), initialize (19-24), _run_once (27-70), handler (73-117)
🪛 OSV Scanner (2.3.0)
benchmarks/000.microbenchmarks/050.peak-performance/python/requirements.txt
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2024-259)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2025-41)
[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service
[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution
[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability
🪛 Ruff (0.14.8)
benchmarks/000.microbenchmarks/0xx.vector-add/input.py
9-9: Unused function argument: data_dir
(ARG001)
9-9: Unused function argument: benchmarks_bucket
(ARG001)
9-9: Unused function argument: input_paths
(ARG001)
9-9: Unused function argument: output_paths
(ARG001)
9-9: Unused function argument: upload_func
(ARG001)
9-9: Unused function argument: nosql_func
(ARG001)
benchmarks/000.microbenchmarks/050.peak-performance/input.py
9-9: Unused function argument: data_dir
(ARG001)
9-9: Unused function argument: benchmarks_bucket
(ARG001)
9-9: Unused function argument: input_paths
(ARG001)
9-9: Unused function argument: output_paths
(ARG001)
9-9: Unused function argument: upload_func
(ARG001)
9-9: Unused function argument: nosql_func
(ARG001)
benchmarks/000.microbenchmarks/0xx.host-device-copy/input.py
9-9: Unused function argument: data_dir
(ARG001)
9-9: Unused function argument: benchmarks_bucket
(ARG001)
9-9: Unused function argument: input_paths
(ARG001)
9-9: Unused function argument: output_paths
(ARG001)
9-9: Unused function argument: upload_func
(ARG001)
9-9: Unused function argument: nosql_func
(ARG001)
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py
75-75: Avoid specifying long messages outside the exception class
(TRY003)
benchmarks/000.microbenchmarks/0xx.vector-add/python/function.py
32-32: Unused function argument: num_elems
(ARG001)
56-56: Unpacked variable sync is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
97-97: Avoid specifying long messages outside the exception class
(TRY003)
🪛 Shellcheck (0.11.0)
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/package.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 3-3: PACKAGE_DIR appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 16-16: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.
(SC2038)
[warning] 17-17: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.
(SC2038)
[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 30-30: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 33-33: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
benchmarks/000.microbenchmarks/0xx.vector-add/python/package.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 3-3: PACKAGE_DIR appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 16-16: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.
(SC2038)
[warning] 17-17: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.
(SC2038)
[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 30-30: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 33-33: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
🔇 Additional comments (9)
benchmarks/000.microbenchmarks/0xx.host-device-copy/config.json (1)
1-6: Verify configuration values are appropriate for the host-device-copy benchmark workload.
The JSON structure is valid and follows the expected format. However, the configuration values require verification:
- Memory (line 3): 128 (units assumed MB?) seems low for a host-device-copy benchmark. Verify this is sufficient for the intended data transfer workload.
- Languages (line 4): NodeJS support for a GPU host-device-copy benchmark is atypical. Confirm NodeJS is necessary or if this should be Python-only or include compiled languages like C/C++.
Also note: This file is a host-device-copy microbenchmark config, which appears unrelated to the PR title "video watermarking." Confirm this file is intentionally part of this PR.
benchmarks/000.microbenchmarks/0xx.vector-add/python/init.sh (1)
1-3: No-op init script looks good.
Shebang is present and making this a no-op script with a clear comment is perfectly fine for this benchmark.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/init.sh (1)
1-3: Init hook is a clean no-op.
The bash shebang plus explanatory comment is sufficient here; no issues from a correctness or maintainability standpoint.
benchmarks/000.microbenchmarks/050.peak-performance/config.json (1)
1-6: Config JSON is well-formed and consistent.
The timeout, memory, languages, and modules fields are valid and align with the usual microbenchmark config structure. Nothing blocking here.
benchmarks/000.microbenchmarks/0xx.vector-add/input.py (1)
1-11: LGTM!
The implementation follows the established pattern for input generators in this benchmark suite. The unused function parameters are part of a common interface contract used across benchmarks.
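For context, a hedged sketch of that interface contract; the signature is inferred from the Ruff hints above (unused data_dir, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func arguments) and is not copied from the PR:

def buckets_count():
    # (input buckets, output buckets) expected by the benchmark driver
    return (0, 0)

def generate_input(data_dir, size, benchmarks_bucket, input_paths,
                   output_paths, upload_func, nosql_func):
    # Only `size` matters for these microbenchmarks; the remaining arguments
    # exist to satisfy the shared driver interface and stay unused.
    return {"size": size}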
benchmarks/000.microbenchmarks/050.peak-performance/input.py (1)
1-11: LGTM!
The implementation follows the established input generator pattern. The "sleep" key appropriately reflects the benchmark's purpose.
benchmarks/000.microbenchmarks/0xx.host-device-copy/input.py (1)
1-11: LGTM!
Consistent with other input generators in the benchmark suite.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py (2)
36-38: Transfer loop allocates new tensors each iteration.
The .to() call returns a new tensor each iteration, which gets discarded. This measures allocation + transfer overhead rather than pure transfer bandwidth. If pure transfer is intended, consider using copy_() instead:
for _ in range(iters):
    device_tensor.copy_(host_tensor, non_blocking=True)
73-116: LGTM on handler logic.
The CUDA fallback handling with specific error message detection and graceful CPU fallback is well-structured.
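A rough sketch of that fallback shape, assuming a _run_once helper along the lines of the one listed in the code graph above; the exact error-string check in the PR may differ:

def _run_once(event, device_str):
    # placeholder for the benchmark's actual single-run routine
    raise NotImplementedError

def handler(event):
    device_str = event.get("device", "cuda")
    try:
        return _run_once(event, device_str)
    except RuntimeError as err:
        # Fall back to CPU only for the "CUDA unavailable" class of errors;
        # anything else is a real bug and should propagate.
        if device_str == "cuda" and "CUDA" in str(err):
            return _run_once(event, "cpu")
        raise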
# Warm up
for _ in range(10):
    _ = torch.matmul(A, B)
Missing synchronization after warm-up may skew timing measurements.
The warm-up loop doesn't have a sync() call after it. This means warm-up operations may still be in-flight when the timed section begins, causing measurement inaccuracy.
Apply this diff:
 # Warm up
 for _ in range(10):
     _ = torch.matmul(A, B)
+sync()
 matmul_begin = datetime.datetime.now()

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Warm up
for _ in range(10):
    _ = torch.matmul(A, B)
sync()
🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/050.peak-performance/python/function.py around
lines 34 to 36, the warm-up loop issues asynchronous CUDA ops that can still be
in-flight when timing starts; add a synchronization after the warm-up (e.g.,
call torch.cuda.synchronize() when CUDA is available) so all GPU work completes
before the timed section begins; if you need portability, guard the call with
torch.cuda.is_available() (or a no-op for CPU) to avoid errors.
@@ -0,0 +1 @@
+torch==2.4.1
🧩 Analysis chain
🌐 Web query:
GHSA-3749-ghw9-m3mg GHSA-53q9-r3pm-6pq6 GHSA-887c-mr87-cxwp PyTorch 2.4.1
💡 Result:
Summary — impact on PyTorch 2.4.1
- GHSA-3749-ghw9-m3mg (CVE-2025-2953): local Denial-of-Service in torch.mkldnn_max_pool2d. Affects versions older than 2.7.1-rc1 — so 2.4.1 is vulnerable. Patch in 2.7.1-rc1. [1]
- GHSA-53q9-r3pm-6pq6 (CVE-2025-32434): remote code execution when loading models with torch.load(..., weights_only=True). Patched in 2.6.0; versions ≤2.5.1 (and therefore 2.4.1) are vulnerable. Upgrade immediately. [2]
- GHSA-887c-mr87-cxwp (CVE-2025-3730): local Denial-of-Service in torch.nn.functional.ctc_loss. Affects versions before 2.8.0 — 2.4.1 is vulnerable. Fixed in 2.8.0. [3]
Recommendation: Do not run PyTorch 2.4.1 on untrusted inputs/models. Upgrade to a patched release (at least 2.8.0 to cover GHSA-887c, and ≥2.6.0 to cover the RCE GHSA-53q9). Audit any models you load from untrusted sources and follow PyTorch security guidance.
Sources: [1] GitHub advisory GHSA-3749-ghw9-m3mg; [2] GitHub advisory GHSA-53q9-r3pm-6pq6; [3] GitHub advisory GHSA-887c-mr87-cxwp.
🌐 Web query:
PyTorch 2.4.1 security vulnerabilities critical
💡 Result:
Short answer: Yes — PyTorch 2.4.1 is affected by multiple security issues including at least one critical RCE-class vulnerability; you should treat 2.4.1 as vulnerable and upgrade or apply mitigations.
Key items and actions
- CVE-2025-32434 / GHSA-53q9-r3pm-6pq6 — Remote code execution when loading models with torch.load(..., weights_only=True). Affects PyTorch ≤ 2.5.1 (so includes 2.4.1). Patched in 2.6.0. Published Apr 18, 2025. Mitigation: upgrade to PyTorch ≥ 2.6.0 or avoid loading untrusted models; follow the upstream advisory. [1][2]
- CVE-2024-48063 — RemoteModule deserialization RCE reported for PyTorch ≤ 2.4.1 (disputed by some as “intended” distributed behavior). Treat as high risk for exposed distributed setups; follow vendor guidance or upgrade. (reported Oct 29, 2024). [3]
- CVE-2024-31580 and related memory/out-of-bounds issues — earlier DoS/out-of-bounds bugs were fixed in releases around 2.4.1 or later; ensure you’re on a fixed release series. [4]
What to do now (concise)
- If possible, upgrade to the latest PyTorch release (at least 2.6.0 to address the torch.load RCE). Verify release notes/security advisories from the PyTorch repo before upgrade. [1][2]
- If you cannot upgrade immediately, do not load models from untrusted sources; disable or restrict distributed RPC/RemoteModule features and torch.load(weights_only=True) usage where feasible. [1][3]
- Monitor official PyTorch security advisories and your OS/vendor security bulletins for backported fixes (Debian/Ubuntu/RHEL may have different package versions). [2][4]
Sources
- PyTorch GitHub security advisory / GHSA for torch.load RCE (CVE-2025-32434). [1]
- IBM security bulletin referencing CVE-2025-32434 (lists affected versions and patched version 2.6.0). [2]
- CVE entries / vulnerability databases for CVE-2024-48063 (RemoteModule deserialization RCE). [3]
- Debian / Snyk trackers for memory/DoS CVEs (e.g., CVE-2024-31580) and fixed-package notes. [4]
Would you like me to: (A) fetch the exact PyTorch release notes and advisory links so you can upgrade safely, or (B) check whether your OS/package manager has a backport available?
🏁 Script executed:
# Check if torch.load is used in the benchmark code
find benchmarks -type f -name "*.py" | head -20
Repository: spcl/serverless-benchmarks
Length of output: 1271
🏁 Script executed:
# Search for torch.load usage in benchmarks
rg "torch\.load" benchmarks/ --type py -A 2
# Also search for ctc_loss usage
rg "ctc_loss" benchmarks/ --type py -A 2
# Check the peak-performance benchmark specifically
cat benchmarks/000.microbenchmarks/050.peak-performance/python/function.py
Repository: spcl/serverless-benchmarks
Length of output: 2444
Upgrade torch==2.4.1 to a patched version due to confirmed critical vulnerabilities
All three reported GHSA IDs are confirmed:
- GHSA-53q9-r3pm-6pq6 (CVE-2025-32434): Remote code execution via torch.load with weights_only=True — affects ≤2.5.1, fixed in 2.6.0
- GHSA-3749-ghw9-m3mg (CVE-2025-2953): Denial-of-service in torch.mkldnn_max_pool2d — affects <2.7.1-rc1
- GHSA-887c-mr87-cxwp (CVE-2025-3730): Denial-of-service in torch.nn.functional.ctc_loss — affects <2.8.0
While the peak-performance benchmark itself does not directly use torch.load or the affected APIs, pinning a vulnerable version is poor security practice. Upgrade to at least 2.6.0 (to address the critical RCE), or preferably 2.8.0 (to address all three). Also verify and update torch pins in other benchmarks (e.g., 400.inference/411 and 413) that do use torch.load.
🧰 Tools
🪛 OSV Scanner (2.3.0)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2024-259)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2025-41)
[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service
[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution
[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability
🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/050.peak-performance/python/requirements.txt
around line 1, the pinned torch==2.4.1 contains known critical vulnerabilities;
update the pin to at least 2.6.0 (preferably 2.8.0 to address all reported CVEs)
and run any tests to ensure compatibility, and also search for and update other
benchmark requirements that pin torch (e.g., benchmarks/400.inference/411 and
413) to the same safe minimum version.
# Stripping package code is based on https://github.com/ryfeus/lambda-packs repo

PACKAGE_DIR=$1
echo "Original size $(du -sh $1 | cut -f1)"

CUR_DIR=$(pwd)
cd $1
# cleaning libs
rm -rf external
find . -type d -name "tests" -exec rm -rf {} +
find . -type d -name "test" -exec rm -rf {} +
find . -type d -name "bin" -not -path "*/torch/*" -exec rm -rf {} +

# cleaning
# stripping some of the numpy libs - libgfortran-2e0d59d6.so.5.0.0 - causes issues on Azure
find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip

rm -r pip >/dev/null
rm -r pip-* >/dev/null
rm -r wheel >/dev/null
rm -r wheel-* >/dev/null
rm easy_install.py >/dev/null
find . -name \*.pyc -delete
cd ${CUR_DIR}
echo "Stripped size $(du -sh $1 | cut -f1)"

TORCH_DIR=".python_packages/lib/site-packages/torch"
if [ -d "$1/${TORCH_DIR}" ]; then
    cd $1
    zip -qr torch.zip ${TORCH_DIR}
    rm -rf ${TORCH_DIR}
    cd ${CUR_DIR}
    echo "Torch-zipped size $(du -sh $1 | cut -f1)"
fi
Apply the same hardening as in 0xx.vector-add/python/package.sh (cd guards, quoting, safer find|xargs)
This script has the same structure and issues as the 0xx.vector-add package.sh:
- Guard all cd calls (cd "$1" || exit 1, cd "$CUR_DIR" || exit 1) so that failure to change into the package directory cannot cause rm -rf external / find ... -exec rm -rf to run in the wrong place.
- Either use PACKAGE_DIR=$1 consistently instead of $1 or remove the unused variable.
- Quote variable expansions used as paths and arguments.
- Prefer find ... -exec strip {} + over find ... | xargs strip to handle spaces/newlines in filenames and avoid calling strip with no args.
- Optionally add a shebang and tighten error silencing on the rm calls, as in the earlier comment.
🧰 Tools
🪛 Shellcheck (0.11.0)
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 3-3: PACKAGE_DIR appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 16-16: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.
(SC2038)
[warning] 17-17: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.
(SC2038)
[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 30-30: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 33-33: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/0xx.host-device-copy/python/package.sh lines
1-35, the script lacks safe cd guards, inconsistent variable use, unquoted
expansions, and uses find | xargs which breaks on spaces and can call strip with
no args; update to use PACKAGE_DIR consistently, add guards like cd
"$PACKAGE_DIR" || exit 1 and cd "$CUR_DIR" || exit 1, quote all path and
variable expansions, replace piped xargs invocations with find ... -exec strip
{} + (and similarly for *.so.*), tighten rm error silencing (e.g. >&/dev/null
only for expected failures) or check existence before rm, and optionally add a
shebang at the top; these changes mirror the hardening in
0xx.vector-add/python/package.sh.
@triton.jit
def vector_add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)
Triton kernel is defined but never invoked.
The vector_add_kernel is a decorated Triton JIT kernel that's never called. The actual computation in vector_add() at line 45 uses PyTorch's native x + y instead. Either remove the dead code or integrate the Triton kernel.
If Triton execution is intended, invoke the kernel in vector_add():
def vector_add(num_elems, iters=100, device_str="cuda", x=None, y=None):
    device = torch.device(device_str)
    x = x.to(device)
    y = y.to(device)
    out = torch.empty_like(x)
    BLOCK_SIZE = 1024
    grid = (triton.cdiv(num_elems, BLOCK_SIZE),)
    if device.type == "cuda":
        torch.cuda.synchronize()
    for _ in range(iters):
        vector_add_kernel[grid](x, y, out, num_elems, BLOCK_SIZE)
    if device.type == "cuda":
        torch.cuda.synchronize()
    return out

🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/0xx.vector-add/python/function.py around lines
22 to 29, the Triton JIT kernel vector_add_kernel is defined but never used;
update the vector_add function to call this kernel when running on CUDA (compute
grid = (triton.cdiv(num_elems, BLOCK_SIZE),) and launch
vector_add_kernel[grid](x, y, out, num_elems, BLOCK_SIZE) inside the timing loop
with proper torch.cuda.synchronize() before and after the loop), or if Triton is
not intended, remove the unused kernel definition to eliminate dead code.
# Stripping package code is based on https://github.com/ryfeus/lambda-packs repo

PACKAGE_DIR=$1
echo "Original size $(du -sh $1 | cut -f1)"

CUR_DIR=$(pwd)
cd $1
# cleaning libs
rm -rf external
find . -type d -name "tests" -exec rm -rf {} +
find . -type d -name "test" -exec rm -rf {} +
find . -type d -name "bin" -not -path "*/torch/*" -exec rm -rf {} +

# cleaning
# stripping some of the numpy libs - libgfortran-2e0d59d6.so.5.0.0 - causes issues on Azure
find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip

rm -r pip >/dev/null
rm -r pip-* >/dev/null
rm -r wheel >/dev/null
rm -r wheel-* >/dev/null
rm easy_install.py >/dev/null
find . -name \*.pyc -delete
cd ${CUR_DIR}
echo "Stripped size $(du -sh $1 | cut -f1)"

TORCH_DIR=".python_packages/lib/site-packages/torch"
if [ -d "$1/${TORCH_DIR}" ]; then
    cd $1
    zip -qr torch.zip ${TORCH_DIR}
    rm -rf ${TORCH_DIR}
    cd ${CUR_DIR}
    echo "Torch-zipped size $(du -sh $1 | cut -f1)"
fi
Harden package.sh: guard cd, quote args, and make find|xargs strip safer
The overall logic matches other packaging scripts, but there are a few robustness/safety issues:
- cd $1 (Line 7) and later cd calls (Lines 30, 33) run without || exit. If $1 is missing/invalid or cd fails, the subsequent rm -rf external and find ... -exec rm -rf {} will operate in the wrong directory, which is dangerous. Prefer e.g.:
  PACKAGE_DIR=$1
  cd "$PACKAGE_DIR" || exit 1
  ...
  cd "$CUR_DIR" || exit 1
- PACKAGE_DIR=$1 (Line 3) is currently unused; either use it consistently instead of $1 or drop the assignment.
- Several parameter expansions should be quoted for safety: du -sh "$PACKAGE_DIR" (or "$1"), cd "$PACKAGE_DIR", cd "$CUR_DIR", zip -qr torch.zip "$TORCH_DIR", etc.
- find ... | xargs strip (Lines 16-17) will misbehave with paths containing spaces or newlines and may call strip with no arguments. Consider:
  find . -name '*.so' -not -path '*/PIL/*' -not -path '*/Pillow.libs/*' -not -path '*libgfortran*' -exec strip {} +
  find . -name '*.so.*' -not -path '*/PIL/*' -not -path '*/Pillow.libs/*' -not -path '*libgfortran*' -exec strip {} +
- Optional: add a shebang (#!/bin/bash) so tools and callers know which shell is expected, and silence rm errors fully with 2>/dev/null if you don't care about missing paths.
Addressing the cd handling is the most important to avoid accidental deletion outside the target package. As per static analysis hints, these are the same issues shellcheck is surfacing.
🧰 Tools
🪛 Shellcheck (0.11.0)
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
[warning] 3-3: PACKAGE_DIR appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 16-16: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.
(SC2038)
[warning] 17-17: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.
(SC2038)
[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 30-30: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
[warning] 33-33: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/0xx.vector-add/python/package.sh around lines
1 to 35, the script uses unguarded cd calls, unquoted parameters, and unsafe
find|xargs strips which can cause destructive operations in the wrong directory
or fail on paths with spaces; update to use the PACKAGE_DIR variable everywhere
(e.g. cd "$PACKAGE_DIR" || exit 1 and du -sh "$PACKAGE_DIR"), quote all
parameter expansions and paths (e.g. "$CUR_DIR", "$TORCH_DIR"), replace find |
xargs strip with find ... -exec strip {} + to handle spaces safely, and ensure
returning to CUR_DIR uses cd "$CUR_DIR" || exit 1; optionally add a shebang
(#!/bin/bash) at the top and redirect rm errors to /dev/null (rm -r pip
2>/dev/null) if you want silent removals.
numpy==1.24.0
torch
torchvision
🧩 Analysis chain
🌐 Web query:
last NumPy version supports Python 3.7 official compatibility matrix
💡 Result:
The last NumPy release series that officially supported Python 3.7 was NumPy 1.21 (latest patch: 1.21.6). NumPy 1.22.0 dropped Python 3.7 support. [1][2][3]
Sources:
- NumPy 1.21 release notes (supports Python 3.7–3.9). [1]
- NumPy 1.21.6 release notes (last 1.21 patch). [2]
- NumPy 1.22.0 release notes (Python 3.7 dropped; supports 3.8–3.10). [3]
🌐 Web query:
torch torchvision Python 3.7 compatibility latest versions
💡 Result:
Short answer: The newest PyTorch/TorchVision that officially support Python 3.7 are from the 1.13.x series — e.g. torch 1.13 (1.13.0/1.13.1) with the matching torchvision 0.14.x; PyTorch 2.0+ and the 2.x torchvision releases require Python >=3.8/3.9. [1][2][3]
Sources:
- PyTorch release compatibility matrix (shows 1.13 supports Python >=3.7, 2.x require >=3.8/3.9+). [1]
- torchvision release metadata / PyPI (shows torchvision 0.14.x pairs with torch 1.13 and requires Python >=3.7.2 for that pairing). [2]
- PyTorch “Get Started” / installation docs (current 2.x builds list supported Python versions >=3.8/3.9+). [3]
If you want, tell me your exact Python 3.7 micro version (e.g. 3.7.9) and OS/CUDA target and I’ll give the exact pip/conda install command for the latest 1.13.x-wheel that matches it.
numpy==1.24.0 is incompatible with Python 3.7 — downgrade and pin torch/torchvision
- numpy==1.24.0 dropped support for Python 3.7 in favor of 3.8+. The last NumPy version supporting Python 3.7 is 1.21.6. Using 1.24.0 in a Python 3.7 environment will cause installation failures.
- torch and torchvision are unpinned. For Python 3.7 compatibility, pin to torch==1.13.1 with torchvision==0.14.1 (PyTorch 2.0+ requires Python 3.8+). Unpinned versions will pull incompatible releases and break reproducibility.
Update to:
numpy==1.21.6
torch==1.13.1
torchvision==0.14.1
🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.7
around lines 1-3, numpy==1.24.0 is incompatible with Python 3.7 and
torch/torchvision are unpinned; replace the three lines so the file pins the
last NumPy supporting 3.7 and compatible PyTorch builds by changing to
numpy==1.21.6, torch==1.13.1, and torchvision==0.14.1 to ensure successful
installs and reproducible environments.
Actionable comments posted: 2
♻️ Duplicate comments (2)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (2)
34-38: NaN fps values will bypass the fallback check.
As flagged in the previous review, cap.get(cv2.CAP_PROP_FPS) can return NaN with certain containers or drivers. The condition if not fps or fps <= 0 will not catch NaN, causing int(fps * duration) to fail at runtime.
Apply this diff to add NaN validation:
+import math
 ...
 fps = cap.get(cv2.CAP_PROP_FPS)
-if not fps or fps <= 0:
+if not fps or fps <= 0 or math.isnan(fps):
     fps = 25.0  # fallback
149-149: Critical: Path traversal vulnerability via user-controlled key.
As flagged in the previous review, constructing download_path = f"/tmp/{key}" without sanitization allows directory traversal attacks. If key contains path separators (../, etc.), it can escape /tmp or create nested directories, leading to file overwrites or crashes.
Apply this diff to sanitize the key:
-download_path = f"/tmp/{key}"
+# Restrict to basename to prevent directory traversal
+safe_key = os.path.basename(key)
+download_path = os.path.join("/tmp", safe_key)
🧹 Nitpick comments (2)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (2)
41-47: Loop variable i is unused.
The loop counter is never referenced in the loop body. Use _ to indicate an intentionally unused variable.
Apply this diff:
-for i in range(max_frames):
+for _ in range(max_frames):
     ret, frame_bgr = cap.read()
56-56: Unpacked variables T and C are unused.
Prefix unused variables with underscores to clarify they're intentionally ignored.
Apply this diff:
-T, H, W, C = video_np.shape
+_T, H, W, _C = video_np.shape
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (1)
benchmarks/100.webapps/120.uploader/nodejs/function.js (3)
bucket (18-18), output_prefix (19-19), download_path (22-22)
🪛 Ruff (0.14.8)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py
32-32: Avoid specifying long messages outside the exception class
(TRY003)
41-41: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
52-52: Avoid specifying long messages outside the exception class
(TRY003)
55-55: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?
(RUF003)
56-56: Unpacked variable T is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
56-56: Unpacked variable C is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
87-87: Probable insecure usage of temporary file or directory: "/tmp/processed-"
(S108)
92-92: Avoid specifying long messages outside the exception class
(TRY003)
145-147: Avoid specifying long messages outside the exception class
(TRY003)
149-149: Probable insecure usage of temporary file or directory: "/tmp/"
(S108)
🔇 Additional comments (4)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (4)
61-80: LGTM! GPU processing logic is well-structured.
The device selection, configurable iteration count, and use of torch.no_grad() for inference are all appropriate. The GPU workload design effectively simulates a realistic compute-heavy FaaS scenario.
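A hedged sketch of the pattern being praised here (device selection, a configurable number of conv+ReLU passes under torch.no_grad(), then an explicit synchronize); the layer shape, function name, and frame layout are assumptions rather than the PR's exact code:

import numpy as np
import torch

def gpu_process(frames_rgb: np.ndarray, iters: int = 4, device_str: str = "cuda") -> np.ndarray:
    # Pick the device, silently falling back to CPU when CUDA is absent.
    device = torch.device(device_str if torch.cuda.is_available() else "cpu")
    # (T, H, W, C) uint8 frames -> (T, C, H, W) float tensor on the device.
    x = torch.from_numpy(frames_rgb).float().permute(0, 3, 1, 2).to(device) / 255.0
    conv = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1).to(device)
    with torch.no_grad():  # pure inference, no autograd bookkeeping
        for _ in range(iters):
            x = torch.relu(conv(x))
    if device.type == "cuda":
        torch.cuda.synchronize()  # make the compute phase fully measurable
    return (x.clamp(0, 1) * 255.0).byte().permute(0, 2, 3, 1).cpu().numpy()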
104-110: LGTM! Operation registry provides clean extensibility.
The operation mapping design allows easy addition of new processing functions in the future.
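A small sketch of that registry idea; the operation names and bodies below are placeholders, not the functions actually registered in the PR:

def _watermark(frames):
    # placeholder for the real watermarking routine
    return frames

def _scale(frames):
    # placeholder for the real scaling routine
    return frames

OPERATIONS = {"watermark": _watermark, "scale": _scale}

def dispatch(op_name, frames):
    try:
        op = OPERATIONS[op_name]
    except KeyError:
        raise ValueError(f"Unsupported operation: {op_name}") from None
    return op(frames)

Adding a new processing step then only means registering one more entry in OPERATIONS.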
153-159: FFmpeg chmod handling is defensive and appropriate.
The graceful error handling (via try/except OSError) ensures the code works whether or not the ffmpeg binary exists, which is good for maintaining compatibility across different deployment configurations.
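A minimal sketch of the same defensive idea, wrapped in a hypothetical helper for readability; the PR applies the pattern inline around its bundled ffmpeg binary, per the comment above:

import os
import stat

def ensure_executable(path: str) -> None:
    # Best-effort chmod: a missing bundled binary is not fatal here,
    # so OSError (e.g. FileNotFoundError) is deliberately swallowed.
    try:
        st = os.stat(path)
        os.chmod(path, st.st_mode | stat.S_IEXEC)
    except OSError:
        pass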
161-200: LGTM! Download/compute/upload orchestration is well-implemented.
The three-phase workflow with separate timing measurements provides excellent observability for benchmarking. The microsecond-precision timing and size tracking are appropriate for performance analysis.
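A hedged sketch of that three-phase measurement style; the client object, phase functions, and key names in the usage comment are placeholders, and only the timing pattern is meant to match:

import datetime

def timed_us(fn, *args, **kwargs):
    # Time one phase and report it in microseconds, mirroring the benchmark's granularity.
    begin = datetime.datetime.now()
    result = fn(*args, **kwargs)
    end = datetime.datetime.now()
    return result, (end - begin) / datetime.timedelta(microseconds=1)

# usage (names are illustrative only):
# _, download_us = timed_us(client.download, bucket, key, download_path)
# out_path, compute_us = timed_us(process_video, download_path, duration, op)
# _, upload_us = timed_us(client.upload, bucket, upload_key, out_path)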
import datetime
import os
import stat  # can also be removed if you drop ffmpeg entirely
Misleading comment about removing stat import.
The comment suggests stat can be removed if ffmpeg is dropped, but lines 155-156 use stat.S_IEXEC for the ffmpeg chmod logic. Either update the comment to be accurate or remove the chmod block entirely if ffmpeg is no longer needed.
Apply this diff to clarify:
-import stat  # can also be removed if you drop ffmpeg entirely
+import stat  # used for ffmpeg binary chmod

🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/function.py around
line 5, the comment on the import of stat is misleading because later code
(lines ~155-156) uses stat.S_IEXEC for ffmpeg chmod logic; either update the
comment to state that stat is required by the chmod block, or if ffmpeg and its
chmod block are being removed, delete the chmod block (and then remove the stat
import). Ensure the chosen change keeps imports and code consistent: if keeping
ffmpeg code, change the comment to reflect stat is needed for chmod; if removing
ffmpeg, remove the chmod lines and then safely remove the stat import and any
related constants.
# Encode processed video with OpenCV (CPU)
base = os.path.splitext(os.path.basename(video_path))[0]
out_path = f"/tmp/processed-{base}.mp4"

fourcc = cv2.VideoWriter_fourcc(*"mp4v")
writer = cv2.VideoWriter(out_path, fourcc, fps, (W, H))
if not writer.isOpened():
    raise RuntimeError(f"Could not open VideoWriter for: {out_path}")

for frame_rgb in video_np_out:
    frame_bgr = cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2BGR)
    writer.write(frame_bgr)

writer.release()
return out_path
Insecure /tmp path construction remains unresolved.
As flagged in the previous review, constructing predictable /tmp paths exposes the code to security issues. The current implementation doesn't address the S108 static analysis warning or the security concerns raised earlier.
At minimum, use os.path.join for path construction:
 base = os.path.splitext(os.path.basename(video_path))[0]
-out_path = f"/tmp/processed-{base}.mp4"
+out_path = os.path.join("/tmp", f"processed-{base}.mp4")

For better security, consider using tempfile.NamedTemporaryFile:

import tempfile
...
with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False, dir="/tmp") as tmp:
    out_path = tmp.name

🧰 Tools
🪛 Ruff (0.14.8)
87-87: Probable insecure usage of temporary file or directory: "/tmp/processed-"
(S108)
92-92: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/function.py around
lines 85 to 99, the code creates a predictable /tmp filename which is a security
risk; replace the fixed path with a securely created temporary file (e.g., use
tempfile.NamedTemporaryFile or tempfile.mkstemp) to obtain a unique,
non-guessable out_path with the .mp4 suffix, pass dir="/tmp" only if required,
set delete=False so OpenCV can open/write to it, close the temp handle before
returning, and remove the manual string concatenation (os.path.join is
acceptable for any explicit directory joins).
Summary by CodeRabbit
New Features
Documentation
Chores