Conversation

@JessieeeNotLi commented Nov 6, 2025

Summary by CodeRabbit

  • New Features

    • Added NVENC GPU video benchmark, GPU-aware video filter workflow, and many new GPU/CPU benchmarks (inference, deep‑learning, linear‑algebra, microbenchmarks, and utilities).
  • Documentation

    • Added usage/readme and run examples for the GPU benchmark.
  • Chores

    • Added benchmark configs, input generators, dependency manifests, packaging/init helpers, experiments data, submodule update, CI migration to GitHub Actions, pre-commit and editor settings, and a placeholder to preserve the new benchmark directory.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai bot commented Nov 6, 2025

Warning

Rate limit exceeded

@JessieeeNotLi has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 4 minutes and 57 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 83c5309 and 791c881.

📒 Files selected for processing (72)
  • benchmarks/000.microbenchmarks/010.sleep/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/020.network-benchmark/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/040.server-reply/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/050.peak-performance/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/input.py (1 hunks)
  • benchmarks/100.webapps/110.dynamic-html/input.py (1 hunks)
  • benchmarks/100.webapps/120.uploader/input.py (1 hunks)
  • benchmarks/100.webapps/130.crud-api/input.py (1 hunks)
  • benchmarks/100.webapps/130.crud-api/python/function.py (1 hunks)
  • benchmarks/200.multimedia/210.thumbnailer/input.py (1 hunks)
  • benchmarks/200.multimedia/220.video-processing-gpu/input.py (1 hunks)
  • benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (1 hunks)
  • benchmarks/200.multimedia/220.video-processing/input.py (1 hunks)
  • benchmarks/300.utilities/311.compression/input.py (2 hunks)
  • benchmarks/300.utilities/311.compression/python/function.py (3 hunks)
  • benchmarks/400.inference/411.image-recognition/input.py (1 hunks)
  • benchmarks/400.inference/412.language-bert/input.py (1 hunks)
  • benchmarks/400.inference/413.image-classification/input.py (1 hunks)
  • benchmarks/400.inference/413.recommendation/input.py (1 hunks)
  • benchmarks/400.inference/413.recommendation/python/function.py (1 hunks)
  • benchmarks/500.scientific/501.graph-pagerank/input.py (1 hunks)
  • benchmarks/500.scientific/502.graph-mst/input.py (1 hunks)
  • benchmarks/500.scientific/503.graph-bfs/input.py (1 hunks)
  • benchmarks/500.scientific/504.dna-visualisation/input.py (1 hunks)
  • benchmarks/600.linearalgebra/601.matmul/input.py (1 hunks)
  • benchmarks/600.linearalgebra/602.axpy/input.py (1 hunks)
  • benchmarks/600.linearalgebra/603.jacobi2d/input.py (1 hunks)
  • benchmarks/600.linearalgebra/604.cholesky/input.py (1 hunks)
  • benchmarks/600.linearalgebra/605.lu/input.py (1 hunks)
  • benchmarks/600.linearalgebra/606.spmv/input.py (1 hunks)
  • benchmarks/600.linearalgebra/607.fw/input.py (1 hunks)
  • benchmarks/wrappers/aws/python/handler.py (1 hunks)
  • benchmarks/wrappers/aws/python/nosql.py (2 hunks)
  • benchmarks/wrappers/azure/python/handler.py (2 hunks)
  • benchmarks/wrappers/azure/python/nosql.py (2 hunks)
  • benchmarks/wrappers/gcp/python/handler.py (1 hunks)
  • benchmarks/wrappers/gcp/python/nosql.py (2 hunks)
  • benchmarks/wrappers/local/python/nosql.py (2 hunks)
  • benchmarks/wrappers/openwhisk/python/nosql.py (4 hunks)
  • sebs/__init__.py (1 hunks)
  • sebs/aws/aws.py (1 hunks)
  • sebs/aws/config.py (2 hunks)
  • sebs/aws/container.py (1 hunks)
  • sebs/aws/dynamodb.py (1 hunks)
  • sebs/aws/resources.py (1 hunks)
  • sebs/azure/azure.py (1 hunks)
  • sebs/azure/cli.py (0 hunks)
  • sebs/azure/cloud_resources.py (1 hunks)
  • sebs/azure/config.py (1 hunks)
  • sebs/azure/cosmosdb.py (2 hunks)
  • sebs/azure/system_resources.py (2 hunks)
  • sebs/benchmark.py (6 hunks)
  • sebs/experiments/perf_cost.py (1 hunks)
  • sebs/faas/container.py (2 hunks)
  • sebs/faas/nosql.py (2 hunks)
  • sebs/faas/storage.py (1 hunks)
  • sebs/faas/system.py (0 hunks)
  • sebs/gcp/cli.py (1 hunks)
  • sebs/gcp/datastore.py (2 hunks)
  • sebs/gcp/gcp.py (3 hunks)
  • sebs/gcp/resources.py (1 hunks)
  • sebs/gcp/storage.py (1 hunks)
  • sebs/local/config.py (2 hunks)
  • sebs/local/deployment.py (1 hunks)
  • sebs/openwhisk/config.py (1 hunks)
  • sebs/openwhisk/container.py (1 hunks)
  • sebs/openwhisk/function.py (2 hunks)
  • sebs/openwhisk/openwhisk.py (4 hunks)
  • sebs/sebs.py (1 hunks)
  • sebs/storage/scylladb.py (4 hunks)

Walkthrough

Adds a new NVENC/FFmpeg GPU benchmarking toolchain plus many new GPU/JAX/ONNX/Torch benchmark suites, CI/tooling/editor config updates, a submodule URL update, and widespread stylistic/input-generator edits across benchmarks.

Changes

Cohort / File(s) and Summary:

  • Watermarking / NVENC bench (watermarking_bench/.gitkeep, watermarking_bench/gpu_bench.py, watermarking_bench/run_nvenc_bench.sh, watermarking_bench/read.me, watermarking_bench/watermarking_readme.md): New NVENC benchmarking tooling: FFmpeg probing, GPU-aware filter-graph planner, transcode orchestration, progress parsing/metrics, Bash launcher, sample handling, placeholder and docs.
  • GPU video-processing benchmark (benchmarks/200.multimedia/220.video-processing-gpu/config.json, .../init.sh, .../input.py, .../python/function.py, .../python/requirements.txt): New benchmark: config/init, input uploader, FaaS handler that runs download → GPU conv+ReLU compute → re-encode/upload; adds torch/opencv/numpy requirements.
  • Inference & vision suites (benchmarks/400.inference/412.*, benchmarks/400.inference/413.*, benchmarks/400.inference/413.image-classification/python/imagenet_class_index.json, benchmarks/400.inference/413.*/requirements*.txt, .../init.sh, .../package.sh): New ONNX/Torch inference benchmarks (Tiny BERT ONNX, ResNet): lazy model download/init, CUDA checks, preprocessing/inference handlers, model artifacts, packaging/init scripts, pinned requirements.
  • JAX / deep-learning / linear-algebra suites (benchmarks/500.*, benchmarks/600.*, benchmarks/000.microbenchmarks/*): Many new/expanded benchmarks: JAX channel-flow/compute/ResNet, CUDA linear-algebra (matmul/axpy/jacobi/cholesky/lu/spmv/fw), Triton/Torch microbenchmarks; includes inputs, handlers, init/package scripts, and requirements.
  • CI, linting & editor tooling (.github/workflows/lint.yml, .pre-commit-config.yaml, .mypy.ini, .vscode/settings.json, .gitmodules, (removed) .circleci/config.yml): Replaced CircleCI with a GitHub Actions lint workflow; added pre-commit config; expanded mypy ignore for docker submodules; VSCode settings; updated benchmarks-data submodule URL.
  • Benchmarks inputs & formatting (numerous benchmarks/*/input.py, many requirements.txt files): Widespread stylistic/formatting edits (quote normalization, signature reflow), added many input generators and size_generators constants, and many per-benchmark requirements files.
  • Misc data & snapshots (experiments.json, benchmarks-data submodule update): Added an experiments snapshot JSON and bumped the benchmarks-data submodule commit.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Shell as run_nvenc_bench.sh
    participant Bench as gpu_bench.py
    participant FFmpeg
    participant GPU as NVIDIA
    participant FS as FileSystem

    User->>Shell: start benchmark
    Shell->>GPU: query hardware (nvidia-smi)
    Shell->>Bench: invoke gpu_bench.py (probe ffmpeg)
    Shell->>FS: ensure sample video exists (create if missing)
    Shell->>Bench: pass trial list

    loop for each trial
      Bench->>FFmpeg: probe capabilities (encoders/filters)
      Bench->>Bench: plan filter graph (GPU vs CPU path)
      Bench->>FFmpeg: spawn transcode command
      FFmpeg->>GPU: use hw accel (if planned)
      FFmpeg-->>Bench: stream logs/progress
      Bench->>Bench: parse_progress, collect metrics
    end

    Bench-->>User: write JSON/CSV summary

sequenceDiagram
    participant FaaS as Handler
    participant Storage
    participant CPU
    participant GPU
    participant Encoder as FFmpeg

    FaaS->>Storage: download input video
    FaaS->>CPU: decode frames (or request hw decode)
    CPU->>GPU: upload frames / convert to tensors
    GPU-->>GPU: run conv2d + ReLU compute
    GPU->>CPU: download processed frames
    FaaS->>Encoder: invoke ffmpeg to re-encode/overlay/scale
    Encoder->>Storage: upload output
    FaaS-->>FaaS: return measurements (download/compute/upload)

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

  • Files/areas that need extra attention:
    • watermarking_bench/gpu_bench.py — FFmpeg command construction, filter-graph selection (GPU vs CPU fallbacks), capability detection, progress parsing, error handling, and metric parsing.
    • watermarking_bench/run_nvenc_bench.sh — environment checks, sample generation, subprocess robustness and portability.
    • GPU/CUDA/JAX/ONNX handlers (benchmarks/400.*, benchmarks/500.*, benchmarks/600.*) — lazy model/session init, device/provider checks, memory usage, synchronization and fallback/error paths.
    • Packaging/init scripts and pinned requirements — compatibility and size/packaging effects.
    • CI/tooling changes — ensure GitHub Actions reproduce previous lint behavior and caching.

Possibly related PRs

Suggested reviewers

  • mcopik

Poem

🐇 I hopped into a bench at night,

Fans hummed, GPUs woke to light.
Frames danced through filters, stitched with care,
Metrics tallied bytes and air.
A joyful rabbit cheers: measure and share!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 4.80%, which is insufficient; the required threshold is 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
  • Title check (❓ Inconclusive): The title 'Jessie/video watermarking' is overly vague and does not clearly describe the main changes in the changeset, which involves adding multiple new benchmarks and tools beyond just watermarking. Resolution: use a more descriptive title that captures the primary scope, such as 'Add GPU video processing and watermarking benchmarks' or 'Add NVENC benchmarking and GPU video processing workloads'.

✅ Passed checks (1 passed)

  • Description Check (✅ Passed): Check skipped - CodeRabbit's high-level summary is enabled.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

@JessieeeNotLi (Author)

added video watermarking benchmark

@coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
watermarking_bench/.gitkeep (1)

1-1: .gitkeep may be redundant if the directory contains other files.

Since the PR already adds multiple files to watermarking_bench/ (gpu_bench.py, documentation, etc.), the .gitkeep placeholder is unnecessary—the directory will be preserved by these files alone. You can safely remove this.

watermarking_bench/gpu_bench.py (2)

32-40: Consider splitting multiple statements per line.

Lines 37-39 use semicolons to place multiple statements on single lines, which slightly reduces readability.

Apply this diff for improved clarity:

-        m = re.search(r"fps=\s*([0-9]+(?:\.[0-9]+)?)", last);  fps = float(m.group(1)) if m else None
-        m = re.search(r"speed=\s*([0-9]+(?:\.[0-9]+)?)x", last); speed = float(m.group(1)) if m else None
-        m = re.search(r"frame=\s*([0-9]+)", last);              frames = int(m.group(1)) if m else None
+        m = re.search(r"fps=\s*([0-9]+(?:\.[0-9]+)?)", last)
+        fps = float(m.group(1)) if m else None
+        m = re.search(r"speed=\s*([0-9]+(?:\.[0-9]+)?)x", last)
+        speed = float(m.group(1)) if m else None
+        m = re.search(r"frame=\s*([0-9]+)", last)
+        frames = int(m.group(1)) if m else None

245-293: Consider splitting the cleanup statements for clarity.

Lines 275-276 compress the temporary file cleanup into a single line. While functional, separating these statements would improve readability.

Apply this diff:

             idx += 1
-            try: os.remove(outp)
-            except OSError: pass
+            try:
+                os.remove(outp)
+            except OSError:
+                pass
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b54b46c and aa3483f.

⛔ Files ignored due to path filters (4)
  • watermarking_bench/results.csv is excluded by !**/*.csv
  • watermarking_bench/results_long.csv is excluded by !**/*.csv
  • watermarking_bench/sample.mp4 is excluded by !**/*.mp4
  • watermarking_bench/watermark.png is excluded by !**/*.png
📒 Files selected for processing (5)
  • watermarking_bench/.gitkeep (1 hunks)
  • watermarking_bench/gpu_bench.py (1 hunks)
  • watermarking_bench/read.me (1 hunks)
  • watermarking_bench/run_nvenc_bench.sh (1 hunks)
  • watermarking_bench/watermarking_readme.md (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
watermarking_bench/gpu_bench.py (1)
benchmarks/200.multimedia/220.video-processing/python/function.py (1)
  • watermark (38-47)
🪛 Ruff (0.14.3)
watermarking_bench/gpu_bench.py

1-1: Shebang is present but file is not executable

(EXE001)


14-14: subprocess call: check for execution of untrusted input

(S603)


29-29: Do not catch blind exception: Exception

(BLE001)


37-37: Multiple statements on one line (semicolon)

(E702)


38-38: Multiple statements on one line (semicolon)

(E702)


39-39: Multiple statements on one line (semicolon)

(E702)


149-149: Avoid specifying long messages outside the exception class

(TRY003)


275-275: Multiple statements on one line (colon)

(E701)


276-276: Multiple statements on one line (colon)

(E701)

🔇 Additional comments (10)
watermarking_bench/watermarking_readme.md (1)

1-9: LGTM! Clear and helpful documentation.

The instructions are concise and cover the common use cases effectively.

watermarking_bench/read.me (1)

1-3: LGTM! Concise usage instructions.

The commands are clear and cover the essential use cases.

watermarking_bench/gpu_bench.py (5)

7-22: LGTM! Well-designed helper functions.

The utility functions are clean and appropriate for the task. The subprocess call is safe since command arguments are controlled by the script logic, not untrusted user input.


24-30: LGTM! Appropriate error handling for optional GPU info.

The broad exception catch is acceptable here since nvidia-smi availability is optional, and the function gracefully degrades by returning None values.


44-130: LGTM! Comprehensive filter graph construction.

The function effectively handles multiple scenarios with proper GPU/CPU bridging. The explicit use of hwdownload and hwupload_cuda for format conversion is good practice and aligns with FFmpeg best practices for CUDA filters.
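
For illustration, here is a minimal sketch (values assumed, not taken from the PR) of the CPU-bridge fallback that build_vf_or_complex emits when no CUDA scaler is available: frames are downloaded from GPU memory, scaled on the CPU, then re-uploaded so NVENC can still encode on the GPU.

```python
# Hypothetical example value; the real arguments come from the benchmark's CLI flags.
scale = "1920:1080"

# CPU-bridge path: hwdownload pulls CUDA frames to system memory, the CPU scaler
# runs on NV12 frames, and hwupload_cuda pushes the result back for NVENC.
vf_args = ["-vf", f"hwdownload,format=nv12,scale={scale},hwupload_cuda"]

# These arguments are appended to the ffmpeg command line built by transcode_once().
print(" ".join(vf_args))
```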


134-198: LGTM! Well-structured transcoding function.

The function properly orchestrates the FFmpeg command construction, execution, and metric collection. The detailed error message on line 149 (flagged by static analysis) is actually helpful for debugging, providing both the failure output and the command arguments.


200-244: LGTM! Comprehensive benchmark orchestration.

The main function effectively handles argument parsing, warmup execution, and trial iteration. The default trials include av1_nvenc, which will fail on systems without AV1 support, but this is caught by the encoder availability check in transcode_once(), so the error handling is appropriate.

watermarking_bench/run_nvenc_bench.sh (3)

1-26: LGTM! Robust script setup and validation.

The script properly validates prerequisites and provides helpful diagnostic output. The use of set -euo pipefail ensures failures are caught early.


269-278: LGTM! Helpful sample generation for testing.

The automatic generation of a 4K test video with audio is convenient for users who want to run the benchmark without providing their own input. The encoding parameters are reasonable for a test sample.
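
As a rough illustration of what such sample generation can look like, this is a hedged sketch using FFmpeg's synthetic testsrc2/sine sources; the duration, rate, and codec choices are assumptions, not necessarily the exact parameters used by run_nvenc_bench.sh.

```python
import subprocess

# Generate a short 4K test clip with a sine-tone audio track (illustrative parameters).
cmd = [
    "ffmpeg", "-y",
    "-f", "lavfi", "-i", "testsrc2=size=3840x2160:rate=30:duration=10",
    "-f", "lavfi", "-i", "sine=frequency=440:duration=10",
    "-c:v", "libx264", "-preset", "fast",
    "-c:a", "aac",
    "sample.mp4",
]
subprocess.run(cmd, check=True)
```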


280-300: LGTM! Clear benchmark execution and results preview.

The benchmark execution is well-structured with timestamped CSV output and a helpful preview of results. The trial selection excludes av1_nvenc, which improves compatibility with systems that lack AV1 encoder support.

@@ -0,0 +1,293 @@
#!/usr/bin/env python3

⚠️ Potential issue | 🟡 Minor

Set executable permission on the file.

The shebang indicates this script should be directly executable, but the file permissions aren't set. Ensure chmod +x is applied to match the usage shown in the documentation.

Apply this fix:

chmod +x watermarking_bench/gpu_bench.py
🧰 Tools
🪛 Ruff (0.14.3)

1-1: Shebang is present but file is not executable

(EXE001)

🤖 Prompt for AI Agents
In watermarking_bench/gpu_bench.py at line 1 the file contains a shebang but
lacks executable permission; make the script executable by setting the execute
bit (e.g., run chmod +x watermarking_bench/gpu_bench.py) so it can be run
directly as indicated in the docs and CI.

Comment on lines +28 to +264
# --- Make or update a working gpu_bench.py (GPU-first, safe bridges, skip missing encoders) ---
cat > gpu_bench.py <<'PY'
#!/usr/bin/env python3
import argparse, datetime, json, os, re, shutil, subprocess, sys, tempfile, csv
from typing import List, Dict, Any, Optional, Tuple
def which_ffmpeg() -> str:
    p = shutil.which("ffmpeg")
    if not p:
        sys.exit("ffmpeg not found on PATH. Use Docker image with NVENC or install FFmpeg with NVENC.")
    return p
def run(cmd: List[str]) -> subprocess.CompletedProcess:
    return subprocess.run(cmd, stdin=subprocess.DEVNULL, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
def has_encoder(ffmpeg: str, enc: str) -> bool:
    out = run([ffmpeg, "-hide_banner", "-encoders"]).stdout
    return re.search(rf"\b{re.escape(enc)}\b", out) is not None
def has_filter(ffmpeg: str, name: str) -> bool:
    out = run([ffmpeg, "-hide_banner", "-filters"]).stdout
    return (f" {name} " in out)
def gpu_info() -> Dict[str, Any]:
    try:
        out = run(["nvidia-smi", "--query-gpu=name,memory.total,driver_version", "--format=csv,noheader,nounits"]).stdout.strip()
        name, mem, drv = [x.strip() for x in out.splitlines()[0].split(",")]
        return {"name": name, "memory_total_mb": int(mem), "driver_version": drv}
    except Exception:
        return {"name": None, "memory_total_mb": None, "driver_version": None}
def parse_progress(log: str) -> Dict[str, Any]:
    lines = [ln for ln in log.splitlines() if ("fps=" in ln or "speed=" in ln or "frame=" in ln)]
    fps = speed = frames = None
    if lines:
        last = lines[-1]
        m = re.search(r"fps=\s*([0-9]+(?:\.[0-9]+)?)", last); fps = float(m.group(1)) if m else None
        m = re.search(r"speed=\s*([0-9]+(?:\.[0-9]+)?)x", last); speed = float(m.group(1)) if m else None
        m = re.search(r"frame=\s*([0-9]+)", last); frames = int(m.group(1)) if m else None
    return {"fps": fps, "speed_x": speed, "frames": frames}
def build_vf_or_complex(ffmpeg: str, scale: Optional[str], wm_path: Optional[str], overlay: str, want_gpu_decode: bool) -> Tuple[List[str], str]:
    used = []
    vf_args: List[str] = []
    complex_graph = ""
    have_scale_npp = has_filter(ffmpeg, "scale_npp")
    have_scale_cuda = has_filter(ffmpeg, "scale_cuda")
    have_overlay_cuda = has_filter(ffmpeg, "overlay_cuda")
    if not wm_path:
        if scale:
            if want_gpu_decode and have_scale_npp:
                vf_args = ["-vf", f"scale_npp={scale}"]; used.append("scale_npp")
            elif want_gpu_decode and have_scale_cuda:
                vf_args = ["-vf", f"scale_cuda={scale}"]; used.append("scale_cuda")
            else:
                vf_args = ["-vf", f"hwdownload,format=nv12,scale={scale},hwupload_cuda"]
                used.append("scale(cpu)+hwdownload+hwupload_cuda")
        return (vf_args, "+".join(used))
    # watermark path
    if want_gpu_decode and have_overlay_cuda:
        if scale and have_scale_npp:
            complex_graph = f"[0:v]scale_npp={scale}[v0];[v0][1:v]overlay_cuda={overlay}[vout]"
            used += ["scale_npp","overlay_cuda"]
        elif scale and have_scale_cuda:
            complex_graph = f"[0:v]scale_cuda={scale}[v0];[v0][1:v]overlay_cuda={overlay}[vout]"
            used += ["scale_cuda","overlay_cuda"]
        elif scale:
            complex_graph = f"[0:v]hwdownload,format=nv12,scale={scale},hwupload_cuda[v0];[v0][1:v]overlay_cuda={overlay}[vout]"
            used += ["scale(cpu)+hwdownload+hwupload_cuda","overlay_cuda"]
        else:
            complex_graph = f"[0:v][1:v]overlay_cuda={overlay}[vout]"
            used += ["overlay_cuda"]
        return (["-filter_complex", complex_graph, "-map", "[vout]"], "+".join(used))
    # CPU overlay fallback (explicit bridges)
    if scale and want_gpu_decode and (have_scale_npp or have_scale_cuda):
        scaler = "scale_npp" if have_scale_npp else "scale_cuda"
        complex_graph = (
            f"[0:v]{scaler}={scale}[v0gpu];"
            f"[v0gpu]hwdownload,format=nv12[v0cpu];"
            f"[v0cpu][1:v]overlay={overlay}[mix];"
            f"[mix]hwupload_cuda[vout]"
        )
        used += [scaler, "hwdownload+overlay(cpu)+hwupload_cuda"]
    elif scale:
        complex_graph = (
            f"[0:v]hwdownload,format=nv12,scale={scale}[v0cpu];"
            f"[v0cpu][1:v]overlay={overlay}[mix];"
            f"[mix]hwupload_cuda[vout]"
        )
        used += ["scale(cpu)+overlay(cpu)+hwupload_cuda"]
    else:
        complex_graph = (
            f"[0:v]hwdownload,format=nv12[v0cpu];"
            f"[v0cpu][1:v]overlay={overlay}[mix];"
            f"[mix]hwupload_cuda[vout]"
        )
        used += ["overlay(cpu)+hwupload_cuda"]
    return (["-filter_complex", complex_graph, "-map", "[vout]"], "+".join(used))
def transcode_once(ffmpeg: str, inp: str, outp: str, codec: str, bitrate: str, preset: str,
                   duration: Optional[float], scale: Optional[str], wm_path: Optional[str],
                   overlay_pos: str, decode_mode: str = "gpu") -> Dict[str, Any]:
    if not has_encoder(ffmpeg, codec):
        raise RuntimeError(f"encoder '{codec}' not available; check your ffmpeg build (NVENC/AV1).")
    want_gpu_decode = (decode_mode == "gpu")
    args = [ffmpeg, "-hide_banner", "-y", "-vsync", "0"]
    if want_gpu_decode:
        args += ["-hwaccel", "cuda", "-hwaccel_output_format", "cuda", "-extra_hw_frames", "16",
                 "-init_hw_device", "cuda=cuda", "-filter_hw_device", "cuda"]
    args += ["-i", inp]
    if wm_path:
        args += ["-loop", "1", "-i", wm_path]
    if duration:
        args += ["-t", str(duration)]
    filt_args, filter_used = build_vf_or_complex(ffmpeg, scale, wm_path, overlay_pos, want_gpu_decode)
    args += filt_args
    args += ["-c:v", codec, "-b:v", bitrate, "-preset", preset, "-rc", "vbr", "-movflags", "+faststart"]
    args += ["-c:a", "copy"]
    args += [outp]
    t0 = datetime.datetime.now()
    proc = run(args)
    t1 = datetime.datetime.now()
    if proc.returncode != 0:
        raise RuntimeError("ffmpeg failed:\n" + proc.stdout + f"\n\nARGS:\n{' '.join(args)}")
    parsed = parse_progress(proc.stdout)
    size = os.path.getsize(outp) if os.path.exists(outp) else 0
    return {
        "args": args,
        "filter_used": filter_used,
        "stdout_tail": "\n".join(proc.stdout.splitlines()[-15:]),
        "compute_time_us": (t1 - t0) / datetime.timedelta(microseconds=1),
        "fps": parsed["fps"],
        "speed_x": parsed["speed_x"],
        "frames": parsed["frames"],
        "output_size_bytes": size
    }
def main():
    ap = argparse.ArgumentParser(description="GPU NVENC benchmark.")
    ap.add_argument("--input", required=True)
    ap.add_argument("--duration", type=float, default=None)
    ap.add_argument("--repeat", type=int, default=1)
    ap.add_argument("--warmup", action="store_true")
    ap.add_argument("--csv", default=None)
    ap.add_argument("--watermark", default=None)
    ap.add_argument("--overlay", default="main_w/2-overlay_w/2:main_h/2-overlay_h/2")
    ap.add_argument("--decode", choices=["gpu","cpu"], default="gpu")
    ap.add_argument("--trials", nargs="+", default=[
        "codec=h264_nvenc,bitrate=5M,preset=p5",
        "codec=h264_nvenc,bitrate=12M,preset=p1,scale=1920:1080",
        "codec=hevc_nvenc,bitrate=6M,preset=p4",
        # "codec=av1_nvenc,bitrate=3M,preset=p5",  # only if available
    ])
    args = ap.parse_args()
    ffmpeg = which_ffmpeg()
    gi = gpu_info()
    def parse_trial(s: str) -> Dict[str, str]:
        d: Dict[str, str] = {}
        for kv in s.split(","):
            k, v = kv.split("=", 1); d[k.strip()] = v.strip()
        return d
    trial_specs = [parse_trial(s) for s in args.trials]
    # Warmup with first available encoder
    if args.warmup:
        warm = next((t for t in trial_specs if has_encoder(ffmpeg, t.get("codec","h264_nvenc"))), None)
        if warm:
            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=True) as tmp:
                _ = transcode_once(ffmpeg, args.input, tmp.name,
                                   warm.get("codec","h264_nvenc"),
                                   warm.get("bitrate","5M"),
                                   warm.get("preset","p5"),
                                   args.duration,
                                   warm.get("scale"),
                                   args.watermark,
                                   args.overlay,
                                   args.decode)
    results = []; idx = 0
    for spec in trial_specs:
        for _ in range(args.repeat):
            if not has_encoder(ffmpeg, spec.get("codec","h264_nvenc")):
                results.append({
                    "trial_index": idx, "codec": spec.get("codec"), "bitrate": spec.get("bitrate"),
                    "preset": spec.get("preset"), "scale_filter": "", "fps": None, "speed_x": None,
                    "frames": None, "compute_time_us": 0, "output_size_bytes": 0,
                    "stdout_tail": "SKIPPED: encoder not available", "argv": "", "status": "skipped"
                }); idx += 1; continue
            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as tmp:
                outp = tmp.name
            res = transcode_once(ffmpeg, args.input, outp,
                                 spec.get("codec","h264_nvenc"),
                                 spec.get("bitrate","5M"),
                                 spec.get("preset","p5"),
                                 args.duration,
                                 spec.get("scale"),
                                 args.watermark,
                                 args.overlay,
                                 args.decode)
            results.append({
                "trial_index": idx, "codec": spec.get("codec"), "bitrate": spec.get("bitrate"),
                "preset": spec.get("preset"), "scale_filter": res["filter_used"], "fps": res["fps"],
                "speed_x": res["speed_x"], "frames": res["frames"],
                "compute_time_us": res["compute_time_us"], "output_size_bytes": res["output_size_bytes"],
                "stdout_tail": res["stdout_tail"], "argv": " ".join(res["args"]), "status": "ok"
            })
            idx += 1
            try: os.remove(outp)
            except OSError: pass
    report = {"gpu": gi, "ffmpeg_path": ffmpeg, "trial_count": len(results), "results": results}
    print(json.dumps(report, indent=2))
    if args.csv and results:
        with open(args.csv, "w", newline="") as f:
            w = csv.DictWriter(f, fieldnames=list(results[0].keys()))
            w.writeheader(); w.writerows(results)
if __name__ == "__main__":
    main()
PY

🛠️ Refactor suggestion | 🟠 Major

Eliminate code duplication between embedded and standalone scripts.

The Python benchmarking code is duplicated here (237 lines embedded) and in the standalone gpu_bench.py file. This creates maintenance burden and increases the risk of the two versions diverging—which has already begun (the embedded version has av1_nvenc commented out on line 192 and includes different skipping logic on lines 225-231).

Consider one of these approaches:

Option 1 (recommended): Ship gpu_bench.py as a standalone script and invoke it:

-cat > gpu_bench.py <<'PY'
-#!/usr/bin/env python3
-[... entire script ...]
-PY
-chmod +x gpu_bench.py
+# Ensure gpu_bench.py is in the same directory
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+GPU_BENCH="${SCRIPT_DIR}/gpu_bench.py"
+
+if [[ ! -f "$GPU_BENCH" ]]; then
+  echo "Error: gpu_bench.py not found in ${SCRIPT_DIR}"
+  exit 1
+fi
+chmod +x "$GPU_BENCH"

Then update the execution line:

-./gpu_bench.py \
+"$GPU_BENCH" \

Option 2: If embedding is required for distribution, generate it from the canonical source during build/release rather than maintaining two copies manually.

Committable suggestion skipped: line range outside the PR's diff.

@coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (5)
benchmarks/200.multimedia/220.video-processing/input.py (1)

19-43: LGTM! Well-implemented with good error handling.

The function correctly:

  • Tracks the last uploaded file with clear variable naming
  • Handles the case when no .mp4 files are found
  • Provides comprehensive documentation through comments
  • Configures the GPU filter operation with appropriate parameters

The unused size and nosql_func parameters are likely required by the benchmark framework interface. Based on learnings, input validation is not required since benchmarks are invoked by maintainers.

Note: This implementation is nearly identical to benchmarks/200.multimedia/220.video-processing-gpu/input.py (which has critical bugs). Consider whether both files are necessary or if they should share common logic to avoid drift.

benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (4)

3-12: Tighten imports and ensure external GPU deps are provisioned

  • subprocess is no longer used anywhere; keeping it is noise and may confuse future readers.
  • numpy, cv2, and torch are hard runtime dependencies of this function. Make sure they are either in this benchmark’s requirements / Docker image or clearly documented, otherwise the handler will fail at import time in some environments.

Consider:

-import subprocess  # not strictly needed anymore, but you can remove it
-import stat        # can also be removed if you drop ffmpeg entirely
+import stat        # can also be removed if you drop ffmpeg entirely

…and verify that the environment used by this benchmark explicitly includes numpy, opencv-python, and torch (or their equivalent packages).


42-49: Rename unused loop index for clarity / lint compliance

i is not used in the loop body, which Ruff flags (B007) and mildly hurts readability.

-    for i in range(max_frames):
+    for _ in range(max_frames):

This communicates intent and silences the warning without behavior change.


52-57: Fix ambiguous dash in comment and unused shape components

  • The 0–255 comment uses an en dash instead of a hyphen (RUF003).
  • T and C are never used (RUF059); better to avoid binding them.
-    # Stack into (T, H, W, C)
-    video_np = np.stack(frames, axis=0)  # uint8, 0–255
-    T, H, W, C = video_np.shape
+    # Stack into (T, H, W, C)
+    video_np = np.stack(frames, axis=0)  # uint8, 0-255
+    _, H, W, _ = video_np.shape

This keeps the doc value of the comment while aligning with the linter and avoiding unused variables.


154-160: Consider dropping the ffmpeg chmod block if ffmpeg is no longer used

The handler no longer shells out to an embedded ffmpeg binary; this block now just does a stat/chmod in a try/except, usually on a non-existent file.

If you’re confident no path still relies on that embedded binary, simplifying by removing this block will cut a tiny bit of cold-start and per-invocation overhead and reduce cognitive load:

-    ffmpeg_binary = os.path.join(SCRIPT_DIR, "ffmpeg", "ffmpeg")
-    try:
-        st = os.stat(ffmpeg_binary)
-        os.chmod(ffmpeg_binary, st.st_mode | stat.S_IEXEC)
-    except OSError:
-        # Ignore if ffmpeg is not present or filesystem is read-only.
-        pass

If you do still need the binary for other operations, keeping this as-is is fine.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e53cfde and 703a05d.

⛔ Files ignored due to path filters (1)
  • benchmarks/200.multimedia/220.video-processing-gpu/resources/watermark.png is excluded by !**/*.png
📒 Files selected for processing (7)
  • benchmarks/200.multimedia/220.video-processing-gpu/config.json (1 hunks)
  • benchmarks/200.multimedia/220.video-processing-gpu/init.sh (1 hunks)
  • benchmarks/200.multimedia/220.video-processing-gpu/input.py (1 hunks)
  • benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (1 hunks)
  • benchmarks/200.multimedia/220.video-processing-gpu/python/requirements.txt (1 hunks)
  • benchmarks/200.multimedia/220.video-processing/input.py (1 hunks)
  • experiments.json (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • benchmarks/200.multimedia/220.video-processing-gpu/config.json
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-19T16:15:01.811Z
Learnt from: mcopik
Repo: spcl/serverless-benchmarks PR: 99
File: benchmarks/000.microbenchmarks/010.sleep/cpp/main.cpp:11-0
Timestamp: 2025-06-19T16:15:01.811Z
Learning: In the serverless-benchmarks project, input validation for benchmark functions is not required since the benchmarks are invoked directly by the maintainers who control the input values. This applies to functions like the sleep benchmark in benchmarks/000.microbenchmarks/010.sleep/cpp/main.cpp.

Applied to files:

  • benchmarks/200.multimedia/220.video-processing-gpu/init.sh
🧬 Code graph analysis (2)
benchmarks/200.multimedia/220.video-processing-gpu/input.py (1)
benchmarks/200.multimedia/220.video-processing/input.py (2)
  • buckets_count (4-6)
  • generate_input (19-43)
benchmarks/200.multimedia/220.video-processing/input.py (1)
benchmarks/200.multimedia/210.thumbnailer/input.py (1)
  • generate_input (15-29)
🪛 Gitleaks (8.29.0)
experiments.json

[high] 266-266: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)


[high] 273-273: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🪛 Ruff (0.14.5)
benchmarks/200.multimedia/220.video-processing-gpu/input.py

15-15: Unused function argument: size

(ARG001)


15-15: Unused function argument: nosql_func

(ARG001)

benchmarks/200.multimedia/220.video-processing/input.py

19-19: Unused function argument: size

(ARG001)


19-19: Unused function argument: nosql_func

(ARG001)


29-29: Avoid specifying long messages outside the exception class

(TRY003)

benchmarks/200.multimedia/220.video-processing-gpu/python/function.py

33-33: Avoid specifying long messages outside the exception class

(TRY003)


42-42: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


53-53: Avoid specifying long messages outside the exception class

(TRY003)


56-56: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?

(RUF003)


57-57: Unpacked variable T is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


57-57: Unpacked variable C is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


88-88: Probable insecure usage of temporary file or directory: "/tmp/processed-"

(S108)


93-93: Avoid specifying long messages outside the exception class

(TRY003)


146-148: Avoid specifying long messages outside the exception class

(TRY003)


150-150: Probable insecure usage of temporary file or directory: "/tmp/"

(S108)

🪛 Shellcheck (0.11.0)
benchmarks/200.multimedia/220.video-processing-gpu/init.sh

[warning] 4-4: DIR appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 5-5: VERBOSE appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 6-6: TARGET_ARCHITECTURE appears unused. Verify use (or export if used externally).

(SC2034)

🔇 Additional comments (4)
benchmarks/200.multimedia/220.video-processing-gpu/init.sh (1)

1-10: LGTM!

The script correctly implements a no-op initialization that conforms to the SeBS framework interface. The unused variables are intentional as documented in the comment.

benchmarks/200.multimedia/220.video-processing-gpu/input.py (1)

3-4: LGTM!

The buckets_count() function correctly returns one input and one output bucket.

benchmarks/200.multimedia/220.video-processing/input.py (2)

1-6: LGTM!

The imports and buckets_count() function are correctly implemented with clear documentation.


9-18: LGTM!

The docstring accurately describes the function and its parameters.

Comment on lines 15 to 28
def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
    for file in glob.glob(os.path.join(data_dir, '*.mp4')):
        img = os.path.relpath(file, data_dir)
        upload_func(0, img, file)
    #TODO: multiple datasets
    input_config = {'object': {}, 'bucket': {}}
    input_config['object']['key'] = img
    # The function supports only "gpu-filter" at the moment.
    input_config['object']['op'] = 'gpu-filter'
    input_config['object']['duration'] = 1
    input_config['bucket']['bucket'] = benchmarks_bucket
    input_config['bucket']['input'] = input_paths[0]
    input_config['bucket']['output'] = output_paths[0]
    return input_config

⚠️ Potential issue | 🔴 Critical

Critical: Add error handling for missing files and improve variable naming.

This function has several issues:

  1. NameError if no .mp4 files exist: If the glob finds no files, the loop never executes and img is never defined, causing a NameError on line 21.
  2. Misleading variable name: img suggests an image file, but this processes video files.
  3. Missing parameter: The num_iters field present in the similar function at benchmarks/200.multimedia/220.video-processing/input.py (line 37) is missing here.

Apply this diff to fix these issues (matching the implementation in 220.video-processing/input.py):

 def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
+    last_key = None
+
+    # Upload all .mp4 files from data_dir to bucket 0
     for file in glob.glob(os.path.join(data_dir, '*.mp4')):
-        img = os.path.relpath(file, data_dir)
-        upload_func(0, img, file)
+        key = os.path.relpath(file, data_dir)
+        upload_func(0, key, file)
+        last_key = key
+
+    if last_key is None:
+        raise RuntimeError(f"No .mp4 files found in {data_dir}")
+
     #TODO: multiple datasets
     input_config = {'object': {}, 'bucket': {}}
-    input_config['object']['key'] = img
+    # Use the last uploaded file as the input object
+    input_config['object']['key'] = last_key
     # The function supports only "gpu-filter" at the moment.
-    input_config['object']['op'] = 'gpu-filter'
-    input_config['object']['duration'] = 1
+    input_config['object']['op'] = 'gpu-filter'     # must match your handler's operations dict
+    input_config['object']['duration'] = 1          # seconds of video to process
+    input_config['object']['num_iters'] = 10        # extra param for GPU workload intensity
+
     input_config['bucket']['bucket'] = benchmarks_bucket
🧰 Tools
🪛 Ruff (0.14.5)

15-15: Unused function argument: size

(ARG001)


15-15: Unused function argument: nosql_func

(ARG001)

🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/input.py around lines 15
to 28, handle the case where no .mp4 files are found (avoid NameError) by
checking glob results and raising/logging an error or returning None, rename the
variable from img to a clearer video_relpath or video_key, ensure you call
upload_func for the selected video(s) with the correct args, and add the missing
'num_iters' field (set to 1 like the non-GPU implementation) into input_config;
mirror the control flow and field names used in
benchmarks/200.multimedia/220.video-processing/input.py so input_config always
has object.key, object.op ('gpu-filter'), object.duration, object.num_iters, and
bucket.* set, and fail fast if no video files are available.

Comment on lines +35 to +40
    fps = cap.get(cv2.CAP_PROP_FPS)
    if not fps or fps <= 0:
        fps = 25.0 # fallback

    max_frames = int(fps * duration)
    frames = []

⚠️ Potential issue | 🟡 Minor

Guard against NaN/invalid FPS values before computing max_frames

If cap.get(cv2.CAP_PROP_FPS) returns NaN (which some containers/drivers do), the condition if not fps or fps <= 0 will not trigger, and int(fps * duration) will raise at runtime.

You can defensively handle this:

+import math
 ...
-    fps = cap.get(cv2.CAP_PROP_FPS)
-    if not fps or fps <= 0:
+    fps = cap.get(cv2.CAP_PROP_FPS)
+    if not fps or fps <= 0 or math.isnan(fps):
         fps = 25.0  # fallback

This keeps the simple fallback while avoiding a crash on malformed metadata.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    fps = cap.get(cv2.CAP_PROP_FPS)
-    if not fps or fps <= 0:
-        fps = 25.0 # fallback
+    fps = cap.get(cv2.CAP_PROP_FPS)
+    if not fps or fps <= 0 or math.isnan(fps):
+        fps = 25.0 # fallback
     max_frames = int(fps * duration)
     frames = []
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/function.py around
lines 35 to 40, fps obtained from cap.get(cv2.CAP_PROP_FPS) can be NaN or
non-finite so the current check (if not fps or fps <= 0) can miss it and cause
int(fps * duration) to crash; defend by validating fps with math.isfinite (or
math.isnan) and the existing <= 0 check, e.g. import math and set fps = fallback
when not math.isfinite(fps) or fps <= 0, then compute max_frames from the
validated fps.

Comment on lines +86 to +90
# Encode processed video with OpenCV (CPU)
base = os.path.splitext(os.path.basename(video_path))[0]
out_path = f"/tmp/processed-{base}.mp4"

fourcc = cv2.VideoWriter_fourcc(*"mp4v")

⚠️ Potential issue | 🟠 Major

Harden /tmp usage and sanitize user-controlled key

Two related issues here:

  1. download_path = f"/tmp/{key}":
    • If key contains path separators or .., this can escape /tmp or attempt to create nested directories that don’t exist, leading to failures or even overwriting unintended files.
  2. Static analysis (S108) is also concerned about the predictable paths /tmp/processed-{base}.mp4 and /tmp/{key}.

A minimal hardening that preserves current naming:

-    base = os.path.splitext(os.path.basename(video_path))[0]
-    out_path = f"/tmp/processed-{base}.mp4"
+    base = os.path.splitext(os.path.basename(video_path))[0]
+    out_path = os.path.join("/tmp", f"processed-{base}.mp4")
 ...
-    download_path = f"/tmp/{key}"
+    # Restrict local path to a basename to avoid directory traversal and
+    # missing parent directories under /tmp.
+    local_name = os.path.basename(key)
+    download_path = os.path.join("/tmp", local_name)

If you want to fully satisfy stricter security tooling, consider using tempfile.mkstemp / NamedTemporaryFile for one or both of these paths, but the basename sanitization above already removes the traversal risk while staying simple.

Also applies to: 150-177
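
A minimal sketch of the tempfile-based alternative mentioned above (the helper name and variables are illustrative, not the handler's actual code):

```python
import os
import tempfile

def safe_paths(key: str, suffix: str = ".mp4"):
    # Keep only the basename so a key like "../../etc/passwd" cannot escape the temp dir.
    local_name = os.path.basename(key)
    download_path = os.path.join(tempfile.gettempdir(), local_name)

    # mkstemp creates the output file with a random, non-predictable name and returns an open fd.
    fd, out_path = tempfile.mkstemp(prefix="processed-", suffix=suffix)
    os.close(fd)  # the encoder will reopen the path itself
    return download_path, out_path
```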

🧰 Tools
🪛 Ruff (0.14.5)

88-88: Probable insecure usage of temporary file or directory: "/tmp/processed-"

(S108)

🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/function.py around
lines 86 to 90 (and similarly lines 150-177), the code constructs predictable
/tmp paths using user-controlled values; sanitize the inputs and avoid path
traversal by taking only the basename of user-controlled key (e.g., key_base =
os.path.basename(key)) and validating/stripping any remaining suspicious
segments like '..' or os.sep, then build paths with os.path.join("/tmp",
key_base); for the processed output prefer creating a temporary file via
tempfile.NamedTemporaryFile or tempfile.mkstemp (or at minimum sanitize base =
os.path.basename(os.path.splitext(os.path.basename(video_path))[0]) and join
with /tmp) to eliminate predictable filenames and ensure the directory is
correct and exists before writing.

Comment on lines +1 to +3
torch
opencv-python-headless
numpy

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Pin dependency versions for reproducibility and security.

The requirements file lacks version constraints, which poses reproducibility and security risks. Unpinned versions can produce non-deterministic benchmark results and may introduce breaking changes across CI/CD runs.

Additionally, torch is specified without explicit GPU support configuration—confirm whether CUDA/cuDNN dependencies are documented elsewhere in the benchmark setup, or consider pinning a CUDA-specific torch variant if GPU acceleration is required.

Consider applying version pins like this:

-torch
-opencv-python-headless
-numpy
+torch==2.0.1
+opencv-python-headless==4.8.1.78
+numpy==1.24.3

Once you've determined the appropriate versions, please verify:

  1. The versions are compatible with your target GPU/CUDA environment.
  2. Any external CUDA/cuDNN dependencies are documented in a setup guide for the benchmark.
  3. Run a test build to confirm all dependencies resolve without conflicts.





---

🏁 Script executed:

#!/bin/bash
# Search for all requirements files to understand version pinning patterns in codebase
echo "=== Finding all requirements.txt files ==="
find . -name "requirements*.txt" -type f | head -20

echo -e "\n=== Checking versions in other requirements files ==="
find . -name "requirements*.txt" -type f -exec sh -c 'echo "File: $1"; head -10 "$1"' _ {} \; | head -60

Length of output: 2594


🏁 Script executed:

#!/bin/bash
# Search for GPU, CUDA, cuDNN references in benchmark directory and related docs
echo "=== Searching for GPU/CUDA references in benchmark context ==="
rg -i "cuda|gpu|cudnn" benchmarks/200.multimedia/220.video-processing-gpu/ --type-list | head -5
rg -i "cuda|gpu|cudnn" benchmarks/200.multimedia/220.video-processing-gpu/ -A2 -B2 2>/dev/null | head -40

echo -e "\n=== Looking for setup documentation ==="
find benchmarks/200.multimedia/220.video-processing-gpu/ -type f \( -name "*.md" -o -name "*.txt" -o -name "setup*" -o -name "README*" \) 

Length of output: 4364


🏁 Script executed:

#!/bin/bash
# Check structure of the benchmark directory and look for related docs
echo "=== Contents of video-processing-gpu benchmark directory ==="
ls -la benchmarks/200.multimedia/220.video-processing-gpu/ 2>/dev/null || echo "Directory check"
find benchmarks/200.multimedia/220.video-processing-gpu/ -type f | head -20

echo -e "\n=== Parent directory context ==="
ls -la benchmarks/200.multimedia/

Length of output: 1214


🏁 Script executed:

#!/bin/bash
# Search for any documentation about dependencies or torch in the benchmark context
echo "=== Searching for torch, dependencies, or installation docs ==="
rg -i "torch|dependencies|install|setup" benchmarks/200.multimedia/ --type md --type txt -A2 | head -50

echo -e "\n=== Checking if there's a main setup guide ==="
find benchmarks/ -maxdepth 2 -name "README*" -o -name "setup*" -o -name "INSTALL*" | head -10
cat benchmarks/README.md 2>/dev/null | head -30

Length of output: 371


🌐 Web query:

torch 2.0.1 opencv-python-headless 4.8.1 numpy 1.24 compatibility

💡 Result:

Short answer: Yes — torch 2.0.1 works with NumPy 1.24, and opencv-python-headless 4.8.1 also supports NumPy 1.24. The main incompatibility to avoid is NumPy 2.x (PyTorch wheels built against NumPy 1.x can fail under NumPy 2.x). Test by importing torch and cv2 after installing.

Recommended pip pins:

  • pip install "numpy==1.24.*"
  • pip install "torch==2.0.1"
  • pip install "opencv-python-headless==4.8.1"

References:

  • PyTorch issues describing failures when NumPy 2.x is used with PyTorch built against NumPy 1.x. [1][2]
  • opencv-python-headless PyPI (wheels follow NumPy/Python compatibility). [3]

Sources:
[1] PyTorch issue: Numpy compatibility / DataLoader RuntimeError.
[2] PyTorch issue: modules compiled with NumPy 1.x incompatible with NumPy 2.x.
[3] opencv-python-headless PyPI project page.


🌐 Web query:

torch GPU setup requirements CUDA environment variables

💡 Result:

Summary — what you need for PyTorch GPU + relevant CUDA environment variables

  1. Requirements (high level)
  • Install a PyTorch binary built for the CUDA/ROCm version you intend to use, or build PyTorch from source against your local CUDA. The pip/conda wheel’s CUDA version must match the binary you select (or you must build). Verify with torch.cuda.is_available() after install. [1][3]

  • NVIDIA GPU + matching NVIDIA driver installed on host (driver must support the CUDA runtime version used by the PyTorch binary). The driver + GPU must be functional (check nvidia-smi). [3]

  • If building extensions or PyTorch from source you need a local CUDA toolkit (nvcc), cuDNN, NVML, etc., and to point environment variables to the toolkit and libs. [2][4]

  1. Important environment variables you’ll commonly set/check
  • CUDA_HOME (or CUDA_PATH on Windows): root of CUDA installation (e.g. /usr/local/cuda or C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1). Used by build scripts and some tools. [2][4]
  • CUDACXX: path to nvcc (e.g. ${CUDA_HOME}/bin/nvcc) when building from source. [4]
  • PATH: include ${CUDA_HOME}/bin so nvcc and tools are found. [2][4]
  • LD_LIBRARY_PATH (Linux) / PATH (Windows) / DYLD_LIBRARY_PATH (macOS older): include CUDA lib directories (e.g. ${CUDA_HOME}/lib64) so shared libs (libcudart, cuDNN) are found at runtime. [2][4]
  • CUDA_VISIBLE_DEVICES: control which GPUs are visible to CUDA (comma-separated indices, or -1 to hide all). Useful for multi-GPU hosts. [2]
  • CUDA_LAUNCH_BLOCKING=1: make CUDA calls synchronous (debugging). [2]
  • PYTORCH_CUDA_ALLOC_CONF: control PyTorch CUDA allocator behavior (memory fragmentation, max split size, etc.). [2]
  • PYTORCH_NO_CUDA_MEMORY_CACHING: disable PyTorch caching allocator (useful for debugging). [2]
  • TORCH_CUDNN_V8_API_DISABLED / TORCH_CUDNN_V8_API_LRU_CACHE_LIMIT etc.: cuDNN-specific controls. [2]
  • NVML-related (NVML_LIB_PATH) when building packages that query NVML. [4]
  1. Typical quick checks & steps (Linux)
  • Ensure driver/GPU visible: nvidia-smi
  • Ensure CUDA bin/libs on PATH/LD_LIBRARY_PATH:
    export CUDA_HOME=/usr/local/cuda
    export CUDACXX="${CUDA_HOME}/bin/nvcc"
    export PATH="${CUDA_HOME}/bin:${PATH}"
    export LD_LIBRARY_PATH="${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}"
    source ~/.bashrc
  • Install PyTorch GPU wheel matching desired CUDA version (or conda package with cudatoolkit). Then in Python:
    import torch; print(torch.cuda.is_available(), torch.version.cuda, torch.cuda.device_count())
  1. Notes / gotchas
  • Prebuilt PyTorch wheels include CUDA runtimes for the CUDA versions they were built for; setting CUDA_HOME / local CUDA does not make a CPU-only wheel suddenly use your system CUDA — you must install a GPU-enabled PyTorch build or build from source. [3]
  • Mismatch between PyTorch’s bundled CUDA version and your GPU driver can cause torch.cuda.is_available() to be False; updating the NVIDIA driver or installing a matching PyTorch binary usually resolves this. [3]

Sources

  • PyTorch CUDA environment variables and PyTorch-specific vars. [1]
  • PyTorch Get Started / installation and verification notes (matching binaries, building from source). [3]
  • Example CUDA env var usage and build-time vars (CUDA_HOME, CUDACXX, LD_LIBRARY_PATH, NVML paths). [4]

References
[1] PyTorch — CUDA Environment Variables.
[2] PyTorch/XLA GPU environment setup (examples of PATH/LD_LIBRARY_PATH).
[3] PyTorch Get Started / Installation & verification.

(If you want, tell me your OS, PyTorch version and how you installed it and I’ll give the exact install / env var commands and a short checklist to get torch.cuda.is_available() → True.)


Pin dependency versions to prevent NumPy 2.x incompatibility and ensure reproducibility.

Version pinning is missing, which creates reproducibility risks. Most critically, NumPy 2.x is incompatible with PyTorch wheels built against NumPy 1.x, leading to runtime failures if versions drift across CI/CD runs.

Regarding GPU support: the code correctly uses torch.cuda.is_available() to detect and enable GPU at runtime, which is the proper pattern. GPU availability depends on the environment (CUDA toolkit, driver, PyTorch build selection) rather than the requirements.txt file itself. The standard torch package handles both CPU and GPU cases gracefully.
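
For reference, the runtime device-selection pattern referred to above looks roughly like this (a generic sketch, not the benchmark's exact handler code):

```python
import torch

# Use the GPU when the installed wheel has CUDA support and a device is visible;
# otherwise fall back to CPU so the same code runs in both environments.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy conv + ReLU workload on the selected device (shapes are illustrative).
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1).to(device)
frames = torch.rand(1, 3, 224, 224, device=device)
with torch.no_grad():
    out = torch.relu(model(frames))
```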

Consider pinning versions with NumPy <2.0 constraint:

-torch
-opencv-python-headless
-numpy
+torch==2.0.1
+opencv-python-headless==4.8.1.78
+numpy<2.0

Verify compatibility with your target environment after applying pins.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-torch
-opencv-python-headless
-numpy
+torch==2.0.1
+opencv-python-headless==4.8.1.78
+numpy<2.0
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/requirements.txt
lines 1-3, dependency versions are not pinned which risks pulling NumPy 2.x
(incompatible with many PyTorch wheels) and breaks reproducibility; update the
file to pin specific, tested versions (at minimum constrain NumPy to <2.0), pin
opencv-python-headless to a known working release, and pin torch to the version
you intend to run (or document the exact wheel URL/extra-index for CUDA-enabled
builds); after changing, run CI or a local environment install to verify
compatibility with your target CUDA/toolchain and adjust pins accordingly.

"version": "6.0"
},
"storage": {
"access_key": "cStL2xQ1C1c2wpB_25M3QnnNO4aizEjhHEH203q_a7s",

⚠️ Potential issue | 🔴 Critical

Remove hardcoded credentials from version control.

Lines 266 and 273 contain hardcoded MinIO storage credentials (access_key and secret_key). Storing secrets in committed code is a critical security vulnerability. These must be removed, replaced with placeholder values, or sanitized before merging.

If this file contains actual test data with development credentials, replace the values with non-sensitive placeholders:

         "storage": {
-          "access_key": "cStL2xQ1C1c2wpB_25M3QnnNO4aizEjhHEH203q_a7s",
+          "access_key": "PLACEHOLDER_ACCESS_KEY",
           "address": "host.docker.internal:9011",
           "data_volume": "minio-volume",
           "input_buckets": [],
           "instance_id": "0571b3121a5e06773187dff7ffc30b4c2a9f63ae7c05ff592e3c45b9b3a6d1e9",
           "mapped_port": 9011,
           "output_buckets": [],
-          "secret_key": "22a1207b6c97d3669271e9896097e3d8fddd6160e3c75444fb0eabf8b9ef8681",
+          "secret_key": "PLACEHOLDER_SECRET_KEY",
           "type": "minio",
           "version": "RELEASE.2024-07-16T23-46-41Z"
         }

If credentials are needed for runtime, inject them via environment variables or a secure credential store instead.

Also applies to: 273-273

🧰 Tools
🪛 Gitleaks (8.29.0)

[high] 266-266: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🤖 Prompt for AI Agents
experiments.json lines 266 and 273: the file contains hardcoded MinIO
credentials (access_key and secret_key); remove these literal values and replace
them with non-sensitive placeholders or reference environment/config variables
(e.g., "${MINIO_ACCESS_KEY}" and "${MINIO_SECRET_KEY}") so the runtime reads
credentials from env vars or a secrets manager; update any README or deployment
docs to describe setting the required env vars and ensure the file is not
carrying real secrets (if these were real, revoke/rotate them immediately and
sanitize the repository history if necessary).

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
benchmarks/000.microbenchmarks/040.server-reply/python/function.py (1)

8-12: Fix socket resource leak and add error handling.

The socket is never closed, causing a resource leak. Additionally, socket operations (connect, recv) can fail but are not wrapped in error handling.

Apply this diff to fix the resource leak and add error handling:

     socket.setdefaulttimeout(20)
     s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-    s.connect(addr)
-    msg = s.recv(1024).decode()
-    return {"result": msg}
+    try:
+        s.connect(addr)
+        msg = s.recv(1024).decode()
+        return {"result": msg}
+    except (socket.error, socket.timeout) as e:
+        return {"error": f"Socket error: {str(e)}"}
+    finally:
+        s.close()
♻️ Duplicate comments (3)
benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/requirements.txt (1)

1-1: Consider pinning the JAX version for reproducibility.

Same unpinned dependency issue as benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/requirements.txt.

benchmarks/600.linearalgebra/601.matmul/python/requirements.txt (1)

1-1: Critical security vulnerabilities in torch 2.4.1.

This file uses the same vulnerable torch version (2.4.1) as benchmarks/600.linearalgebra/606.spmv/python/requirements.txt, exposing the benchmark to the same critical security issues including RCE and DoS vulnerabilities.

benchmarks/400.inference/413.image-classification/python/requirements.txt (1)

3-3: Critical security vulnerabilities in torch 2.4.1.

This file uses the same vulnerable torch version (2.4.1) with critical security issues. This is particularly concerning for an inference benchmark that may use torch.load to load models.

🟠 Major comments (17)
benchmarks/000.microbenchmarks/040.server-reply/python/function.py-7-7 (1)

7-7: Add input validation for required event keys.

The event.get() calls will return None if keys are missing, causing socket.connect() to fail with a less-clear error. Validate that "ip-address" and "port" exist and are valid before attempting the connection.

Apply this diff to add validation:

 def handler(event):
 
     # start timing
+    ip_address = event.get("ip-address")
+    port = event.get("port")
+    
+    if not ip_address or not port:
+        return {"error": "Missing required fields: ip-address and port"}
+    
-    addr = (event.get("ip-address"), event.get("port"))
+    addr = (ip_address, port)
     socket.setdefaulttimeout(20)

Committable suggestion skipped: line range outside the PR's diff.

benchmarks/200.multimedia/220.video-processing/input.py-24-26 (1)

24-26: Fix loop logic: all .mp4 files are uploaded, but only the last filename is stored in the configuration.

Lines 24-26 iterate over all matching files, but the img variable gets overwritten each iteration. When line 29 sets input_config["object"]["key"] = img, only the last file's path is used, while all files were uploaded.

Additionally, if no .mp4 files match, img is undefined on line 29, causing a runtime error.

Decide the intended behavior: (1) process only one specific file (add break or filter), or (2) support multiple files (modify config structure). The TODO comment suggests incomplete multi-dataset support.
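
As a reference for option (1), a minimal sketch that uploads exactly one deterministic video and fails loudly when none is found (glob and os are already imported in this module; variable and key names approximate the existing code):

videos = sorted(glob.glob(os.path.join(data_dir, "*.mp4")))
if not videos:
    raise ValueError(f"No .mp4 files found in {data_dir}")
video = videos[0]
# upload_func takes (bucket_idx, key, filepath)
key = os.path.basename(video)
upload_func(0, key, video)
input_config["object"]["key"] = key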

benchmarks/200.multimedia/220.video-processing/input.py-10-18 (1)

10-18: Move docstring inside the generate_input function.

The docstring is currently placed at module level, but it documents the generate_input function parameters. Python convention requires function docstrings to be the first statement inside the function body.

Apply this diff to move the docstring to the correct location:

-
-"""
-    Generate test, small and large workload for thumbnailer.
-
-    :param data_dir: directory where benchmark data is placed
-    :param size: workload size
-    :param input_buckets: input storage containers for this benchmark
-    :param output_buckets:
-    :param upload_func: upload function taking three params(bucket_idx, key, filepath)
-"""
-
-
 def generate_input(
     data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
 ):
+    """
+    Generate test, small and large workload for video processing.
+
+    :param data_dir: directory where benchmark data is placed
+    :param size: workload size
+    :param benchmarks_bucket: benchmark storage bucket
+    :param input_paths: input storage paths for this benchmark
+    :param output_paths: output storage paths for this benchmark
+    :param upload_func: upload function taking three params(bucket_idx, key, filepath)
+    :param nosql_func: nosql function for database operations
+    """
     for file in glob.glob(os.path.join(data_dir, "*.mp4")):
benchmarks/000.microbenchmarks/030.clock-synchronization/python/function.py-62-73 (1)

62-73: Inconsistent thresholds confirmed – review comment is accurate

The full function context confirms the review comment's analysis. The thresholds are indeed misaligned:

  • Loop break condition (line 44–45): breaks when consecutive_failures == 7 with "Can't setup the connection"
  • CSV upload condition (line 62): skips upload only when consecutive_failures == 5

This inverted logic causes a failed connection attempt (7 consecutive timeouts) to still upload a CSV file, while a run exiting with exactly 5 consecutive failures suppresses upload. The suggestion to align thresholds or use a named constant (e.g., MAX_CONSECUTIVE_FAILURES = 7) is appropriate.
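
A minimal sketch of the shared-constant approach (the send_one helper and the loop bound are illustrative, not the actual function body):

import socket

MAX_CONSECUTIVE_FAILURES = 7  # single threshold shared by the retry loop and the upload guard

def run_measurements(send_one, samples=1000):
    times = []
    consecutive_failures = 0
    while len(times) < samples:
        try:
            times.append(send_one())
            consecutive_failures = 0
        except socket.timeout:
            consecutive_failures += 1
            if consecutive_failures == MAX_CONSECUTIVE_FAILURES:
                print("Can't setup the connection")
                break
    # Upload results only when the loop did not give up on the connection.
    should_upload = consecutive_failures < MAX_CONSECUTIVE_FAILURES
    return times, should_upload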

benchmarks/400.inference/413.image-classification/python/package.sh-16-17 (1)

16-17: Handle filenames with spaces or special characters.

Using xargs without -print0 can break on filenames containing spaces, newlines, or special characters.

Apply this diff:

 # cleaning
 # stripping some of the numpy libs - libgfortran-2e0d59d6.so.5.0.0 - causes issues on Azure
-find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
-find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
+find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 strip
+find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 strip
benchmarks/400.inference/413.image-classification/python/package.sh-25-25 (1)

25-25: Add error handling for directory change.

This cd command needs error handling.

Apply this diff:

 find . -name \*.pyc -delete
-cd ${CUR_DIR}
+cd "${CUR_DIR}" || exit 1
 echo "Stripped size $(du -sh $1 | cut -f1)"
benchmarks/400.inference/413.image-classification/python/package.sh-7-7 (1)

7-7: Add error handling for directory changes.

The cd command should include error handling to prevent subsequent commands from executing in an unexpected directory if the change fails.

Apply this diff:

 CUR_DIR=$(pwd)
-cd $1
+cd "$1" || exit 1
 # cleaning libs
benchmarks/400.inference/412.language-bert/python/package.sh-25-25 (1)

25-25: Add error handling for directory change.

Similar to Line 7, this cd command needs error handling.

Apply this diff:

 find . -name \*.pyc -delete
-cd ${CUR_DIR}
+cd "${CUR_DIR}" || exit 1
 echo "Stripped size $(du -sh $1 | cut -f1)"
benchmarks/400.inference/412.language-bert/python/package.sh-16-17 (1)

16-17: Handle filenames with spaces or special characters.

Using xargs without -print0 can break on filenames containing spaces, newlines, or special characters.

Apply this diff:

 # cleaning
 # stripping some of the numpy libs - libgfortran-2e0d59d6.so.5.0.0 - causes issues on Azure
-find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
-find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
+find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 strip
+find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 strip
benchmarks/400.inference/412.language-bert/python/package.sh-7-7 (1)

7-7: Add error handling for directory changes.

The cd command should include error handling to prevent subsequent commands from executing in an unexpected directory if the change fails.

Apply this diff:

 CUR_DIR=$(pwd)
-cd $1
+cd "$1" || exit 1
 # cleaning libs
benchmarks/400.inference/412.language-bert/python/package.sh-30-33 (1)

30-33: Add error handling for directory changes.

Both cd commands in the torch-zipping block need error handling.

Apply this diff:

 if [ -d "$1/${TORCH_DIR}" ]; then
-	cd $1
+	cd "$1" || exit 1
 	zip -qr torch.zip ${TORCH_DIR}
 	rm -rf ${TORCH_DIR}
-	cd ${CUR_DIR}
+	cd "${CUR_DIR}" || exit 1
 	echo "Torch-zipped size $(du -sh $1 | cut -f1)"
 fi
benchmarks/600.linearalgebra/603.jacobi2d/python/requirements.txt-1-1 (1)

1-1: I need to verify the PYSEC vulnerabilities mentioned in the original review before finalizing the rewritten comment.

Update torch 2.4.1 to address security vulnerabilities; upgrade to 2.6.0 minimum.

Torch 2.4.1 does contain vulnerabilities, but the original characterization overstates their severity. PYSEC-2024-259 covers a RemoteModule deserialization RCE affecting PyTorch <=2.4.1, but it has been withdrawn as disputed.

Verified vulnerabilities:

  • PYSEC-2025-41 (CVE-2025-32434): CRITICAL RCE via torch.load with weights_only=True; fixed in 2.6.0
  • GHSA-3749-ghw9-m3mg: Low severity DoS, requires upgrade to 2.7.1+
  • GHSA-887c-mr87-cxwp: Medium severity resource management, requires upgrade to 2.8.0+

Upgrade to version 2.6.0 or above to address the CRITICAL issue. Latest available is 2.9.1.

benchmarks/400.inference/413.recommendation/python/requirements.txt.3.8-1-1 (1)

1-1: Upgrade torch from 2.2.2 to 2.4.1 for Python 3.8 projects.

Verification confirms torch 2.2.2 contains two active CRITICAL vulnerabilities, including an RCE via torch.load with weights_only=True. PyTorch 2.5+ requires Python 3.9+, so 2.4.1 is the latest release that still supports Python 3.8. While PyTorch 2.4.1 has a known deserialization RCE (CVE-2024-48063) affecting RemoteModule, it is still preferable to 2.2.2 because it carries fewer vulnerabilities.

Repository currently uses: 5× torch==2.2.2, 10× torch==2.4.1, 5× torch==2.5.1. Upgrade this file to torch==2.4.1 for consistency and security.

Alternatively, if Python 3.8 support can be dropped (EOL October 2024), upgrade to torch==2.5.1+ for better security posture across all benchmarks.

benchmarks/400.inference/413.image-classification/python/init.sh-10-10 (1)

10-10: Quote variables to handle paths with spaces.

The cp command uses unquoted variables which will fail if paths contain spaces or special characters.

Apply this diff:

-cp ${path} ${DIR}
+cp "${path}" "${DIR}"
benchmarks/600.linearalgebra/601.matmul/python/function.py-48-62 (1)

48-62: Include gpu_time in the measurement result.

The gpu_ms variable is computed from kernel_gemm (line 48) but is not included in the returned measurement dictionary. All other linear algebra benchmarks (602.axpy, 603.jacobi2d, 604.cholesky, 605.lu) include gpu_time in their measurements.

Apply this diff to include GPU timing:

     return {
         # "result": result[0],
         "measurement": {
             "generating_time": matrix_generating_time,
             "compute_time": matmul_time,
+            "gpu_time": gpu_ms,
         },
     }

Also, consider prefixing C_out with an underscore if it's intentionally unused:

-    C_out, gpu_ms = kernel_gemm(alpha, beta, C, A, B, reps=1)
+    _C_out, gpu_ms = kernel_gemm(alpha, beta, C, A, B, reps=1)
benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/function.py-1-8 (1)

1-8: Resolve the license TODO before merging

You’re explicitly copying/deriving from Barba & Forsyth’s CFD Python work and note CC‑BY 4.0 and BSD‑3 in the header, but there’s a # TODO: License comment.

Before merging, ensure:

  • The appropriate BSD‑3 license text is present in the repo (e.g., LICENSE file plus per‑file header, as required).
  • Attribution requirements from CC‑BY are satisfied.

Leaving this as a TODO is a compliance risk.

benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/function.py-39-45 (1)

39-45: Batch norm is using std instead of var in the denominator

Canonical BN uses (x - mean) / sqrt(var + eps) where var is the variance of x. Here you compute std = jnp.std(...) and then divide by sqrt(std + eps), i.e. sqrt(sigma + eps) rather than sqrt(sigma² + eps), which changes the scaling, especially for large variances.

If you intend standard BN semantics:

-@jax.jit
-def batchnorm2d(x, eps=1e-5):
-    mean = jnp.mean(x, axis=0, keepdims=True)
-    std = jnp.std(x, axis=0, keepdims=True)
-    return (x - mean) / jnp.sqrt(std + eps)
+@jax.jit
+def batchnorm2d(x, eps=1e-5):
+    mean = jnp.mean(x, axis=0, keepdims=True)
+    var = jnp.var(x, axis=0, keepdims=True)
+    return (x - mean) / jnp.sqrt(var + eps)

If this is an intentional simplification for benchmarking, a short comment would help avoid future “fixes.”

🟡 Minor comments (3)
benchmarks/000.microbenchmarks/030.clock-synchronization/python/function.py-42-51 (1)

42-51: Verify counter name semantics and logging overhead

The latency loop is functional, but two valid concerns merit addressing:

  • Counter semantics mismatch: measurements_not_smaller suggests ≥ behavior, but the condition uses > (line 48). When cur_time == cur_min, the counter resets. If strictly-greater is intentional, the name should reflect that; if "not smaller" (≥) is the goal, change the condition to >=.

  • Per-iteration logging: The print statement (lines 45–47) runs inside the while loop and can emit up to ~1000 lines per invocation. Consider guarding it behind a debug flag or environment variable to prevent timing skew in benchmarking.
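
A minimal sketch of such a guard (the environment-variable name is illustrative):

import os

DEBUG = os.environ.get("CLOCK_SYNC_DEBUG", "0") == "1"

def debug_print(*args):
    # Emit per-iteration diagnostics only when explicitly requested,
    # so the measurement loop is not skewed by logging overhead.
    if DEBUG:
        print(*args)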

benchmarks/400.inference/413.recommendation/python/package.sh-3-4 (1)

3-4: Address unused variable and quote path arguments.

The script has two issues:

  1. PACKAGE_DIR is assigned but never used (line 4 uses $1 directly).
  2. Unquoted $1 will fail on paths with spaces.

Apply this diff:

-PACKAGE_DIR=$1
-echo "DLRM GPU package size $(du -sh $1 | cut -f1)"
+PACKAGE_DIR="$1"
+echo "DLRM GPU package size $(du -sh "$PACKAGE_DIR" | cut -f1)"
benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/function.py-82-123 (1)

82-123: Acknowledge broad exception handling is safer than narrowing to undocumented TypeError

The web search confirms that jax.device_get does not document a specific set of exception types—it only describes operation and inputs. Narrowing to TypeError based on implementation details is risky. The current broad Exception catch, while hiding potential bugs, is actually more robust than the suggested fix.

Instead of narrowing the exception, add logging/metrics to understand when this occurs without losing visibility:

     try:
         results = jax.device_get(results)
-    except Exception:
-        pass
+    except Exception as e:
+        # Not a JAX DeviceArray or already on host; leave as-is.
+        # Consider logging e or emitting a metric here for observability.
+        pass
🧹 Nitpick comments (39)
benchmarks/200.multimedia/210.thumbnailer/input.py (1)

20-22: Consider removing unused function parameters.

Function arguments size and nosql_func are not used within the function body. If these parameters are no longer needed, removing them would clean up the API surface and reduce confusion. If they are retained for backward compatibility, consider adding a comment to clarify intent.

benchmarks/100.webapps/120.uploader/input.py (1)

19-20: Unused function parameters.

The parameters data_dir, input_buckets, upload_func, and nosql_func are unused in the function body. While this may be acceptable (the benchmark may not require file uploads), consider whether these parameters should be removed to simplify the interface, or if they should be retained for API consistency with other benchmark modules.

benchmarks/200.multimedia/220.video-processing/input.py (2)

27-27: TODO: Multiple datasets support is not implemented.

The TODO comment indicates that multiple dataset support is planned but not yet implemented. This aligns with the loop logic issue identified above.

Do you want me to help design and implement support for multiple datasets, or open a new issue to track this enhancement?


21-23: Unused function parameters.

The size and nosql_func parameters are not used in the function body. If these are required by the interface contract for consistency with other benchmark input generators, consider adding a comment explaining why they're unused. Otherwise, they could be removed.

benchmarks/000.microbenchmarks/030.clock-synchronization/python/function.py (2)

11-16: Event field extraction looks correct; consider hardening bucket access

The extracted fields match the expected AWS wrapper contract and are straightforward. The nested event.get("bucket").get("bucket") / .get("output") will raise if "bucket" is missing or None, though, so if this benchmark might be invoked without a bucket configuration, consider something more defensive like:

bucket_cfg = event.get("bucket") or {}
output_bucket = bucket_cfg.get("bucket")
output_prefix = bucket_cfg.get("output", "")

or enforce presence early with a clear error.


63-65: Static /tmp/data.csv path: consider a safer temp‑file pattern

Ruff’s S108 warning about /tmp/data.csv is valid in a general Unix context: using a fixed filename in /tmp can open you up to symlink/race issues and, in non‑serverless environments, cross‑user interference.

In typical AWS Lambda‑style deployments the risk is lower (per‑container /tmp, single active invocation), but if this code might ever be reused elsewhere, it’s safer to use a unique temp file:

import tempfile

with tempfile.NamedTemporaryFile(mode="w", newline="", suffix=".csv", delete=False) as csvfile:
    tmp_path = csvfile.name
    writer = csv.writer(csvfile, delimiter=",")
    writer.writerow(["id", "client_send", "client_rcv"])
    for row in times:
        writer.writerow(row)

key = client.upload(
    output_bucket,
    os.path.join(output_prefix, filename),
    tmp_path,
)

This also avoids accidental reuse of stale data if anything goes wrong between writes and uploads.

Also applies to: 70-71

benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/requirements.txt (1)

1-1: Consider pinning the JAX version for reproducibility.

The dependency is unpinned, which may lead to non-reproducible benchmark results across different environments as JAX updates. Consider pinning to a specific version if reproducibility is important for this benchmark suite. Note that all three JAX benchmarks in the project currently follow the same unpinned pattern, reflecting the project's mixed approach to dependency management across benchmarks.

benchmarks/400.inference/412.language-bert/python/package.sh (1)

1-1: Add shebang for portability.

The script lacks a shebang line, which can lead to unexpected behavior depending on how it's invoked.

Apply this diff:

+#!/bin/bash
 # Stripping package code is based on https://github.com/ryfeus/lambda-packs repo
benchmarks/400.inference/413.image-classification/python/package.sh (1)

1-1: Add shebang for Bash-specific syntax.

The script uses [[ which is Bash-specific, but lacks a shebang line.

Apply this diff:

+#!/bin/bash
 # Stripping package code is based on https://github.com/ryfeus/lambda-packs repo
benchmarks/200.multimedia/220.video-processing/python/function.py (3)

17-23: Consider using iterable unpacking instead of list concatenation.

The list concatenation on line 18 can be simplified using iterable unpacking for better readability.

Apply this diff:

-    ret = subprocess.run(
-        [os.path.join(SCRIPT_DIR, "ffmpeg", "ffmpeg"), "-y"] + args,
+    ret = subprocess.run(
+        [os.path.join(SCRIPT_DIR, "ffmpeg", "ffmpeg"), "-y", *args],
         # subprocess might inherit Lambda's input for some reason
         stdin=subprocess.DEVNULL,
         stdout=subprocess.PIPE,
         stderr=subprocess.STDOUT,
     )

31-46: Unused parameter can be removed or prefixed.

The event parameter is not used in this function. If it's required by the operations interface, consider prefixing it with an underscore (_event) to indicate it's intentionally unused.

Apply this diff if the parameter is not needed:

-def to_gif(video, duration, event):
+def to_gif(video, duration, _event):
     output = "/tmp/processed-{}.gif".format(os.path.basename(video))
     call_ffmpeg(
         [
             "-i",
             video,
             "-t",
             "{0}".format(duration),
             "-vf",
             "fps=10,scale=320:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse",
             "-loop",
             "0",
             output,
         ]
     )
     return output

50-66: Unused parameter can be removed or prefixed.

The event parameter is not used in this function. If it's required by the operations interface, consider prefixing it with an underscore (_event) to indicate it's intentionally unused.

Apply this diff if the parameter is not needed:

-def watermark(video, duration, event):
+def watermark(video, duration, _event):
     output = "/tmp/processed-{}".format(os.path.basename(video))
     watermark_file = os.path.dirname(os.path.realpath(__file__))
     call_ffmpeg(
         [
             "-i",
             video,
             "-i",
             os.path.join(watermark_file, os.path.join("resources", "watermark.png")),
             "-t",
             "{0}".format(duration),
             "-filter_complex",
             "overlay=main_w/2-overlay_w/2:main_h/2-overlay_h/2",
             output,
         ]
     )
     return output
benchmarks/500.scientific/5xx.channel_flow_jax_npbench/input.py (1)

1-17: generate_input contract and size presets look consistent; address Ruff ARG001 via config, not signature changes

The size_generators mapping and generate_input implementation match the pattern used in other JAX NPBench inputs (e.g., 5xx.compute_jax_npbench/input.py), so behavior is consistent. Ruff correctly notes that data_dir, benchmarks_bucket, input_paths, output_paths, upload_func, and nosql_func are unused, but these parameters are part of the common generate_input interface used across benchmarks. If this file is linted, consider suppressing ARG001 at the function or file level (e.g., # ruff: noqa: ARG001) rather than altering the signature.

benchmarks/500.scientific/503.graph-bfs/input.py (1)

1-7: Deterministic size/seed configuration is correct; handle unused-arg linting consistently

size_generators plus a constant "seed": 42 matches the pattern in the pagerank and MST inputs, so graph generation for BFS will be deterministic and comparable across the graph benchmarks. As elsewhere, the generate_input parameters are unused but required by the shared interface; if Ruff runs on benchmarks, it’s better to ignore ARG001 for this function/file (or this directory) than to change the signature and risk diverging from the orchestrator’s expectations.

benchmarks/600.linearalgebra/602.axpy/input.py (1)

1-7: Input generator logic is fine; consider minimal handling of unused-arg lint.

The size_generators mapping and generate_input return shape are consistent with the other linear-algebra benchmarks and look correct. The unused parameters in generate_input are only there to satisfy the common benchmark interface, but they trigger Ruff ARG001 warnings.

If Ruff is enforced in CI, you could cheaply acknowledge these without changing the public API, for example:

 def generate_input(
     data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
 ):
-    return {"size": size_generators[size], "seed": 42}
+    # Parameters are part of the common benchmark interface but not used here.
+    _ = (data_dir, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func)
+    return {"size": size_generators[size], "seed": 42}
benchmarks/500.scientific/5xx.compute_jax_npbench/input.py (1)

1-17: Consistent JAX size mapping; same note about unused-arg warnings.

The size_generators structure and generate_input behavior are aligned with the other npbench-style inputs and look good.

As with the other generators, Ruff flags the unused parameters in generate_input. If lint is enforced, you can keep the interface and quiet the warnings:

 def generate_input(
@@
-):
-    return {"size": size_generators[size]}
+):
+    # Unused parameters kept to match the common benchmark interface.
+    _ = (data_dir, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func)
+    return {"size": size_generators[size]}
benchmarks/600.linearalgebra/604.cholesky/input.py (1)

1-7: Matches the linear-algebra input pattern; optional lint-silencing for unused args.

This mirrors the other 600.linearalgebra input modules and is functionally fine.

If Ruff ARG001 is a concern, you can acknowledge the interface-only parameters:

 def generate_input(
     data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
 ):
-    return {"size": size_generators[size], "seed": 42}
+    # Unused, required by the shared benchmark interface.
+    _ = (data_dir, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func)
+    return {"size": size_generators[size], "seed": 42}
benchmarks/400.inference/411.image-recognition/input.py (1)

8-15: Minor polish: unused loop variable, unused args, and a small edge-case in generate_input.

  • Line 10: dirs is never used; to satisfy Ruff B007 you can rename it to _dirs (or _) to make intent explicit:
    for root, _dirs, files in os.walk(data_dir):
  • Lines 29–31: size, output_paths, and nosql_func are unused but required by the shared interface. If Ruff is enforced, consider the same _ = (...) pattern used in other generators to quiet ARG001 without changing behavior.
  • Lines 37–48: input_images[0][0] assumes that val_map.txt is non-empty. If there’s any chance of an empty file in some runs, a simple guard would prevent an IndexError, e.g.:
    if not input_images:
        raise ValueError("val_map.txt did not contain any images")

Also applies to: 29-51

benchmarks/600.linearalgebra/603.jacobi2d/input.py (1)

1-7: Linear-algebra input looks good; same optional unused-arg handling as siblings.

This module is consistent with the other 600.linearalgebra inputs (601.matmul, 602.axpy, 604.cholesky) and the return structure is fine.

If you decide to quiet Ruff ARG001 in one of these files, consider applying the same _ = (...) pattern here for the unused parameters to keep the suite consistent.

benchmarks/400.inference/413.image-classification/python/function.py (2)

17-21: Inference preprocessing and label mapping look correct; consider minor robustness tweaks.

Overall, this follows a standard ResNet-50 ImageNet pipeline and looks sound:

  • imagenet_class_index.json is loaded once and idx2label is built as expected.
  • _prepare_tensor does resize→center-crop→normalize→NCHW with ImageNet mean/std; that’s correct for typical ResNet-50 (a reference sketch of this pipeline appears after this comment).

A few low-priority cleanups you might consider:

  • Line 18: wrap the JSON load in a context manager so the file handle is explicitly closed:
    with open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r") as f:
        class_idx = json.load(f)
  • Lines 92, 95, 102, 103: the outer int() around round(...) is redundant (Ruff RUF046) and can be dropped without changing behavior.
  • _prepare_tensor: if you expect occasional bad image paths, you might want to let the Image.open error bubble up with a clearer message or catch/log it, but for a controlled benchmark dataset the current behavior is acceptable.

Also applies to: 30-31, 109-117
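
For reference, a typical ResNet-50 preprocessing pipeline looks like the sketch below (what _prepare_tensor is described as doing, not its exact code):

import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def prepare_batch(image_path):
    img = Image.open(image_path).convert("RGB")
    return preprocess(img).unsqueeze(0)  # NCHW batch of shape (1, 3, 224, 224)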


34-85: Secure tar extraction and temp handling — well-established patterns exist; consider applying them.

Yes, well-established patterns do exist for this. Tar path-traversal vulnerabilities are real (documented in CVEs) and Python's tarfile module has known risks with extractall() on untrusted archives.

Recommended hardening approach:

  1. Replace hardcoded /tmp with tempfile.TemporaryDirectory() (context manager for automatic cleanup) or tempfile.mkstemp() for per-file temps. This avoids TOCTOU issues and makes intent clearer.

  2. Validate tar members before extraction instead of calling extractall() directly:

    • Compute resolved target path: resolved = (Path(extract_dir) / member.name).resolve()
    • Verify it stays within extract_dir using Path.is_relative_to() or prefix comparison
    • Reject absolute paths, .. segments, symlinks (member.issym()), device nodes (member.isdev()), and other special members
    • Use tar.extractfile() and write validated content manually rather than extractall()
  3. Example pattern (lines 46–60):

    with tempfile.TemporaryDirectory() as temp_dir:
        archive_path = os.path.join(temp_dir, archive_basename)
        # ... download to archive_path ...
        with tarfile.open(archive_path, "r:gz") as tar:
            for member in tar.getmembers():
                resolved = (Path(MODEL_DIRECTORY) / member.name).resolve()
                if not resolved.is_relative_to(Path(MODEL_DIRECTORY).resolve()):
                    raise ValueError(f"Unsafe member path: {member.name}")
                if member.issym() or member.isdev() or member.islnk():
                    raise ValueError(f"Unsafe member type: {member.name}")
                tar.extract(member, MODEL_DIRECTORY)  # or extractfile + manual write

For a controlled benchmark, the current code is functionally safe, but applying these patterns would make it safer to reuse and demonstrates defense-in-depth.

benchmarks/600.linearalgebra/603.jacobi2d/python/function.py (2)

44-50: Remove unused seed handling code.

The seed handling logic extracts a seed variable (lines 44-50), but it's never used. The initialize_torch function at line 53 doesn't accept a seed parameter, and the seed isn't referenced elsewhere in the handler. This is dead code.

Apply this diff to remove the unused code:

 def handler(event):
 
     size = event.get("size")
-    if "seed" in event:
-        import random
-
-        random.seed(event["seed"])
-
-        seed = event.get("seed", 42)
-        seed = int(seed)
 
     matrix_generating_begin = datetime.datetime.now()

57-57: Prefix unused variables with underscore.

The variables A_out and B_out are unpacked but never used. Apply the underscore prefix to indicate they are intentionally unused.

Apply this diff:

-    A_out, B_out, gpu_ms = kernel_jacobi2d(A, B, iters=50)
+    _, _, gpu_ms = kernel_jacobi2d(A, B, iters=50)
benchmarks/600.linearalgebra/601.matmul/python/function.py (1)

35-41: Remove unused seed handling code.

The seed handling logic (lines 35-41) extracts a seed variable, but it's never used. The initialize_torch function doesn't accept a seed parameter.

Apply this diff to remove the unused code:

 def handler(event):
 
     size = event.get("size")
-    if "seed" in event:
-        import random
-
-        random.seed(event["seed"])
-
-        seed = event.get("seed", 42)
-        seed = int(seed)
 
     matrix_generating_begin = datetime.datetime.now()
benchmarks/600.linearalgebra/605.lu/python/function.py (2)

53-59: Remove unused seed handling code.

The seed handling logic (lines 53-59) extracts a seed variable, but it's never used since initialize_torch doesn't accept a seed parameter.

Apply this diff to remove the unused code:

 def handler(event):
     size = event.get("size")
-    if "seed" in event:
-        import random
-
-        random.seed(event["seed"])
-
-        seed = event.get("seed", 42)
-        seed = int(seed)
 
     gen_begin = datetime.datetime.now()

66-66: Prefix unused variable with underscore.

The variable B is unpacked but never used. Apply the underscore prefix to indicate it's intentionally unused.

Apply this diff:

-    B, gpu_ms = kernel(A)
+    _, gpu_ms = kernel(A)
benchmarks/600.linearalgebra/604.cholesky/python/function.py (2)

36-42: Remove unused seed handling code.

The seed handling logic (lines 36-42) extracts a seed variable, but it's never used since initialize_torch doesn't accept a seed parameter.

Apply this diff to remove the unused code:

 def handler(event):
     size = event.get("size")
-    if "seed" in event:
-        import random
-
-        random.seed(event["seed"])
-
-        seed = event.get("seed", 42)
-        seed = int(seed)
 
     gen_begin = datetime.datetime.now()

49-49: Prefix unused variable with underscore.

The variable L is unpacked but never used. Apply the underscore prefix to indicate it's intentionally unused.

Apply this diff:

-    L, gpu_ms = kernel_cholesky(A)
+    _, gpu_ms = kernel_cholesky(A)
benchmarks/600.linearalgebra/606.spmv/python/function.py (1)

58-60: Clarify that the SpMV output is intentionally unused

y_out is never used after kernel_spmv returns, which triggers the Ruff warning and may confuse readers.

You can explicitly mark the result as unused:

-    y_out, gpu_ms = kernel_spmv(A, x, reps=100)
+    _, gpu_ms = kernel_spmv(A, x, reps=100)

This keeps behavior identical while silencing the unused-variable warning.

benchmarks/000.microbenchmarks/010.sleep/input.py (1)

1-11: Input generator matches project conventions; unused args are expected

size_generators and generate_input follow the common input.py pattern (fixed sizes, only size used; other parameters kept for a uniform interface). Unused-argument warnings here can be safely ignored since the benchmark framework supplies these arguments but this benchmark only needs size. Based on learnings, inputs are maintainer-controlled, so additional validation isn’t required.

benchmarks/400.inference/413.image-classification/input.py (1)

18-26: Update the outdated docstring to match image‑classification behavior

The docstring still mentions “compression test”, which no longer matches this benchmark’s purpose and can mislead future readers.

Consider updating it along these lines:

-"""
-    Generate test, small and large workload for compression test.
+"""
+    Generate test, small and large workloads for the image classification benchmark.

The rest of the logic (model/image uploads and input_config layout) looks consistent with other inference benchmarks.

benchmarks/400.inference/413.recommendation/python/function.py (1)

54-63: Use MODEL_CACHE consistently for temporary model files

MODEL_CACHE is created but the downloaded checkpoint is written directly under "/tmp" instead of that directory. Using the cache directory for tmp_path makes the intent clearer and keeps related files grouped.

-    os.makedirs(MODEL_CACHE, exist_ok=True)
-    tmp_path = os.path.join("/tmp", f"{uuid.uuid4()}-{MODEL_FILE}")
+    os.makedirs(MODEL_CACHE, exist_ok=True)
+    tmp_path = os.path.join(MODEL_CACHE, f"{uuid.uuid4()}-{MODEL_FILE}")

Behavior (download, load, remove) stays the same, but the cache directory is now actually used.

benchmarks/400.inference/412.language-bert/python/function.py (4)

31-45: Harden model archive handling and temp file usage

Current model download/extraction uses hard-coded /tmp paths and tarfile.extractall directly on an archive that comes from external storage. That opens you up to classic temp-file and tar path‑traversal issues (e.g., a malicious archive writing outside MODEL_DIRECTORY).

Consider:

  • Using tempfile.NamedTemporaryFile / TemporaryDirectory instead of manual /tmp paths.
  • Validating archive members before extraction and refusing entries that would escape MODEL_DIRECTORY.

For example:

-import tarfile
+import tarfile
+import tempfile
...
-            archive_path = os.path.join("/tmp", f"{uuid.uuid4()}-{MODEL_ARCHIVE}")
+            fd, archive_path = tempfile.mkstemp(suffix=f"-{MODEL_ARCHIVE}")
+            os.close(fd)
...
-        with tarfile.open(archive_path, "r:gz") as tar:
-            tar.extractall(MODEL_DIRECTORY)
+        with tarfile.open(archive_path, "r:gz") as tar:
+            base = os.path.realpath(MODEL_DIRECTORY)
+            for member in tar.getmembers():
+                dest = os.path.realpath(os.path.join(MODEL_DIRECTORY, member.name))
+                if not dest.startswith(base + os.sep):
+                    raise RuntimeError(f"Refusing to extract outside {MODEL_DIRECTORY}: {member.name}")
+            tar.extractall(MODEL_DIRECTORY)

72-79: Clarify model timing semantics

model_download_time and model_process_time are measured in microseconds and can legitimately be zero when the model is already initialized. That’s fine, but for consistency with other benchmarks (which generally report milliseconds), consider normalizing units or documenting that these are µs so downstream tooling doesn’t misinterpret values.
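
A minimal sketch of normalizing at the reporting boundary (field names and values are illustrative):

def us_to_ms(micros):
    # timings are collected in microseconds; report them in milliseconds
    return micros / 1000.0

measurement = {
    "model_download_time_ms": us_to_ms(1250.0),
    "model_process_time_ms": us_to_ms(87000.0),
}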


107-118: Validate required event fields before composing storage paths

bucket, model_prefix, text_prefix, and text_key are assumed present; if any are missing/None, os.path.join or client.download will raise confusing TypeErrors.

Consider an explicit validation up front:

 def handler(event):
-    bucket = event.get("bucket", {}).get("bucket")
-    model_prefix = event.get("bucket", {}).get("model")
-    text_prefix = event.get("bucket", {}).get("text")
-    text_key = event.get("object", {}).get("input")
+    bucket_info = event.get("bucket", {})
+    object_info = event.get("object", {})
+    bucket = bucket_info.get("bucket")
+    model_prefix = bucket_info.get("model")
+    text_prefix = bucket_info.get("text")
+    text_key = object_info.get("input")
+
+    if not all([bucket, model_prefix, text_prefix, text_key]):
+        raise ValueError(f"Missing required bucket/text fields in event: {event}")

This keeps failures deterministic and easier to debug when wiring new benchmarks.


133-144: Use zip(..., strict=True) to catch length mismatches early

Static analysis is right that an explicit strict helps detect subtle bugs if sentences and probabilities ever diverge in length.

-    for sentence, probs in zip(sentences, probabilities):
+    for sentence, probs in zip(sentences, probabilities, strict=True):

That will raise immediately instead of silently dropping or duplicating elements.

benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/function.py (2)

13-36: Convolution logic is correct but could use lax.conv_general_dilated for performance

The hand‑rolled conv2d using nested lax.scan and dynamic_slice/dynamic_update_slice is functionally correct for stride‑1 valid convolution, but it will be significantly slower than using lax.conv_general_dilated on GPUs/TPUs.

If you care about maximizing throughput for this benchmark, consider switching to:

def conv2d(x, w):
    return lax.conv_general_dilated(
        x,
        w,
        window_strides=(1, 1),
        padding="VALID",
        dimension_numbers=("NHWC", "HWIO", "NHWC"),
    )

68-79: Prefer NumPy/JAX dtypes explicitly in RNG initialization

You’re passing dtype=jnp.float32 into numpy.random.default_rng().random. That’s usually convertible, but being explicit avoids surprises:

-    input = rng.random((N, H, W, C1), dtype=jnp.float32)
-    conv1 = rng.random((1, 1, C1, C2), dtype=jnp.float32)
-    conv2 = rng.random((3, 3, C2, C2), dtype=jnp.float32)
-    conv3 = rng.random((1, 1, C2, C1), dtype=jnp.float32)
+    import numpy as np
+    input = rng.random((N, H, W, C1), dtype=np.float32)
+    conv1 = rng.random((1, 1, C1, C2), dtype=np.float32)
+    conv2 = rng.random((3, 3, C2, C2), dtype=np.float32)
+    conv3 = rng.random((1, 1, C2, C1), dtype=np.float32)

JAX will still happily accept these arrays and move them to the device on first use.

benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/function.py (1)

237-279: Catch specific transfer exceptions instead of broad except Exception

The core concern is valid: swallowing all exceptions hides real issues. However, based on jax.device_get behavior, TypeError is not the primary exception you'd encounter. Transfer guard violations raise RuntimeError, and backend/XLA failures surface as runtime errors like XlaRuntimeError. Catching only TypeError would miss these cases. Consider:

-    try:
-        results = jax.device_get(results)
-    except Exception:
-        pass
+    try:
+        results = jax.device_get(results)
+    except (RuntimeError, TypeError):
+        # Transfer failed or __array__ not available; keep device array.
+        pass

Or remove the catch entirely if device/host transfer failure should propagate.

Comment on lines +35 to +37
      - name: Install Python dependencies (via install.py)
        run: |
          python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local

⚠️ Potential issue | 🔴 Critical

Activate virtualenv before installing dependencies.

The python3 install.py command runs without activating the virtualenv first. This causes dependencies to be installed in the system Python instead of the cached virtualenv, making the cache ineffective and causing subsequent steps to fail if they expect tools in the venv.

Activate the virtualenv before running install.py:

       - name: Install Python dependencies (via install.py)
         run: |
+          . python-venv/bin/activate
           python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local

Alternatively, use the Python binary from the venv directly:

       - name: Install Python dependencies (via install.py)
         run: |
-          python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local
+          python-venv/bin/python3 install.py --no-aws --no-azure --no-gcp --no-openwhisk --no-local
🤖 Prompt for AI Agents
In .github/workflows/lint.yml around lines 35-37, the workflow runs python3
install.py without activating the cached virtualenv so packages are installed
into the system Python; before running install.py either activate the virtualenv
created/checked-out by the job (e.g. source the venv activation script for
Linux/macOS or the Scripts\Activate script on Windows) and then run install.py,
or invoke the Python binary from the virtualenv directly (e.g. use the venv's
python executable to run install.py) so dependencies are installed into and
cached with the venv.

Comment on lines 19 to 20
data_dir, size, benchmarks_bucket, input_buckets, output_buckets, upload_func, nosql_func
):

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify: Breaking function signature change across call sites.

The function signature changed parameter names from input_paths, output_paths to input_buckets, output_buckets. This breaks any code calling this function with positional arguments or keyword arguments using the old names. Existing call sites throughout the codebase must be updated to match.

Additionally, this signature diverges from the standard pattern used in other benchmarks (e.g., 110.dynamic-html, 030.clock-synchronization, 411.image-recognition), which use input_paths and output_paths. Confirm this is intentional and not an incomplete refactor.


🏁 Script executed:

#!/bin/bash
# Description: Find and verify all call sites to generate_input in the uploader benchmark module

# Search for calls to generate_input that might still use old parameter names
rg -n 'generate_input' --type py -- benchmarks/100.webapps/120.uploader/ -A 3 -B 3

# Search for imports or references to this module's generate_input
rg -n 'from.*120.uploader.*import\|import.*120.uploader' --type py

Length of output: 237


🏁 Script executed:

#!/bin/bash
# Corrected: Find all call sites to generate_input with proper rg syntax

# Search for all calls to generate_input in the entire codebase
rg -n 'generate_input\s*\(' --type py -A 5 -B 2

# Search for imports from 120.uploader module
rg -n 'from.*120\.uploader|import.*120\.uploader' --type py

Length of output: 18273


🏁 Script executed:

#!/bin/bash
# Check if 130.crud-api is part of the same change
git log -1 --oneline -- benchmarks/100.webapps/130.crud-api/input.py

# Check whether abstract method signature matches the two changed implementations
grep -A 7 "def generate_input" sebs/benchmark.py | head -20

Length of output: 335


Critical: Abstract method signature mismatch with implementation.

The generate_input function signature in benchmarks/100.webapps/120.uploader/input.py:18-20 uses parameter names input_buckets, output_buckets, but the abstract method definition at sebs/benchmark.py:804-809 still declares input_paths, output_paths. This same inconsistency also exists in benchmarks/100.webapps/130.crud-api/input.py:8-10, while ~30 other benchmark implementations use the standard input_paths, output_paths signature.

Required fix: Update the abstract method signature in sebs/benchmark.py:804-809 to use input_buckets, output_buckets to match these implementations, or revert both 120.uploader and 130.crud-api to use the standard parameter names. The interface contract must be consistent across all implementations.

🧰 Tools
🪛 Ruff (0.14.5)

19-19: Unused function argument: data_dir

(ARG001)


19-19: Unused function argument: input_buckets

(ARG001)


19-19: Unused function argument: upload_func

(ARG001)


19-19: Unused function argument: nosql_func

(ARG001)

Comment on lines +29 to 31
class_idx = json.load(open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r"))
idx2label = [class_idx[str(k)][1] for k in range(len(class_idx))]
model = None

⚠️ Potential issue | 🔴 Critical

Fix Tensor‑index bug when mapping prediction to label (and tighten JSON loading)

torch.max returns a Tensor index; using it directly to index a Python list (idx2label[index]) will raise a TypeError. Convert the index to a plain integer first. While here, it’s also cleaner to load imagenet_class_index.json via a context manager.

-SCRIPT_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__)))
-class_idx = json.load(open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r"))
+SCRIPT_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__)))
+with open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r") as f:
+    class_idx = json.load(f)
@@
-    input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model
-    output = model(input_batch)
-    _, index = torch.max(output, 1)
-    ret = idx2label[index]
+    input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model
+    output = model(input_batch)
+    _, index = torch.max(output, 1)
+    idx = index.item()
+    ret = idx2label[idx]

This removes the runtime error and keeps behavior otherwise unchanged.

Also applies to: 78-79

🤖 Prompt for AI Agents
In benchmarks/400.inference/411.image-recognition/python/function.py around
lines 29-31 (and similarly lines 78-79) the code loads imagenet_class_index.json
without a context manager and later uses a torch tensor index directly to index
a Python list; open the JSON with a with-statement (using open(..., "r") as f
and json.load(f)) and when mapping a prediction to a label convert the tensor
index to an int (e.g., index = int(index_tensor) or index_tensor.item()) before
using idx2label[index]; keep all other logic unchanged.

Comment on lines +3 to +4
torch==2.5.1
torchvision==0.20.1

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify torch 2.5.1 is the latest secure version.

torch 2.5.1 is newer than the vulnerable 2.4.1 used elsewhere in this PR, which is good. However, please verify:

  • If 2.5.1 is the latest version or if a newer patch release is available
  • Whether 2.5.1 addresses all the security vulnerabilities found in 2.4.1
  • If torchvision 0.20.1 is compatible and secure

Run the following script to verify:


🏁 Script executed:

#!/bin/bash
# Check latest torch and torchvision versions

echo "=== Latest torch version ==="
curl -s https://pypi.org/pypi/torch/json | jq '.info.version'

echo "=== Latest torchvision version ==="
curl -s https://pypi.org/pypi/torchvision/json | jq '.info.version'

echo "=== Security advisories for torch 2.5.x ==="
gh api graphql -f query='
{
  securityVulnerabilities(first: 5, ecosystem: PIP, package: "torch") {
    nodes {
      advisory {
        summary
        severity
        publishedAt
      }
      vulnerableVersionRange
      firstPatchedVersion {
        identifier
      }
    }
  }
}' | jq '.data.securityVulnerabilities.nodes[] | select(.vulnerableVersionRange | contains("2.5"))'

Length of output: 124


I need to gather additional information to complete the verification. Let me check security advisories for torch 2.4.1 vs 2.5.1 and version compatibility.

Upgrade torch and torchvision to versions with critical security patch.

torch 2.5.1 contains a critical Remote Code Execution vulnerability (CVE-2025-32434) in the torch.load() function when used with weights_only=True. Update to PyTorch 2.6.0 or higher, and correspondingly upgrade torchvision to 0.21.0 or later. torch 2.5.1 is compatible with torchvision 0.20.1, so current compatibility is not the issue, but the security vulnerability is critical and requires immediate patching.

@@ -0,0 +1,4 @@
numpy>=2.0
pillow>=10.0
torch==2.5.1

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify torch 2.5.1 is the latest secure version and standardize across files.

This file uses torch==2.5.1, which differs from torch==2.4.1 used in other requirements files in this PR. Ensure that 2.5.1 is the latest patched version and update all other requirements files to match this version for consistency.


🌐 Web query:

What is the latest stable version of PyTorch and does version 2.5.1 have any known security vulnerabilities?

💡 Result:

  • Latest stable PyTorch (as of Nov 23, 2025): PyTorch 2.9 (released Oct 2025). [1]

  • Does PyTorch 2.5.1 have known security vulnerabilities? Yes — PyTorch ≤ 2.5.1 is affected by a critical RCE vulnerability (CVE-2025-32434) in torch.load(...) when using weights_only=True; the issue was fixed in PyTorch 2.6.0. Update to ≥2.6.0 (or the current stable release) to mitigate. [2][3]

Sources:

  1. PyTorch 2.9 release announcement. [PyTorch blog]. [1]
  2. NVD / CVE-2025-32434 details (lists affected versions and fix in 2.6.0). [2]
  3. Additional vendor advisories summarizing the issue (e.g., INCIBE/IBM). [3]


Update torch to 2.6.0 or later immediately—torch 2.5.1 contains a critical RCE vulnerability.

PyTorch 2.5.1 is affected by a critical RCE vulnerability (CVE-2025-32434) in torch.load() when using weights_only=True, which was fixed in PyTorch 2.6.0. The latest stable version is PyTorch 2.9 (released Oct 2025). Update this file and all other requirements files using torch 2.4.1 to a secure version (≥2.6.0, preferably 2.9).

🤖 Prompt for AI Agents
In
benchmarks/400.inference/413.image-classification/python/requirements.txt.3.12
around line 3, the file pins torch to 2.5.1 which is vulnerable
(CVE-2025-32434); update the pin to a secure release (at minimum >=2.6.0,
preferably the latest stable such as 2.9.x) across this file and any other
requirements that reference torch 2.4.x/2.5.x, run dependency resolution and
tests to ensure compatibility, and if needed adjust any torch-specific code or
extra pins (CUDA builds, torchvision) to match the chosen secure torch version.

Comment on lines +77 to +110
@partial(jax.jit, static_argnums=(0,))
def pressure_poisson_periodic(nit, p, dx, dy, b):
    def body_func(p, q):
        pn = p.copy()
        p = p.at[1:-1, 1:-1].set(
            ((pn[1:-1, 2:] + pn[1:-1, 0:-2]) * dy**2 + (pn[2:, 1:-1] + pn[0:-2, 1:-1]) * dx**2)
            / (2 * (dx**2 + dy**2))
            - dx**2 * dy**2 / (2 * (dx**2 + dy**2)) * b[1:-1, 1:-1]
        )

        # Periodic BC Pressure @ x = 2
        p = p.at[1:-1, -1].set(
            ((pn[1:-1, 0] + pn[1:-1, -2]) * dy**2 + (pn[2:, -1] + pn[0:-2, -1]) * dx**2)
            / (2 * (dx**2 + dy**2))
            - dx**2 * dy**2 / (2 * (dx**2 + dy**2)) * b[1:-1, -1]
        )

        # Periodic BC Pressure @ x = 0
        p = p.at[1:-1, 0].set(
            (
                ((pn[1:-1, 1] + pn[1:-1, -1]) * dy**2 + (pn[2:, 0] + pn[0:-2, 0]) * dx**2)
                / (2 * (dx**2 + dy**2))
                - dx**2 * dy**2 / (2 * (dx**2 + dy**2)) * b[1:-1, 0]
            )
        )

        # Wall boundary conditions, pressure
        p = p.at[-1, :].set(p[-2, :])  # dp/dy = 0 at y = 2
        p = p.at[0, :].set(p[1, :])  # dp/dy = 0 at y = 0

        return p, None

    p, _ = lax.scan(body_func, p, jnp.arange(nit))


⚠️ Potential issue | 🔴 Critical

pressure_poisson_periodic never returns the updated pressure field

pressure_poisson_periodic runs a lax.scan over nit iterations and computes an updated p, but the function doesn’t return it. As written, it returns None, and channel_flow ignores the result:

b = build_up_b(...)
pressure_poisson_periodic(nit, p, dx, dy, b)  # result unused
# p is not mutated in-place; JAX arrays are immutable

So p in channel_flow is never updated by the Poisson solve, which is a correctness bug.

You likely want:

 @partial(jax.jit, static_argnums=(0,))
 def pressure_poisson_periodic(nit, p, dx, dy, b):
-    def body_func(p, q):
+    def body_func(p, _):
         pn = p.copy()
         ...
-        return p, None
+        return p, None
@@
-    p, _ = lax.scan(body_func, p, jnp.arange(nit))
-
+    p, _ = lax.scan(body_func, p, jnp.arange(nit))
+    return p

And in channel_flow:

-        b = build_up_b(rho, dt, dx, dy, u, v)
-        pressure_poisson_periodic(nit, p, dx, dy, b)
+        b = build_up_b(rho, dt, dx, dy, u, v)
+        p = pressure_poisson_periodic(nit, p, dx, dy, b)

This makes the pressure updates actually feed into the momentum equations.

🧰 Tools
🪛 Ruff (0.14.5)

79-79: Unused function argument: q

(ARG001)

🤖 Prompt for AI Agents
In benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/function.py
around lines 77-110, pressure_poisson_periodic runs lax.scan to compute the
updated pressure but never returns it, so callers keep the old p; change the
function to return the final p from the scan (e.g. capture the output p and
return it), and update channel_flow to assign its p to the returned value (p =
pressure_poisson_periodic(nit, p, dx, dy, b)) so the updated pressure feeds into
the momentum solve.

Comment on lines +32 to +59
def handler(event):
    size = event.get("size")
    if "seed" in event:
        import random

        random.seed(event["seed"])

        seed = event.get("seed", 42)
        seed = int(seed)

    gen_begin = datetime.datetime.now()
    alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed)
    gen_end = datetime.datetime.now()

    comp_begin = datetime.datetime.now()
    y_out, gpu_ms = kernel_axpy(alpha, x, y, reps=100)
    comp_end = datetime.datetime.now()

    gen_us = (gen_end - gen_begin) / datetime.timedelta(microseconds=1)
    comp_us = (comp_end - comp_begin) / datetime.timedelta(microseconds=1)

    return {
        "measurement": {
            "generating_time": gen_us,
            "compute_time": comp_us,
            "gpu_time": gpu_ms,
        }
    }

⚠️ Potential issue | 🔴 Critical

Fix undefined variable when seed is not provided.

Line 43 references seed, but this variable is only defined inside the if "seed" in event block (lines 34-40). If the event does not contain a "seed" key, seed will be undefined, causing a NameError at runtime.

Apply this diff to fix the issue:

 def handler(event):
     size = event.get("size")
+    seed = 42  # default seed
     if "seed" in event:
         import random
 
         random.seed(event["seed"])
 
-        seed = event.get("seed", 42)
-        seed = int(seed)
+        seed = int(event["seed"])
 
     gen_begin = datetime.datetime.now()
     alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed)

Note: The random.seed() call on line 37 appears redundant since PyTorch's RNG is seeded within initialize_torch.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Before:

def handler(event):
    size = event.get("size")
    if "seed" in event:
        import random

        random.seed(event["seed"])

        seed = event.get("seed", 42)
        seed = int(seed)

    gen_begin = datetime.datetime.now()
    alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed)
    gen_end = datetime.datetime.now()

    comp_begin = datetime.datetime.now()
    y_out, gpu_ms = kernel_axpy(alpha, x, y, reps=100)
    comp_end = datetime.datetime.now()

    gen_us = (gen_end - gen_begin) / datetime.timedelta(microseconds=1)
    comp_us = (comp_end - comp_begin) / datetime.timedelta(microseconds=1)

    return {
        "measurement": {
            "generating_time": gen_us,
            "compute_time": comp_us,
            "gpu_time": gpu_ms,
        }
    }

After:

def handler(event):
    size = event.get("size")
    seed = 42  # default seed
    if "seed" in event:
        import random

        random.seed(event["seed"])

        seed = int(event["seed"])

    gen_begin = datetime.datetime.now()
    alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed)
    gen_end = datetime.datetime.now()

    comp_begin = datetime.datetime.now()
    y_out, gpu_ms = kernel_axpy(alpha, x, y, reps=100)
    comp_end = datetime.datetime.now()

    gen_us = (gen_end - gen_begin) / datetime.timedelta(microseconds=1)
    comp_us = (comp_end - comp_begin) / datetime.timedelta(microseconds=1)

    return {
        "measurement": {
            "generating_time": gen_us,
            "compute_time": comp_us,
            "gpu_time": gpu_ms,
        }
    }
🧰 Tools
🪛 Ruff (0.14.5)

47-47: Unpacked variable y_out is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/602.axpy/python/function.py around lines 32 to
59, the variable `seed` can be undefined when the incoming event lacks a "seed"
key; to fix, initialize `seed` unconditionally at the top (e.g., seed =
int(event.get("seed", 42))) before any conditional, remove the redundant
import/random.seed usage inside the if-block (or only keep the import if needed
elsewhere), and ensure all references use this initialized `seed` so NameError
cannot occur when "seed" is omitted.

@@ -0,0 +1 @@
torch==2.4.1

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Critical security vulnerabilities in torch 2.4.1.

Static analysis has flagged 5 critical security vulnerabilities in torch 2.4.1, including remote code execution risks (GHSA-53q9-r3pm-6pq6), denial of service (GHSA-3749-ghw9-m3mg), and resource management issues (GHSA-887c-mr87-cxwp).

Consider upgrading to torch 2.5.1 or later, which appears to be used elsewhere in this PR (see benchmarks/400.inference/413.image-classification/python/requirements.txt.3.12). If torch 2.4.1 is required for specific compatibility reasons, document the rationale and ensure the vulnerable functionality is not exposed in the benchmark.


🌐 Web query:

What are the security vulnerabilities in PyTorch 2.4.1 and what is the latest patched version?

💡 Result:

Short answer

  • Known security issues affecting PyTorch 2.4.1:

    1. CVE-2024-48063 — deserialization (RemoteModule / torch.distributed.rpc) can lead to remote code execution (disputed as “intended” for distributed RPC, but tracked as a security issue). [1][2][3]
    2. Earlier heap buffer-overflow bug (CVE-2024-31580) affected versions before v2.2.0 and is not present in 2.4.1 (it was fixed in earlier 2.x releases). [4]
  • More recent related RCE (reported Apr 2025) — GHSA/CVE-2025-32434: torch.load(..., weights_only=True) RCE — affects <=2.5.1 and was patched in 2.6.0. That issue therefore also implies 2.4.1 is vulnerable to that finding. [5]

What to upgrade to

  • Minimum patched release for the GHSA/CVE-2025-32434 fix: 2.6.0. [5]
  • PyTorch’s current “stable” release (as shown on the PyTorch site) is 2.7.0 — upgrade to the latest stable (or at least ≥2.6.0) to obtain the patches and additional fixes. [6]

Sources

  • NVD / CVE-2024-48063 (deserialization RCE, affects <=2.4.1). [1]
  • Multiple vulnerability trackers / advisories summarizing CVE-2024-48063 (Wiz, Snyk, Rapid7, ResolvedSecurity). [2][3][7]
  • CVE-2024-31580 (heap buffer overflow — fixed before 2.4.1). [4]
  • PyTorch GitHub security advisory GHSA-53q9-r3pm-6pq6 / CVE-2025-32434 (torch.load weights_only=True RCE) — patched in 2.6.0. [5]
  • PyTorch official site (current stable release listing). [6]

If you want, I can:

  • give the exact pip/conda upgrade command for your environment, or
  • fetch the specific upstream commits/PRs that backport each fix.

I need to verify the specific CVEs mentioned in the review and check if torch 2.5.1 is actually used elsewhere in the codebase.

Based on the search results, I can now identify critical inaccuracies in the original review comment:

Key Findings:

  1. Incorrect CVE severity and count: The review claims "5 critical security vulnerabilities" but GHSA-3749-ghw9-m3mg and GHSA-887c-mr87-cxwp are both classified with LOW severity (CVSS 3.3), not critical.

  2. Wrong affected versions: GHSA-3749-ghw9-m3mg (CVE-2025-2953) is described as found in PyTorch 2.6.0+cu124 and GHSA-887c-mr87-cxwp (CVE-2025-3730) is found in PyTorch 2.6.0, suggesting they may not affect 2.4.1 at all.

  3. Critical security issue acknowledged: CVE-2024-48063 (the RemoteModule deserialization RCE) affects PyTorch ≤2.4.1 with CVSS 9.8 Critical. However, it is disputed by multiple parties because the behavior is intended in PyTorch's distributed RPC.

  4. Incorrect upgrade recommendation: CVE-2025-32434 (torch.load RCE) affects PyTorch versions 2.5.1 and earlier, with users advised to update to version 2.6.0 or later. Upgrading to 2.5.1 as suggested in the review would not resolve this vulnerability.


Upgrade to torch 2.6.0 or later to address the confirmed RCE vulnerabilities affecting 2.4.1 and 2.5.1. Note that CVE-2024-48063 (the primary RCE in 2.4.1) is disputed as intended behavior for distributed RPC—verify whether RemoteModule deserialization is actually used in this benchmark.
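
If the pin cannot be bumped immediately, one stopgap (purely illustrative, not part of this PR) is a runtime guard in the benchmark entry point that refuses to run on a torch build older than the first patched release named above:

# Illustrative only: fail fast if the installed torch predates the torch.load RCE fix (2.6.0).
import torch

def assert_min_torch(minimum=(2, 6, 0)):
    # Assumes a release-style version string such as "2.4.1+cu121"
    installed = tuple(int(part) for part in torch.__version__.split("+")[0].split(".")[:3])
    if installed < minimum:
        raise RuntimeError(
            f"torch {torch.__version__} is older than the required "
            f"{'.'.join(str(v) for v in minimum)}"
        )

assert_min_torch()  # raises on 2.4.1, passes on 2.6.0 and later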

🧰 Tools
🪛 OSV Scanner (2.2.4)

[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2024-259)


[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2025-41)


[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service

(GHSA-3749-ghw9-m3mg)


[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution

(GHSA-53q9-r3pm-6pq6)


[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability

(GHSA-887c-mr87-cxwp)

🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/604.cholesky/python/requirements.txt lines 1-1:
the review's upgrade advice and CVE counts are incorrect and the project may
still be vulnerable; verify CVE applicability and then either upgrade or
document mitigations. Search the repo for any occurrences of "torch==2.5.1" or
other pinned torch versions and for uses of torch.load, RemoteModule, RPC or
RemoteModule deserialization in this benchmark and related code; if any code
uses remote deserialization or torch.load on untrusted data, change the
requirement to torch>=2.6.0 (or unpinned >=2.6.0) to include fixes, run tests to
confirm compatibility, and add a comment in requirements or README explaining
the CVE mapping and why the chosen version was selected if you choose not to
upgrade.

@@ -0,0 +1 @@
torch==2.4.1

⚠️ Potential issue | 🔴 Critical

Critical security vulnerabilities in torch 2.4.1.

This file uses torch==2.4.1, which has the same critical security vulnerabilities flagged in other requirements files in this PR. Upgrade to torch 2.5.1 or the latest patched version to address these security concerns and maintain consistency with requirements.txt.3.12.

🧰 Tools
🪛 OSV Scanner (2.2.4)

[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2024-259)


[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2025-41)


[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service

(GHSA-3749-ghw9-m3mg)


[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution

(GHSA-53q9-r3pm-6pq6)


[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability

(GHSA-887c-mr87-cxwp)

🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/605.lu/python/requirements.txt around lines 1 to
1, the file pins torch==2.4.1 which contains critical security vulnerabilities;
update the requirement to torch==2.5.1 (or the latest patched release used
elsewhere in the repo, e.g., requirements.txt.3.12) to resolve the
vulnerabilities, ensure consistency across requirements files, and run the test
suite/CI to verify no compatibility regressions.

@@ -0,0 +1 @@
torch==2.4.1

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Critical security vulnerabilities in torch 2.4.1.

The static analysis tool has flagged five critical security vulnerabilities in torch 2.4.1, including remote code execution via torch.load and denial of service issues. These are legitimate security concerns that should be addressed before release.

Please verify the latest patched version of PyTorch and update this dependency:


🌐 Web query:

What is the latest stable version of PyTorch and are there security patches for torch 2.4.1 vulnerabilities GHSA-53q9-r3pm-6pq6, GHSA-3749-ghw9-m3mg, GHSA-887c-mr87-cxwp, PYSEC-2024-259, and PYSEC-2025-41?

💡 Result:

Short answer

  • Latest stable PyTorch (as of Nov 23, 2025): PyTorch 2.9.0. [1]

Status of the listed vulnerabilities (impacting torch 2.4.1)

  • GHSA-53q9-r3pm-6pq6 / PYSEC-2025-41 (CVE-2025-32434): RCE via torch.load(..., weights_only=True). Patched in 2.6.0 (i.e., fixed in releases after 2.4.1). [2][3]
  • PYSEC-2024-259: Deserialization RCE in RemoteModule affecting torch ≤ 2.4.1. Reported fixed in later releases (fixed in <2.5.0 / addressed in 2.5.x+). [4]
  • GHSA-3749-ghw9-m3mg (CVE-2025-2953): local DoS in torch.mkldnn_max_pool2d. Patched in the 2.7.1 release candidate / fixed in 2.7.1+. [5]
  • GHSA-887c-mr87-cxwp (CVE-2025-3730): improper resource shutdown (ctc_loss) — fixed in 2.8.0 (patched in the upstream commits and released in 2.8+). [6]

Recommendation

  • Upgrade to the current stable release (2.9.0) or at minimum to the first fixed release that addresses the specific advisory relevant to you (2.6.0+ for the RCE advisories; 2.7.1+ and 2.8.0+ for the DoS/resource-release issues). Verify your deployment’s exact wheel/CUDA variant and check PyTorch release notes/SECURITY.md for any platform-specific patching instructions. [1][2][5][6]

Sources
[1] PyTorch 2.9 release blog (Oct 15, 2025). (pytorch.org)
[2] OSV / PYSEC-2025-41 / GHSA-53q9-r3pm-6pq6 (shows fixed version 2.6.0). (osv.dev)
[3] Debian / security tracker for CVE-2025-32434 (notes fix in 2.6.0). (security-tracker.debian.org)
[4] PYSEC-2024-259 advisory / resolved-security summary (affects <=2.4.1; fixed in later releases). (resolvedsecurity.com)
[5] GitHub Advisory GHSA-3749-ghw9-m3mg (CVE-2025-2953) — patched in 2.7.1-rc1 / fixed in 2.7.1+. (github.com)
[6] GHSA-887c-mr87-cxwp / CVE-2025-3730 (ctc_loss) — fixed in 2.8.0 per advisory/patch. (vulnerability.circl.lu)

If you want, I can (a) give the exact pip install commands to upgrade to 2.9.0 for your OS/CUDA, or (b) scan a requirements.txt to show whether your environment includes any of the vulnerable versions.



Update torch from 2.4.1 to address five critical security vulnerabilities.

The five flagged vulnerabilities are confirmed as critical and affecting torch 2.4.1: GHSA-53q9-r3pm-6pq6 / PYSEC-2025-41 (RCE via torch.load; fixed in 2.6.0), PYSEC-2024-259 (deserialization RCE; fixed in 2.5.x+), GHSA-3749-ghw9-m3mg (local DoS; fixed in 2.7.1+), and GHSA-887c-mr87-cxwp (resource shutdown issue; fixed in 2.8.0+).

Update benchmarks/600.linearalgebra/606.spmv/python/requirements.txt to PyTorch 2.9.0 (the latest stable version) or at minimum 2.8.0 to address all five vulnerabilities in a single upgrade.

🧰 Tools
🪛 OSV Scanner (2.2.4)

[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2024-259)


[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2025-41)


[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service

(GHSA-3749-ghw9-m3mg)


[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution

(GHSA-53q9-r3pm-6pq6)


[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability

(GHSA-887c-mr87-cxwp)

🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/606.spmv/python/requirements.txt lines 1-1, the
pinned torch version 2.4.1 contains multiple critical security vulnerabilities;
update the torch entry to a secure release (preferably torch==2.9.0, or at
minimum torch==2.8.0) to obtain the fixes for the listed CVEs, then run
pip-compile / pip install -r requirements.txt or update your lockfile and CI
dependency checks accordingly to ensure the new version is used in builds.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (14)
benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.8 (1)

1-3: Consider pinning torch/torchvision for reproducible vector-add results on Python 3.8.

You already pin numpy to 1.24.0; leaving torch and torchvision floating can make benchmark behavior depend on whatever PyPI exposes at install time, and may break if future releases change CUDA/ABI expectations. It’s worth deciding and documenting specific torch/torchvision versions for this env (even if only in a comment) and pinning accordingly.

benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.10 (1)

1-3: Align dependency strategy across Python versions and consider pinning torch/torchvision.

Here numpy is pinned to 1.26.4 (vs 1.24.0 in the 3.7–3.9 manifests), while torch/torchvision are left unpinned. Please confirm:

  • That the numpy 1.24 vs 1.26 split across Python versions is intentional for host-device-copy benchmarks.
  • Whether you want fixed torch/torchvision versions for 3.10 to keep results comparable over time.
benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.9 (1)

1-3: Reproducibility and cross-version numpy choices for vector-add (Python 3.9).

As with the 3.8 manifest, numpy is pinned (1.24.0) while torch/torchvision are not. Please:

  • Consider pinning torch/torchvision to explicit versions used in your benchmark environment.
  • Double-check that keeping numpy at 1.24.0 for 3.9 but 1.26.x for newer Pythons is an intentional trade-off for this microbenchmark.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.11 (1)

1-3: Confirm host-device-copy dependency policy for Python 3.11.

This mirrors the 3.10 setup (numpy==1.26.4, unpinned torch/torchvision). To avoid surprises:

  • Verify that 1.26.4 is the desired baseline numpy for 3.11 benchmarks.
  • Consider explicitly pinning torch/torchvision to the versions you actually benchmark, so future PyPI updates don’t silently change performance characteristics.
benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt (1)

1-2: Clarify role of this base requirements file vs the Python-versioned manifests.

Since the 3.x-specific requirement files pin numpy and (presumably) target specific environments, it would help to:

  • Document whether this base file is only for ad-hoc/local runs where floating torch/torchvision versions are acceptable.
  • Or, if it’s part of CI/benchmarking as well, consider pinning torch/torchvision here too for consistency.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt (1)

1-2: Document intended usage of unversioned host-device-copy requirements.

Given that the Python-versioned files introduce numpy pins and target specific runtimes, please clarify whether this unversioned requirements.txt is:

  • Just a convenience for local experimentation (where floating torch/torchvision is fine), or
  • Also used for official runs, in which case pinning torch/torchvision would improve reproducibility.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.7 (1)

1-3: Validate 3.7 support and pinning choices for host-device-copy.

For a Python 3.7–specific env, please double-check:

  • That numpy==1.24.0, torch, and torchvision still have supported/available wheels for 3.7 in your target platforms.
  • Whether you want to keep 3.7 as a supported benchmark target at all, or drop it if the stack is no longer easily installable.
    As elsewhere, consider pinning torch/torchvision to concrete versions once you settle on a known-good combo.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.9 (1)

1-3: Host-device-copy on Python 3.9: confirm numpy split and pin torch/torchvision if needed.

This matches the 3.7/3.8 manifests (numpy==1.24.0, unpinned torch/torchvision). Please confirm:

  • That keeping 3.9 on numpy 1.24.x while newer Pythons use 1.26.x is intentional for your comparisons.
  • Whether you want to fix torch/torchvision versions here to avoid future PyPI changes silently affecting benchmark numbers.
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.8 (1)

1-3: Consider pinning torch and torchvision versions for reproducible benchmarks

numpy==1.24.0 is fine for a Python 3.8 environment, but leaving torch and torchvision unpinned means behavior and performance characteristics of this microbenchmark can change over time as upstream releases evolve (and may introduce silent incompatibilities).

For a benchmarking suite, it’s usually worth:

  • Pinning specific torch / torchvision versions known to work well with numpy==1.24.0 and your CUDA stack.
  • Keeping those pins aligned across the various versioned requirements.txt.* files where feasible.

Please verify compatible version triples (Python 3.8, NumPy 1.24.0, torch, torchvision) against the current library docs/release notes before pinning.

benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.11 (1)

1-3: Consider pinning torch and torchvision versions for reproducibility.

While numpy is pinned, torch and torchvision are unpinned. For benchmark reproducibility across runs and environments, consider pinning specific versions.

benchmarks/000.microbenchmarks/050.peak-performance/python/function.py (1)

17-24: Simplify redundant seed retrieval.

Line 21 uses event.get("seed") when we already know "seed" in event is true. Also, the retrieved value is immediately overwritten by converting to int on line 22.

     if "seed" in event:
         import random

-        random.seed(event["seed"])
-        seed = event.get("seed")
-        seed = int(seed)
+        seed = int(event["seed"])
+        random.seed(seed)
         torch.manual_seed(seed)
         torch.cuda.manual_seed_all(seed)
benchmarks/000.microbenchmarks/0xx.vector-add/python/function.py (3)

55-59: Unused sync variable.

The sync function from get_device_and_sync is never used. Prefix with underscore to indicate intentional discard.

 def initialize(size, device_str="cuda"):
-    device, sync = get_device_and_sync(device_str)
+    device, _ = get_device_and_sync(device_str)
     x = torch.randn(size, device=device, dtype=torch.float32)
     y = torch.randn(size, device=device, dtype=torch.float32)
     return x, y

8-19: Duplicated get_device_and_sync function.

This function is identical to the one in benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py. Consider extracting it to a shared utility module to reduce duplication (a sketch of such a helper follows the nitpick list below).


119-120: Use English for code comments.

The comment "回退到 CPU" should be in English for consistency with the rest of the codebase.

-                # 回退到 CPU
+                # Fall back to CPU
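
Regarding the duplicated get_device_and_sync noted above, a sketch of what a shared helper could look like, assuming the benchmarks only need a device handle plus a synchronize callable (the module path and name are hypothetical):

# Hypothetical shared module, e.g. benchmarks/wrappers/gpu_utils.py (path and name are illustrative):
import torch

def get_device_and_sync(device_str="cuda"):
    """Return (device, sync), where sync() blocks until queued work on that device finishes."""
    if device_str == "cuda" and torch.cuda.is_available():
        return torch.device("cuda"), torch.cuda.synchronize
    return torch.device("cpu"), (lambda: None)  # CPU ops are synchronous, so sync is a no-op

Both function.py files would then import this helper instead of carrying their own copy.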
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5093a8 and 284c6c8.

📒 Files selected for processing (28)
  • benchmarks/000.microbenchmarks/050.peak-performance/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/050.peak-performance/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/050.peak-performance/python/function.py (1 hunks)
  • benchmarks/000.microbenchmarks/050.peak-performance/python/requirements.txt (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/init.sh (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/package.sh (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.10 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.11 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.6 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.7 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.8 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.9 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/function.py (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/init.sh (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/package.sh (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.10 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.11 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.6 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.7 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.8 (1 hunks)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.9 (1 hunks)
✅ Files skipped from review due to trivial changes (4)
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.6
  • benchmarks/000.microbenchmarks/0xx.vector-add/config.json
  • benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.10
  • benchmarks/000.microbenchmarks/0xx.host-device-copy/python/requirements.txt.3.6
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-19T16:15:01.811Z
Learnt from: mcopik
Repo: spcl/serverless-benchmarks PR: 99
File: benchmarks/000.microbenchmarks/010.sleep/cpp/main.cpp:11-0
Timestamp: 2025-06-19T16:15:01.811Z
Learning: In the serverless-benchmarks project, input validation for benchmark functions is not required since the benchmarks are invoked directly by the maintainers who control the input values. This applies to functions like the sleep benchmark in benchmarks/000.microbenchmarks/010.sleep/cpp/main.cpp.

Applied to files:

  • benchmarks/000.microbenchmarks/050.peak-performance/input.py
🧬 Code graph analysis (2)
benchmarks/000.microbenchmarks/0xx.vector-add/input.py (1)
benchmarks/000.microbenchmarks/0xx.host-device-copy/input.py (2)
  • buckets_count (4-5)
  • generate_input (8-11)
benchmarks/000.microbenchmarks/0xx.vector-add/python/function.py (1)
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py (5)
  • get_device_and_sync (5-16)
  • sync (13-14)
  • initialize (19-24)
  • _run_once (27-70)
  • handler (73-117)
🪛 OSV Scanner (2.3.0)
benchmarks/000.microbenchmarks/050.peak-performance/python/requirements.txt

[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2024-259)


[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2025-41)


[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service

(GHSA-3749-ghw9-m3mg)


[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution

(GHSA-53q9-r3pm-6pq6)


[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability

(GHSA-887c-mr87-cxwp)

🪛 Ruff (0.14.8)
benchmarks/000.microbenchmarks/0xx.vector-add/input.py

9-9: Unused function argument: data_dir

(ARG001)


9-9: Unused function argument: benchmarks_bucket

(ARG001)


9-9: Unused function argument: input_paths

(ARG001)


9-9: Unused function argument: output_paths

(ARG001)


9-9: Unused function argument: upload_func

(ARG001)


9-9: Unused function argument: nosql_func

(ARG001)

benchmarks/000.microbenchmarks/050.peak-performance/input.py

9-9: Unused function argument: data_dir

(ARG001)


9-9: Unused function argument: benchmarks_bucket

(ARG001)


9-9: Unused function argument: input_paths

(ARG001)


9-9: Unused function argument: output_paths

(ARG001)


9-9: Unused function argument: upload_func

(ARG001)


9-9: Unused function argument: nosql_func

(ARG001)

benchmarks/000.microbenchmarks/0xx.host-device-copy/input.py

9-9: Unused function argument: data_dir

(ARG001)


9-9: Unused function argument: benchmarks_bucket

(ARG001)


9-9: Unused function argument: input_paths

(ARG001)


9-9: Unused function argument: output_paths

(ARG001)


9-9: Unused function argument: upload_func

(ARG001)


9-9: Unused function argument: nosql_func

(ARG001)

benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py

75-75: Avoid specifying long messages outside the exception class

(TRY003)

benchmarks/000.microbenchmarks/0xx.vector-add/python/function.py

32-32: Unused function argument: num_elems

(ARG001)


56-56: Unpacked variable sync is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


97-97: Avoid specifying long messages outside the exception class

(TRY003)

🪛 Shellcheck (0.11.0)
benchmarks/000.microbenchmarks/0xx.host-device-copy/python/package.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)


[warning] 3-3: PACKAGE_DIR appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 16-16: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.

(SC2038)


[warning] 17-17: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.

(SC2038)


[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 30-30: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 33-33: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

benchmarks/000.microbenchmarks/0xx.vector-add/python/package.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)


[warning] 3-3: PACKAGE_DIR appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 16-16: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.

(SC2038)


[warning] 17-17: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.

(SC2038)


[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 30-30: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 33-33: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

🔇 Additional comments (9)
benchmarks/000.microbenchmarks/0xx.host-device-copy/config.json (1)

1-6: Verify configuration values are appropriate for the host-device-copy benchmark workload.

The JSON structure is valid and follows the expected format. However, the configuration values require verification:

  • Memory (line 3): 128 (units assumed MB?) seems low for a host-device-copy benchmark. Verify this is sufficient for the intended data transfer workload.
  • Languages (line 4): NodeJS support for a GPU host-device-copy benchmark is atypical. Confirm NodeJS is necessary or if this should be Python-only or include compiled languages like C/C++.

Also note: This file is a host-device-copy microbenchmark config, which appears unrelated to the PR title "video watermarking." Confirm this file is intentionally part of this PR.

benchmarks/000.microbenchmarks/0xx.vector-add/python/init.sh (1)

1-3: No-op init script looks good

Shebang is present and making this a no-op script with a clear comment is perfectly fine for this benchmark.

benchmarks/000.microbenchmarks/0xx.host-device-copy/python/init.sh (1)

1-3: Init hook is a clean no-op

The bash shebang plus explanatory comment is sufficient here; no issues from a correctness or maintainability standpoint.

benchmarks/000.microbenchmarks/050.peak-performance/config.json (1)

1-6: Config JSON is well-formed and consistent

The timeout, memory, languages, and modules fields are valid and align with the usual microbenchmark config structure. Nothing blocking here.

benchmarks/000.microbenchmarks/0xx.vector-add/input.py (1)

1-11: LGTM!

The implementation follows the established pattern for input generators in this benchmark suite. The unused function parameters are part of a common interface contract used across benchmarks.

benchmarks/000.microbenchmarks/050.peak-performance/input.py (1)

1-11: LGTM!

The implementation follows the established input generator pattern. The "sleep" key appropriately reflects the benchmark's purpose.

benchmarks/000.microbenchmarks/0xx.host-device-copy/input.py (1)

1-11: LGTM!

Consistent with other input generators in the benchmark suite.

benchmarks/000.microbenchmarks/0xx.host-device-copy/python/function.py (2)

36-38: Transfer loop allocates new tensors each iteration.

The .to() call returns a new tensor each iteration, which gets discarded. This measures allocation + transfer overhead rather than pure transfer bandwidth. If pure transfer is intended, consider using copy_() instead:

for _ in range(iters):
    device_tensor.copy_(host_tensor, non_blocking=True)
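
For context, a hedged side-by-side sketch of the two measurement patterns (tensor names and sizes are illustrative, not taken from the benchmark):

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
host_tensor = torch.randn(1 << 20, pin_memory=(device == "cuda"))

# Pattern A: what the current loop measures - allocation plus transfer,
# since .to() materialises a fresh device tensor every iteration.
for _ in range(10):
    _ = host_tensor.to(device, non_blocking=True)

# Pattern B: pure transfer into a preallocated destination.
device_tensor = torch.empty_like(host_tensor, device=device)
for _ in range(10):
    device_tensor.copy_(host_tensor, non_blocking=True)

if device == "cuda":
    torch.cuda.synchronize()  # drain async copies before any timestamps are taken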

73-116: LGTM on handler logic.

The CUDA fallback handling with specific error message detection and graceful CPU fallback is well-structured.

Comment on lines +34 to +36
    # Warm up
    for _ in range(10):
        _ = torch.matmul(A, B)

⚠️ Potential issue | 🟡 Minor

Missing synchronization after warm-up may skew timing measurements.

The warm-up loop doesn't have a sync() call after it. This means warm-up operations may still be in-flight when the timed section begins, causing measurement inaccuracy.

Apply this diff:

     # Warm up
     for _ in range(10):
         _ = torch.matmul(A, B)
+    sync()

     matmul_begin = datetime.datetime.now()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Before:

    # Warm up
    for _ in range(10):
        _ = torch.matmul(A, B)

After:

    # Warm up
    for _ in range(10):
        _ = torch.matmul(A, B)
    sync()
🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/050.peak-performance/python/function.py around
lines 34 to 36, the warm-up loop issues asynchronous CUDA ops that can still be
in-flight when timing starts; add a synchronization after the warm-up (e.g.,
call torch.cuda.synchronize() when CUDA is available) so all GPU work completes
before the timed section begins; if you need portability, guard the call with
torch.cuda.is_available() (or a no-op for CPU) to avoid errors.
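
A self-contained sketch of that guarded warm-up, with illustrative matrix sizes (the real benchmark's shapes and iteration counts may differ):

import datetime
import torch

def sync():
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # CPU-only runs fall through as a no-op

device = "cuda" if torch.cuda.is_available() else "cpu"
A = torch.randn(1024, 1024, device=device)
B = torch.randn(1024, 1024, device=device)

for _ in range(10):  # warm-up: trigger lazy init and kernel caching
    _ = torch.matmul(A, B)
sync()               # drain in-flight warm-up work before starting the clock

matmul_begin = datetime.datetime.now()
for _ in range(100):
    result = torch.matmul(A, B)
sync()
matmul_end = datetime.datetime.now()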

@@ -0,0 +1 @@
torch==2.4.1

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GHSA-3749-ghw9-m3mg GHSA-53q9-r3pm-6pq6 GHSA-887c-mr87-cxwp PyTorch 2.4.1

💡 Result:

Summary — impact on PyTorch 2.4.1

  • GHSA-3749-ghw9-m3mg (CVE-2025-2953): local Denial-of-Service in torch.mkldnn_max_pool2d. Affects versions older than 2.7.1-rc1 — so 2.4.1 is vulnerable. Patch in 2.7.1-rc1. [1]

  • GHSA-53q9-r3pm-6pq6 (CVE-2025-32434): remote code execution when loading models with torch.load(..., weights_only=True). Patched in 2.6.0; versions ≤2.5.1 (and therefore 2.4.1) are vulnerable. Upgrade immediately. [2]

  • GHSA-887c-mr87-cxwp (CVE-2025-3730): local Denial-of-Service in torch.nn.functional.ctc_loss. Affects versions before 2.8.0 — 2.4.1 is vulnerable. Fixed in 2.8.0. [3]

Recommendation: Do not run PyTorch 2.4.1 on untrusted inputs/models. Upgrade to a patched release (at least 2.8.0 to cover GHSA-887c, and ≥2.6.0 to cover the RCE GHSA-53q9). Audit any models you load from untrusted sources and follow PyTorch security guidance.

Sources: [1] GitHub advisory GHSA-3749-ghw9-m3mg; [2] GitHub advisory GHSA-53q9-r3pm-6pq6; [3] GitHub advisory GHSA-887c-mr87-cxwp.


🌐 Web query:

PyTorch 2.4.1 security vulnerabilities critical

💡 Result:

Short answer: Yes — PyTorch 2.4.1 is affected by multiple security issues including at least one critical RCE-class vulnerability; you should treat 2.4.1 as vulnerable and upgrade or apply mitigations.

Key items and actions

  • CVE-2025-32434 / GHSA-53q9-r3pm-6pq6 — Remote code execution when loading models with torch.load(..., weights_only=True). Affects PyTorch ≤ 2.5.1 (so includes 2.4.1). Patched in 2.6.0. Published Apr 18, 2025. Mitigation: upgrade to PyTorch ≥ 2.6.0 or avoid loading untrusted models; follow the upstream advisory. [1][2]

  • CVE-2024-48063 — RemoteModule deserialization RCE reported for PyTorch ≤ 2.4.1 (disputed by some as “intended” distributed behavior). Treat as high risk for exposed distributed setups; follow vendor guidance or upgrade. (reported Oct 29, 2024). [3]

  • CVE-2024-31580 and related memory/out-of-bounds issues — earlier DoS/out-of-bounds bugs were fixed in releases around 2.4.1 or later; ensure you’re on a fixed release series. [4]

What to do now (concise)

  1. If possible, upgrade to the latest PyTorch release (at least 2.6.0 to address the torch.load RCE). Verify release notes/security advisories from the PyTorch repo before upgrade. [1][2]
  2. If you cannot upgrade immediately, do not load models from untrusted sources; disable or restrict distributed RPC/RemoteModule features and torch.load(weights_only=True) usage where feasible. [1][3]
  3. Monitor official PyTorch security advisories and your OS/vendor security bulletins for backported fixes (Debian/Ubuntu/RHEL may have different package versions). [2][4]

Sources

  1. PyTorch GitHub security advisory / GHSA for torch.load RCE (CVE-2025-32434). [1]
  2. IBM security bulletin referencing CVE-2025-32434 (lists affected versions and patched version 2.6.0). [2]
  3. CVE entries / vulnerability databases for CVE-2024-48063 (RemoteModule deserialization RCE). [3]
  4. Debian / Snyk trackers for memory/DoS CVEs (e.g., CVE-2024-31580) and fixed-package notes. [4]

Would you like me to: (A) fetch the exact PyTorch release notes and advisory links so you can upgrade safely, or (B) check whether your OS/package manager has a backport available?


🏁 Script executed:

# Check if torch.load is used in the benchmark code
find benchmarks -type f -name "*.py" | head -20

Repository: spcl/serverless-benchmarks

Length of output: 1271


🏁 Script executed:

# Search for torch.load usage in benchmarks
rg "torch\.load" benchmarks/ --type py -A 2

# Also search for ctc_loss usage
rg "ctc_loss" benchmarks/ --type py -A 2

# Check the peak-performance benchmark specifically
cat benchmarks/000.microbenchmarks/050.peak-performance/python/function.py

Repository: spcl/serverless-benchmarks

Length of output: 2444


Upgrade torch==2.4.1 to a patched version due to confirmed critical vulnerabilities

All three reported GHSA IDs are confirmed to affect 2.4.1: GHSA-53q9-r3pm-6pq6 (torch.load RCE, fixed in 2.6.0), GHSA-3749-ghw9-m3mg (local DoS in mkldnn_max_pool2d, fixed in 2.7.1), and GHSA-887c-mr87-cxwp (ctc_loss resource release, fixed in 2.8.0).

While the peak-performance benchmark itself does not directly use torch.load or the affected APIs, pinning a vulnerable version is poor security practice. Upgrade to at least 2.6.0 (to address the critical RCE), or preferably 2.8.0 (to address all three). Also verify and update torch pins in other benchmarks (e.g., 400.inference/411 and 413) that do use torch.load.
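
One way to run that audit is a throwaway script from the repository root; the glob below assumes the requirements.txt / requirements.txt.3.x layout used in this PR:

# Illustrative helper: list every torch pin under benchmarks/.
import re
from pathlib import Path

pattern = re.compile(r"^torch==(\S+)", re.MULTILINE)
for req in sorted(Path("benchmarks").rglob("requirements.txt*")):
    for match in pattern.finditer(req.read_text()):
        print(f"{req}: torch=={match.group(1)}")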

🧰 Tools
🪛 OSV Scanner (2.3.0)

[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2024-259)


[CRITICAL] 1-1: torch 2.4.1: undefined

(PYSEC-2025-41)


[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service

(GHSA-3749-ghw9-m3mg)


[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution

(GHSA-53q9-r3pm-6pq6)


[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability

(GHSA-887c-mr87-cxwp)

🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/050.peak-performance/python/requirements.txt
around line 1, the pinned torch==2.4.1 contains known critical vulnerabilities;
update the pin to at least 2.6.0 (preferably 2.8.0 to address all reported CVEs)
and run any tests to ensure compatibility, and also search for and update other
benchmark requirements that pin torch (e.g., benchmarks/400.inference/411 and
413) to the same safe minimum version.

Comment on lines +1 to +35
# Stripping package code is based on https://github.com/ryfeus/lambda-packs repo

PACKAGE_DIR=$1
echo "Original size $(du -sh $1 | cut -f1)"

CUR_DIR=$(pwd)
cd $1
# cleaning libs
rm -rf external
find . -type d -name "tests" -exec rm -rf {} +
find . -type d -name "test" -exec rm -rf {} +
find . -type d -name "bin" -not -path "*/torch/*" -exec rm -rf {} +

# cleaning
# stripping some of the numpy libs - libgfortran-2e0d59d6.so.5.0.0 - causes issues on Azure
find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip

rm -r pip >/dev/null
rm -r pip-* >/dev/null
rm -r wheel >/dev/null
rm -r wheel-* >/dev/null
rm easy_install.py >/dev/null
find . -name \*.pyc -delete
cd ${CUR_DIR}
echo "Stripped size $(du -sh $1 | cut -f1)"

TORCH_DIR=".python_packages/lib/site-packages/torch"
if [ -d "$1/${TORCH_DIR}" ]; then
    cd $1
    zip -qr torch.zip ${TORCH_DIR}
    rm -rf ${TORCH_DIR}
    cd ${CUR_DIR}
    echo "Torch-zipped size $(du -sh $1 | cut -f1)"
fi

⚠️ Potential issue | 🟠 Major

Apply the same hardening as in 0xx.vector-add/python/package.sh (cd guards, quoting, safer find|xargs)

This script has the same structure and issues as the 0xx.vector-add package.sh:

  • Guard all cd calls (cd "$1" || exit 1, cd "$CUR_DIR" || exit 1) so that failure to change into the package directory cannot cause rm -rf external / find ... -exec rm -rf to run in the wrong place.
  • Either use PACKAGE_DIR=$1 consistently instead of $1 or remove the unused variable.
  • Quote variable expansions used as paths and arguments.
  • Prefer find ... -exec strip {} + over find ... | xargs strip to handle spaces/newlines in filenames and avoid calling strip with no args.
  • Optionally add a shebang and tighten error silencing on the rm calls, as in the earlier comment.

You might also consider sharing a single common packaging script between these microbenchmarks to avoid duplicating this logic. As per static analysis hints, this file surfaces the same shellcheck warnings.

🧰 Tools
🪛 Shellcheck (0.11.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)


[warning] 3-3: PACKAGE_DIR appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 16-16: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.

(SC2038)


[warning] 17-17: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.

(SC2038)


[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 30-30: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 33-33: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/0xx.host-device-copy/python/package.sh lines
1-35, the script lacks safe cd guards, inconsistent variable use, unquoted
expansions, and uses find | xargs which breaks on spaces and can call strip with
no args; update to use PACKAGE_DIR consistently, add guards like cd
"$PACKAGE_DIR" || exit 1 and cd "$CUR_DIR" || exit 1, quote all path and
variable expansions, replace piped xargs invocations with find ... -exec strip
{} + (and similarly for *.so.*), tighten rm error silencing (e.g. >&/dev/null
only for expected failures) or check existence before rm, and optionally add a
shebang at the top; these changes mirror the hardening in
0xx.vector-add/python/package.sh.

Comment on lines +22 to +29
@triton.jit
def vector_add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

⚠️ Potential issue | 🟠 Major

Triton kernel is defined but never invoked.

The vector_add_kernel is a decorated Triton JIT kernel that's never called. The actual computation in vector_add() at line 45 uses PyTorch's native x + y instead. Either remove the dead code or integrate the Triton kernel.

If Triton execution is intended, invoke the kernel in vector_add():

def vector_add(num_elems, iters=100, device_str="cuda", x=None, y=None):
    device = torch.device(device_str)
    x = x.to(device)
    y = y.to(device)
    out = torch.empty_like(x)

    BLOCK_SIZE = 1024
    grid = (triton.cdiv(num_elems, BLOCK_SIZE),)

    if device.type == "cuda":
        torch.cuda.synchronize()

    for _ in range(iters):
        vector_add_kernel[grid](x, y, out, num_elems, BLOCK_SIZE)

    if device.type == "cuda":
        torch.cuda.synchronize()

    return out
🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/0xx.vector-add/python/function.py around lines
22 to 29, the Triton JIT kernel vector_add_kernel is defined but never used;
update the vector_add function to call this kernel when running on CUDA (compute
grid = (triton.cdiv(num_elems, BLOCK_SIZE),) and launch
vector_add_kernel[grid](x, y, out, num_elems, BLOCK_SIZE) inside the timing loop
with proper torch.cuda.synchronize() before and after the loop), or if Triton is
not intended, remove the unused kernel definition to eliminate dead code.

Comment on lines +1 to +35
# Stripping package code is based on https://github.com/ryfeus/lambda-packs repo

PACKAGE_DIR=$1
echo "Original size $(du -sh $1 | cut -f1)"

CUR_DIR=$(pwd)
cd $1
# cleaning libs
rm -rf external
find . -type d -name "tests" -exec rm -rf {} +
find . -type d -name "test" -exec rm -rf {} +
find . -type d -name "bin" -not -path "*/torch/*" -exec rm -rf {} +

# cleaning
# stripping some of the numpy libs - libgfortran-2e0d59d6.so.5.0.0 - causes issues on Azure
find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip

rm -r pip >/dev/null
rm -r pip-* >/dev/null
rm -r wheel >/dev/null
rm -r wheel-* >/dev/null
rm easy_install.py >/dev/null
find . -name \*.pyc -delete
cd ${CUR_DIR}
echo "Stripped size $(du -sh $1 | cut -f1)"

TORCH_DIR=".python_packages/lib/site-packages/torch"
if [ -d "$1/${TORCH_DIR}" ]; then
    cd $1
    zip -qr torch.zip ${TORCH_DIR}
    rm -rf ${TORCH_DIR}
    cd ${CUR_DIR}
    echo "Torch-zipped size $(du -sh $1 | cut -f1)"
fi

⚠️ Potential issue | 🟠 Major

Harden package.sh: guard cd, quote args, and make find|xargs strip safer

The overall logic matches other packaging scripts, but there are a few robustness/safety issues:

  • cd $1 (Line 7) and later cd calls (Lines 30, 33) run without || exit. If $1 is missing/invalid or cd fails, the subsequent rm -rf external and find ... -exec rm -rf {} will operate in the wrong directory, which is dangerous. Prefer e.g.:
    PACKAGE_DIR=$1
    cd "$PACKAGE_DIR" || exit 1
    ...
    cd "$CUR_DIR" || exit 1
  • PACKAGE_DIR=$1 (Line 3) is currently unused; either use it consistently instead of $1 or drop the assignment.
  • Several parameter expansions should be quoted for safety: du -sh "$PACKAGE_DIR" / "$1", cd "$PACKAGE_DIR", cd "$CUR_DIR", zip -qr torch.zip "$TORCH_DIR", etc.
  • find ... | xargs strip (Lines 16–17) will misbehave with paths containing spaces or newlines and may call strip with no arguments. Consider:
    find . -name '*.so'    -not -path '*/PIL/*' -not -path '*/Pillow.libs/*' -not -path '*libgfortran*' -exec strip {} +
    find . -name '*.so.*'  -not -path '*/PIL/*' -not -path '*/Pillow.libs/*' -not -path '*libgfortran*' -exec strip {} +
  • Optional: add a shebang (#!/bin/bash) so tools and callers know which shell is expected, and silence rm errors fully with 2>/dev/null if you don’t care about missing paths.

Addressing the cd handling is the most important to avoid accidental deletion outside the target package. As per static analysis hints, these are the same issues shellcheck is surfacing.

🧰 Tools
🪛 Shellcheck (0.11.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)


[warning] 3-3: PACKAGE_DIR appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 16-16: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.

(SC2038)


[warning] 17-17: Use 'find .. -print0 | xargs -0 ..' or 'find .. -exec .. +' to allow non-alphanumeric filenames.

(SC2038)


[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 30-30: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


[warning] 33-33: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/0xx.vector-add/python/package.sh around lines
1 to 35, the script uses unguarded cd calls, unquoted parameters, and unsafe
find|xargs strips which can cause destructive operations in the wrong directory
or fail on paths with spaces; update to use the PACKAGE_DIR variable everywhere
(e.g. cd "$PACKAGE_DIR" || exit 1 and du -sh "$PACKAGE_DIR"), quote all
parameter expansions and paths (e.g. "$CUR_DIR", "$TORCH_DIR"), replace find |
xargs strip with find ... -exec strip {} + to handle spaces safely, and ensure
returning to CUR_DIR uses cd "$CUR_DIR" || exit 1; optionally add a shebang
(#!/bin/bash) at the top and redirect rm errors to /dev/null (rm -r pip
2>/dev/null) if you want silent removals.

Comment on lines +1 to +3
numpy==1.24.0
torch
torchvision

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

last NumPy version supports Python 3.7 official compatibility matrix

💡 Result:

The last NumPy release series that officially supported Python 3.7 was NumPy 1.21 (latest patch: 1.21.6). NumPy 1.22.0 dropped Python 3.7 support. [1][2][3]

Sources:

  • NumPy 1.21 release notes (supports Python 3.7–3.9). [1]
  • NumPy 1.21.6 release notes (last 1.21 patch). [2]
  • NumPy 1.22.0 release notes (Python 3.7 dropped; supports 3.8–3.10). [3]

🌐 Web query:

torch torchvision Python 3.7 compatibility latest versions

💡 Result:

Short answer: The newest PyTorch/TorchVision that officially support Python 3.7 are from the 1.13.x series — e.g. torch 1.13 (1.13.0/1.13.1) with the matching torchvision 0.14.x; PyTorch 2.0+ and the 2.x torchvision releases require Python >=3.8/3.9. [1][2][3]

Sources:

  • PyTorch release compatibility matrix (shows 1.13 supports Python >=3.7, 2.x require >=3.8/3.9+). [1]
  • torchvision release metadata / PyPI (shows torchvision 0.14.x pairs with torch 1.13 and requires Python >=3.7.2 for that pairing). [2]
  • PyTorch “Get Started” / installation docs (current 2.x builds list supported Python versions >=3.8/3.9+). [3]

If you want, tell me your exact Python 3.7 micro version (e.g. 3.7.9) and OS/CUDA target and I’ll give the exact pip/conda install command for the latest 1.13.x-wheel that matches it.


numpy==1.24.0 is incompatible with Python 3.7 — downgrade and pin torch/torchvision

  • numpy==1.24.0 dropped support for Python 3.7 in favor of 3.8+. The last NumPy version supporting Python 3.7 is 1.21.6. Using 1.24.0 in a Python 3.7 environment will cause installation failures.
  • torch and torchvision are unpinned. For Python 3.7 compatibility, pin to torch==1.13.1 with torchvision==0.14.1 (PyTorch 2.0+ requires Python 3.8+). Unpinned versions will pull incompatible releases and break reproducibility.

Update to:

numpy==1.21.6
torch==1.13.1
torchvision==0.14.1
🤖 Prompt for AI Agents
In benchmarks/000.microbenchmarks/0xx.vector-add/python/requirements.txt.3.7
around lines 1-3, numpy==1.24.0 is incompatible with Python 3.7 and
torch/torchvision are unpinned; replace the three lines so the file pins the
last NumPy supporting 3.7 and compatible PyTorch builds by changing to
numpy==1.21.6, torch==1.13.1, and torchvision==0.14.1 to ensure successful
installs and reproducible environments.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (2)

34-38: NaN fps values will bypass the fallback check.

As flagged in the previous review, cap.get(cv2.CAP_PROP_FPS) can return NaN with certain containers or drivers. The condition if not fps or fps <= 0 will not catch NaN, causing int(fps * duration) to fail at runtime.

Apply this diff to add NaN validation:

+import math
 ...
     fps = cap.get(cv2.CAP_PROP_FPS)
-    if not fps or fps <= 0:
+    if not fps or fps <= 0 or math.isnan(fps):
         fps = 25.0  # fallback
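
A quick illustration of why the extra guard is needed: NaN is truthy and compares false against zero, so it slips past both existing checks:

import math

fps = float("nan")
print(not fps, fps <= 0)   # False False -> the existing fallback is skipped
print(math.isnan(fps))     # True        -> the added check catches it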

149-149: Critical: Path traversal vulnerability via user-controlled key.

As flagged in the previous review, constructing download_path = f"/tmp/{key}" without sanitization allows directory traversal attacks. If key contains path separators (../, etc.), it can escape /tmp or create nested directories, leading to file overwrites or crashes.

Apply this diff to sanitize the key:

-    download_path = f"/tmp/{key}"
+    # Restrict to basename to prevent directory traversal
+    safe_key = os.path.basename(key)
+    download_path = os.path.join("/tmp", safe_key)
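
A quick illustration of what the basename restriction buys (the example keys are made up):

import os

for key in ("input.mp4", "../../etc/passwd", "nested/dir/clip.mp4"):
    print(os.path.join("/tmp", os.path.basename(key)))
# /tmp/input.mp4
# /tmp/passwd
# /tmp/clip.mp4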
🧹 Nitpick comments (2)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (2)

41-47: Loop variable i is unused.

The loop counter is never referenced in the loop body. Use _ to indicate an intentionally unused variable.

Apply this diff:

-    for i in range(max_frames):
+    for _ in range(max_frames):
         ret, frame_bgr = cap.read()

56-56: Unpacked variables T and C are unused.

Prefix unused variables with underscores to clarify they're intentionally ignored.

Apply this diff:

-    T, H, W, C = video_np.shape
+    _T, H, W, _C = video_np.shape
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 284c6c8 and 5766b50.

📒 Files selected for processing (1)
  • benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (1)
benchmarks/100.webapps/120.uploader/nodejs/function.js (3)
  • bucket (18-18)
  • output_prefix (19-19)
  • download_path (22-22)
🪛 Ruff (0.14.8)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py

32-32: Avoid specifying long messages outside the exception class

(TRY003)


41-41: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


52-52: Avoid specifying long messages outside the exception class

(TRY003)


55-55: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?

(RUF003)


56-56: Unpacked variable T is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


56-56: Unpacked variable C is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


87-87: Probable insecure usage of temporary file or directory: "/tmp/processed-"

(S108)


92-92: Avoid specifying long messages outside the exception class

(TRY003)


145-147: Avoid specifying long messages outside the exception class

(TRY003)


149-149: Probable insecure usage of temporary file or directory: "/tmp/"

(S108)

🔇 Additional comments (4)
benchmarks/200.multimedia/220.video-processing-gpu/python/function.py (4)

61-80: LGTM! GPU processing logic is well-structured.

The device selection, configurable iteration count, and use of torch.no_grad() for inference are all appropriate. The GPU workload design effectively simulates a realistic compute-heavy FaaS scenario.


104-110: LGTM! Operation registry provides clean extensibility.

The operation mapping design allows easy addition of new processing functions in the future.


153-159: FFmpeg chmod handling is defensive and appropriate.

The graceful error handling (via try/except OSError) ensures the code works whether or not the ffmpeg binary exists, which is good for maintaining compatibility across different deployment configurations.


161-200: LGTM! Download/compute/upload orchestration is well-implemented.

The three-phase workflow with separate timing measurements provides excellent observability for benchmarking. The microsecond-precision timing and size tracking are appropriate for performance analysis.


import datetime
import os
import stat # can also be removed if you drop ffmpeg entirely

⚠️ Potential issue | 🟡 Minor

Misleading comment about removing stat import.

The comment suggests stat can be removed if ffmpeg is dropped, but lines 155-156 use stat.S_IEXEC for the ffmpeg chmod logic. Either update the comment to be accurate or remove the chmod block entirely if ffmpeg is no longer needed.

Apply this diff to clarify:

-import stat  # can also be removed if you drop ffmpeg entirely
+import stat  # used for ffmpeg binary chmod
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/function.py around
line 5, the comment on the import of stat is misleading because later code
(lines ~155-156) uses stat.S_IEXEC for ffmpeg chmod logic; either update the
comment to state that stat is required by the chmod block, or if ffmpeg and its
chmod block are being removed, delete the chmod block (and then remove the stat
import). Ensure the chosen change keeps imports and code consistent: if keeping
ffmpeg code, change the comment to reflect stat is needed for chmod; if removing
ffmpeg, remove the chmod lines and then safely remove the stat import and any
related constants.

Comment on lines +85 to +99
    # Encode processed video with OpenCV (CPU)
    base = os.path.splitext(os.path.basename(video_path))[0]
    out_path = f"/tmp/processed-{base}.mp4"

    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    writer = cv2.VideoWriter(out_path, fourcc, fps, (W, H))
    if not writer.isOpened():
        raise RuntimeError(f"Could not open VideoWriter for: {out_path}")

    for frame_rgb in video_np_out:
        frame_bgr = cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2BGR)
        writer.write(frame_bgr)

    writer.release()
    return out_path

⚠️ Potential issue | 🟠 Major

Insecure /tmp path construction remains unresolved.

As flagged in the previous review, constructing predictable /tmp paths exposes the code to security issues. The current implementation doesn't address the S108 static analysis warning or the security concerns raised earlier.

At minimum, use os.path.join for path construction:

     base = os.path.splitext(os.path.basename(video_path))[0]
-    out_path = f"/tmp/processed-{base}.mp4"
+    out_path = os.path.join("/tmp", f"processed-{base}.mp4")

For better security, consider using tempfile.NamedTemporaryFile:

import tempfile
...
with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False, dir="/tmp") as tmp:
    out_path = tmp.name
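
Putting the two suggestions together, a hedged sketch of how the encode step could obtain a unique output path before opening the writer; variable names mirror the snippet above, but the helper itself is illustrative:

import os
import tempfile

import cv2

def open_writer(fps, width, height):
    # mkstemp yields a unique, non-guessable path; OpenCV then reopens it by name
    fd, out_path = tempfile.mkstemp(suffix=".mp4", dir="/tmp")
    os.close(fd)
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
    if not writer.isOpened():
        raise RuntimeError(f"Could not open VideoWriter for: {out_path}")
    return writer, out_path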
🧰 Tools
🪛 Ruff (0.14.8)

87-87: Probable insecure usage of temporary file or directory: "/tmp/processed-"

(S108)


92-92: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing-gpu/python/function.py around
lines 85 to 99, the code creates a predictable /tmp filename which is a security
risk; replace the fixed path with a securely created temporary file (e.g., use
tempfile.NamedTemporaryFile or tempfile.mkstemp) to obtain a unique,
non-guessable out_path with the .mp4 suffix, pass dir="/tmp" only if required,
set delete=False so OpenCV can open/write to it, close the temp handle before
returning, and remove the manual string concatenation (os.path.join is
acceptable for any explicit directory joins).
