Jessie/monte carlo #269
base: master
Commits:

- Feature/bert inference
- Hotfix/code quality on benchmarks
- sync: image-classification requirements + add 605.lu benchmark
  - Resolve conflicts in 413.image-classification/python/requirements*
  - Drop py3.6/py3.7 variants removed upstream; keep/update 3.8–3.11
  - Add new 600.linearalgebra/605.lu benchmark (config, input, function, reqs)
  - Rename local_deployment.tmp -> 600.linearalgebra/605.lu/config.json
  - Update local_deployment.json; add out_benchmark*.json; update out_storage.json
- Removed configuration details for MinIO and ScyllaDB.
- Removed sensitive SSH private key from eval command.
- Merge benchmarks by Russell and Yuxuan into Development
- This reverts commit 4fca4aa.
- added recommender benchmark
- This document provides detailed instructions for running the GPU Cache Latency benchmark, including system requirements, build steps, and example output.
- This readme provides detailed instructions for running the GPU Cache Latency benchmark, including system requirements, build steps, and example commands.
Walkthrough

This PR transitions CI from CircleCI to GitHub Actions, updates repository configuration, normalizes code formatting across benchmark files to use double quotes and consistent line wrapping, adds configuration for development tools (VSCode, pre-commit, mypy), updates the benchmarks-data submodule URL, and introduces ten new benchmark implementations across multiple categories, including Monte Carlo Pi, BERT language inference, image classification, recommendation systems, scientific computing (JAX-based), and linear algebra operations.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Pre-merge checks: ❌ failed checks (1 warning, 1 inconclusive); ✅ passed checks (1 passed).
Actionable comments posted: 14
Note: Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
benchmarks/300.utilities/311.compression/python/function.py (1)
20-44: Harden temporary directory handling and address the S108 warning

The main concern in this block is how the temporary working directory is created and managed:

- Line 26 builds `download_path` as `"/tmp/{}-{}".format(key, uuid.uuid4())` and then calls `os.makedirs(download_path)`.
- Static analysis (Ruff S108) correctly flags this as a probable insecure temp-dir pattern: you are composing a path in `/tmp` manually and including a value (`key`) that may ultimately come from external input.
- The directory is never explicitly cleaned up, so in warm containers you can accumulate data under `/tmp` over many invocations.

Even if `uuid.uuid4()` makes the directory name hard to guess, this is still weaker than using the standard library's safe temp-dir helpers, and mixing user-controlled strings into filesystem paths is something to avoid unless you are certain of the constraints.

I recommend switching to `tempfile.TemporaryDirectory`, which will:

- Create the directory securely.
- Automatically clean it up at the end of the block.
- Address the S108 warning and improve your security posture.

A minimal refactor preserving your current behavior would look like:

```diff
@@
-import datetime
-import os
-import shutil
-import uuid
+import datetime
+import os
+import shutil
+import uuid
+import tempfile
@@
 def handler(event):
@@
-    bucket = event.get("bucket").get("bucket")
-    input_prefix = event.get("bucket").get("input")
-    output_prefix = event.get("bucket").get("output")
-    key = event.get("object").get("key")
-    download_path = "/tmp/{}-{}".format(key, uuid.uuid4())
-    os.makedirs(download_path)
-
-    s3_download_begin = datetime.datetime.now()
-    client.download_directory(bucket, os.path.join(input_prefix, key), download_path)
-    s3_download_stop = datetime.datetime.now()
-    size = parse_directory(download_path)
-
-    compress_begin = datetime.datetime.now()
-    shutil.make_archive(os.path.join(download_path, key), "zip", root_dir=download_path)
-    compress_end = datetime.datetime.now()
-
-    s3_upload_begin = datetime.datetime.now()
-    archive_name = "{}.zip".format(key)
-    archive_size = os.path.getsize(os.path.join(download_path, archive_name))
-    key_name = client.upload(
-        bucket, os.path.join(output_prefix, archive_name), os.path.join(download_path, archive_name)
-    )
-    s3_upload_stop = datetime.datetime.now()
+    bucket = event.get("bucket").get("bucket")
+    input_prefix = event.get("bucket").get("input")
+    output_prefix = event.get("bucket").get("output")
+    key = event.get("object").get("key")
+
+    with tempfile.TemporaryDirectory(dir="/tmp") as download_path:
+        s3_download_begin = datetime.datetime.now()
+        client.download_directory(bucket, os.path.join(input_prefix, key), download_path)
+        s3_download_stop = datetime.datetime.now()
+        size = parse_directory(download_path)
+
+        compress_begin = datetime.datetime.now()
+        shutil.make_archive(os.path.join(download_path, key), "zip", root_dir=download_path)
+        compress_end = datetime.datetime.now()
+
+        s3_upload_begin = datetime.datetime.now()
+        archive_name = "{}.zip".format(key)
+        archive_size = os.path.getsize(os.path.join(download_path, archive_name))
+        key_name = client.upload(
+            bucket,
+            os.path.join(output_prefix, archive_name),
+            os.path.join(download_path, archive_name),
+        )
+        s3_upload_stop = datetime.datetime.now()
```

Optional follow-ups (good to consider, but not strictly required for this PR):

- If `event["object"]["key"]` can contain path separators or `..`, you may want to either validate/sanitize it before using it in a local filesystem path, or restrict it to `os.path.basename(key)` for the archive name.
- The nested `.get()` calls (`event.get("bucket").get("bucket")`) don't actually make the code more robust; if the outer key is missing, you still get `AttributeError` on `.get` of `None`. If you want defensive parsing, consider defaults (`event.get("bucket", {}).get("bucket")`) or just use indexing if the shape is guaranteed.

Based on static analysis hints (Ruff S108).
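The cleanup guarantee that `tempfile.TemporaryDirectory` provides can be checked in isolation. A minimal stand-alone sketch, independent of the benchmark code:

```python
import os
import tempfile

# TemporaryDirectory creates a uniquely named directory with safe
# permissions and deletes it, including its contents, on exit.
with tempfile.TemporaryDirectory() as workdir:
    payload = os.path.join(workdir, "payload.txt")
    with open(payload, "w") as f:
        f.write("data")
    exists_inside = os.path.isdir(workdir)

# After the with-block the directory and its contents are gone, so a
# warm container cannot accumulate stale data across invocations.
exists_after = os.path.isdir(workdir)
print(exists_inside, exists_after)
```

This is the property that removes the need for any manual cleanup in the handler above.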
benchmarks/000.microbenchmarks/020.network-benchmark/python/function.py (1)
11-57: Fix undefined `key` on repeated failures and consider safer temp-file handling

If Line 35's `consecutive_failures == 5` condition trips, you break out of the loop with `consecutive_failures == 5`, skip the `if consecutive_failures != 5:` block, and then hit `return {"result": key}` on Line 57 with `key` never defined. That will throw at runtime on repeated timeouts.

A minimal fix is to initialize `key` before the `if` and still return something sensible when all attempts fail:

```diff
+    key = None
     if consecutive_failures != 5:
         with open("/tmp/data.csv", "w", newline="") as csvfile:
             writer = csv.writer(csvfile, delimiter=",")
             writer.writerow(["id", "client_send", "client_rcv"])
             for row in times:
                 writer.writerow(row)
         client = storage.storage.get_instance()
         filename = "results-{}.csv".format(request_id)
         key = client.upload(output_bucket, os.path.join(output_prefix, filename), "/tmp/data.csv")
     return {"result": key}
```

Separately, both Line 47 and Line 55 rely on a fixed `/tmp/data.csv` path. For concurrent invocations in the same runtime, that's prone to clobbering. If this benchmark may run concurrently, consider using a per-request filename (e.g., include `request_id`) or `tempfile.NamedTemporaryFile(delete=False)` to avoid collisions.

benchmarks/000.microbenchmarks/030.clock-synchronization/python/function.py (1)
38-41: Logic mismatch: the `consecutive_failures` check uses the wrong value

The loop breaks when `consecutive_failures == 7` (line 38), but the post-loop check uses `consecutive_failures != 5` (line 62). That condition will always be true when the connection fails, potentially causing the code to attempt writing an empty CSV. The check should use `7` to match the break condition.

```diff
-    if consecutive_failures != 5:
+    if consecutive_failures != 7:
```

Also applies to: 62-73
benchmarks/500.scientific/504.dna-visualisation/python/function.py (1)
1-49: Harden temporary file handling and tidy lint issues

Two concrete improvements here:

1. Safer `/tmp` path and file handling (Ruff S108): `download_path = "/tmp/{}".format(key)` takes `key` verbatim, and `open(download_path, "r").read()` leaves the handle for GC. This is inconsistent with patterns already established in similar handlers; several other benchmarks sanitize the path and use context managers:

```diff
-    key = event.get("object").get("key")
-    download_path = "/tmp/{}".format(key)
+    key = event.get("object").get("key")
+    safe_name = os.path.basename(key)
+    download_path = os.path.join("/tmp", safe_name)
@@
-    data = open(download_path, "r").read()
+    with open(download_path, "r") as f:
+        data = f.read()
```

2. Executable file without shebang (Ruff EXE002): this file is marked executable in git (100755) but lacks a shebang. Since this is a library-style module, drop the executable bit rather than adding a shebang.

The rest of the handler (download/transform/upload timing and return structure) looks consistent with other benchmarks.
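The `os.path.basename` sanitization suggested above can be sanity-checked on its own; a small sketch with hypothetical keys:

```python
import os

def safe_download_path(key):
    # basename strips any directory components the key smuggles in,
    # so the result can never escape the /tmp prefix.
    return os.path.join("/tmp", os.path.basename(key))

benign = safe_download_path("sequence.fasta")
hostile = safe_download_path("../../etc/passwd")
print(benign, hostile)
```

A traversal attempt like `../../etc/passwd` collapses to a plain filename under `/tmp`.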
♻️ Duplicate comments (5)
benchmarks/000.microbenchmarks/010.sleep/input.py (1)
1-11: Same unused-argument pattern as the server-reply input

This file follows the same structure as `040.server-reply/input.py`: only `size` is used; the other parameters are present to satisfy the harness signature. Any Ruff ARG001 warnings here can be handled the same way (an underscore prefix or `# noqa: ARG001` on the function) if you decide to keep the linter strict.

benchmarks/400.inference/413.image-classification/python/requirements.txt.3.10 (1)
1-4: Dependencies look consistent with other Python 3.9+ requirements files

The dependency versions are consistent with the other requirements files for Python 3.9+ versions. The verification of torch/torchvision compatibility requested in requirements.txt.arm.3.9 applies here as well.
benchmarks/400.inference/413.image-classification/python/requirements.txt.3.12 (1)
1-4: Dependencies are consistent across Python 3.9+ versions

The dependency versions match the other requirements files for Python 3.9+ versions, maintaining good consistency across the benchmark configurations.
benchmarks/600.linearalgebra/605.lu/input.py (1)
1-7: Same pattern as other linear-algebra inputs; apply the same Ruff fix

This follows the same `size_generators` + `{"size", "seed"}` pattern as `603.jacobi2d/input.py` and looks correct. You'll see the same ARG001 unused-argument hints here; the same remedy applies (a function-level `# noqa: ARG001` or `_`-prefixed parameter names) if you want a clean lint run.

benchmarks/600.linearalgebra/606.spmv/input.py (1)
1-7: SPMV config looks good; same note on unused arguments

The `size`/`seed`/`density` structure is clear and appropriate for SPMV. To avoid repeated ARG001 warnings for the unused parameters in `generate_input`, you can mirror the approach used in the other files (a function-level `# noqa: ARG001` or `_`-prefixed parameter names) without changing the external interface.
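As an illustration of the `_`-prefix remedy, here is a hypothetical generator signature (not this file's actual code) where the harness-required but unused parameters are renamed so linters treat them as intentionally unused:

```python
# Leading underscores mark parameters the harness passes but this
# benchmark does not use; ARG001-style lints skip such names.
def generate_input(
    _data_dir, size, _benchmarks_bucket, _input_paths,
    _output_paths, _upload_func, _nosql_func
):
    return {"size": size, "seed": 42}

# The call convention is unchanged: all seven arguments still exist.
config = generate_input("/data", 1024, None, None, None, None, None)
print(config)
```

The external interface stays identical; only the local parameter names change.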
🟠 Major comments (15)
benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/function.py-84-90 (1)
84-90: Missing fallback when the "size" key is absent causes a NameError

If `event` doesn't contain `"size"`, the variables `N`, `W`, `H`, `C1`, `C2` remain undefined, causing a `NameError` on line 94. Other benchmarks in the project have the same pattern, but consider adding validation or default values.

```diff
 def handler(event):
-
-    if "size" in event:
-        size = event["size"]
-        N = size["N"]
-        W = size["W"]
-        H = size["H"]
-        C1 = size["C1"]
-        C2 = size["C2"]
+    if "size" not in event:
+        raise ValueError("Missing required 'size' key in event")
+
+    size = event["size"]
+    N = size["N"]
+    W = size["W"]
+    H = size["H"]
+    C1 = size["C1"]
+    C2 = size["C2"]
```

Committable suggestion skipped: line range outside the PR's diff.
benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/requirements.txt-1-1 (1)
1-1: Pin the JAX version for benchmark reproducibility

Unpinned dependencies can introduce breaking changes and make benchmark results non-reproducible across different installations or time periods. For scientific benchmarks, version stability is essential.

Apply this diff to pin to a specific version:

```diff
-jax[cuda12]
+jax[cuda12]==0.4.35
```

Note: verify the appropriate version for your use case. To check the latest stable version, run:

```shell
curl -s https://pypi.org/pypi/jax/json | jq -r '.info.version'
```

benchmarks/400.inference/413.image-classification/python/package.sh-1-4 (1)
1-4: Add a shebang and fix shell-script robustness issues

The script is missing a shebang line, and `PACKAGE_DIR` is assigned but never used (you reference `$1` directly throughout). Additionally, `cd` commands should handle failures to prevent operations in the wrong directory.

```diff
+#!/bin/bash
+set -euo pipefail
+
 # Stripping package code is based on https://github.com/ryfeus/lambda-packs repo
 PACKAGE_DIR=$1
 echo "Original size $(du -sh $1 | cut -f1)"
```

Also consider using `$PACKAGE_DIR` instead of `$1` throughout, or remove the unused variable.

benchmarks/400.inference/412.language-bert/python/package.sh-1-7 (1)
1-7: Add a shebang and handle `cd` failures

Same issues as `413.image-classification/python/package.sh`: missing shebang, unused `PACKAGE_DIR` variable, and `cd` commands without error handling.

```diff
+#!/bin/bash
+set -euo pipefail
+
 # Stripping package code is based on https://github.com/ryfeus/lambda-packs repo
 PACKAGE_DIR=$1
 echo "Original size $(du -sh $1 | cut -f1)"
 CUR_DIR=$(pwd)
-cd $1
+cd "$1" || exit 1
```

benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/requirements.txt-1-1 (1)
1-1: Pin the JAX version for reproducibility

The unpinned JAX dependency can lead to non-reproducible benchmark results across different installations or over time. Apply this diff to pin the JAX version:

```diff
-jax[cuda12]
+jax[cuda12]==0.4.35
```

Note: verify the appropriate JAX version for your CUDA 12 setup and update accordingly.
benchmarks/400.inference/413.image-classification/python/init.sh-10-10 (1)
10-10: Quote shell variables to prevent word splitting and globbing

Unquoted variables `${path}` and `${DIR}` can cause failures if they contain spaces or special characters. Apply this diff:

```diff
-cp ${path} ${DIR}
+cp "${path}" "${DIR}"
```

benchmarks/100.webapps/120.uploader/python/function.py-17-19 (1)
17-19: Unsafe chained `.get()` calls could raise an AttributeError

If `event.get("bucket")` returns `None`, the subsequent `.get("bucket")` call will raise an `AttributeError`. The same issue applies to lines 18-19. Apply this diff to safely handle missing keys:

```diff
-    bucket = event.get("bucket").get("bucket")
-    output_prefix = event.get("bucket").get("output")
-    url = event.get("object").get("url")
+    bucket_info = event.get("bucket", {})
+    bucket = bucket_info.get("bucket")
+    output_prefix = bucket_info.get("output")
+    url = event.get("object", {}).get("url")
```

benchmarks/600.linearalgebra/603.jacobi2d/python/function.py-41-53 (1)
41-53: Dead code: the `seed` variable is set but never used

Same issue as in `604.cholesky/python/function.py`: the `seed` variable is assigned, but `initialize_torch()` doesn't accept a seed parameter. Remove the dead code or add seed support for consistency.

```diff
 def handler(event):
     size = event.get("size")
-    if "seed" in event:
-        import random
-
-        random.seed(event["seed"])
-
-    seed = event.get("seed", 42)
-    seed = int(seed)
 
     matrix_generating_begin = datetime.datetime.now()
```

benchmarks/600.linearalgebra/604.cholesky/python/function.py-34-45 (1)

34-45: Dead code: the `seed` variable is set but never used

The `seed` variable is assigned inside the `if` block, but `initialize_torch()` in this file doesn't accept a `seed` parameter. Either remove the dead code or add seed support to `initialize_torch()` for consistency with other benchmarks (e.g., 602.axpy, 607.fw).

```diff
 def handler(event):
     size = event.get("size")
-    if "seed" in event:
-        import random
-
-        random.seed(event["seed"])
-
-    seed = event.get("seed", 42)
-    seed = int(seed)
 
     gen_begin = datetime.datetime.now()
     A = initialize_torch(size, dtype=torch.float32, device="cuda")
```

Alternatively, if deterministic initialization is desired, add a `seed` parameter to `initialize_torch()`, similar to the pattern in `benchmarks/600.linearalgebra/607.fw/python/function.py`.

benchmarks/600.linearalgebra/605.lu/python/function.py-51-59 (1)
51-59: Same issues as `601.matmul`: potential `UnboundLocalError` and unused `seed`

The handler has the same problems noted in the matmul benchmark:

- If `"size"` is missing, `size` is `None` and line 62 fails.
- `seed` is extracted but never used for reproducibility.

See the fix suggested for `601.matmul/python/function.py`.

benchmarks/500.scientific/5xx.compute_jax_npbench/python/function.py-24-33 (1)
24-33: Potential `UnboundLocalError` if the `size` key is missing

If the event dict doesn't contain `"size"`, the variables `M`, `N`, and `size` remain undefined, causing a runtime error at line 33 or 59. Either raise an explicit error for missing input or provide defaults.

```diff
 def handler(event):
-
-    if "size" in event:
-        size = event["size"]
-        M = size["M"]
-        N = size["N"]
+    size = event.get("size")
+    if size is None:
+        raise ValueError("Missing required 'size' in event")
+    M = size["M"]
+    N = size["N"]
 
     generate_begin = datetime.datetime.now()
```

benchmarks/600.linearalgebra/601.matmul/python/function.py-32-44 (1)
32-44: Potential `UnboundLocalError` and unused `seed` variable

Two issues:

- If `"size"` is missing from the event, the `size` variable will be `None` and the unpacking at line 44 will fail.
- The `seed` variable is assigned but never used, unlike `602.axpy`, which passes it to `initialize_torch`.

```diff
 def handler(event):
     size = event.get("size")
+    if size is None:
+        raise ValueError("Missing required 'size' in event")
+
+    seed = 42
     if "seed" in event:
-        import random
-
-        random.seed(event["seed"])
         seed = event.get("seed", 42)
         seed = int(seed)
+        torch.manual_seed(seed)
+        torch.cuda.manual_seed_all(seed)
```

benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/function.py-239-246 (1)
239-246: `NameError` if the `"size"` key is missing from the event

If `"size"` is not in the event, the variables `ny`, `nx`, `nit`, `rho`, `nu`, `F`, and `size` are never assigned, causing a `NameError` on line 250 or 276. Consider providing defaults or raising a descriptive error.

```diff
-    if "size" in event:
-        size = event["size"]
-        ny = size["ny"]
-        nx = size["nx"]
-        nit = size["nit"]
-        rho = size["rho"]
-        nu = size["nu"]
-        F = size["F"]
+    if "size" not in event:
+        raise ValueError("Event must contain 'size' with keys: ny, nx, nit, rho, nu, F")
+    size = event["size"]
+    ny = size["ny"]
+    nx = size["nx"]
+    nit = size["nit"]
+    rho = size["rho"]
+    nu = size["nu"]
+    F = size["F"]
```

benchmarks/400.inference/411.image-recognition/python/function.py-14-14 (1)
14-14: `ZipFile` not properly closed and unsafe `extractall()`

The `ZipFile` object should use a context manager to ensure proper cleanup. Additionally, `extractall()` can be vulnerable to path-traversal (zip-slip) attacks.

```diff
-zipfile.ZipFile("function/torch.zip").extractall("/tmp/")
+with zipfile.ZipFile("function/torch.zip") as zf:
+    for name in zf.namelist():
+        if name.startswith("/") or ".." in name:
+            raise ValueError(f"Unsafe path in archive: {name}")
+    zf.extractall("/tmp/")
```

benchmarks/400.inference/413.image-classification/python/function.py-56-57 (1)
56-57: Unsafe `tarfile.extractall()` usage: path-traversal vulnerability

`extractall()` without validation can be exploited by malicious archives containing path-traversal sequences. Since this archive is sourced from user-provided storage (line 53), validate member paths before extraction.

The Python 3.12 `filter` parameter should not be relied upon as the sole mitigation; recent CVEs (June 2025) demonstrate bypasses in the filter implementation. Use manual path validation instead:

```diff
 with tarfile.open(archive_path, "r:gz") as tar:
+    for member in tar.getmembers():
+        if member.name.startswith("/") or ".." in member.name:
+            raise ValueError(f"Unsafe path in archive: {member.name}")
     tar.extractall(MODEL_DIRECTORY)
```
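The member-name guard can be exercised against an in-memory archive. A self-contained sketch (the member name and payload are hypothetical attacker input):

```python
import io
import tarfile

def safe_extract(tar_bytes, dest):
    # Reject members whose names could escape the destination directory,
    # then extract only if every member passed the check.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.name.startswith("/") or ".." in member.name:
                raise ValueError(f"Unsafe path in archive: {member.name}")
        tar.extractall(dest)

# Build a malicious archive in memory with a traversal member name.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="../evil.txt")
    payload = b"pwned"
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

try:
    safe_extract(buf.getvalue(), "/tmp/model")
    blocked = False
except ValueError:
    blocked = True
print(blocked)
```

The guard raises before any file is written, so the destination directory is never touched for a hostile archive.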
🟡 Minor comments (14)
benchmarks/400.inference/413.recommendation/python/package.sh-3-4 (1)
3-4: Use the assigned PACKAGE_DIR variable

Line 3 assigns `PACKAGE_DIR=$1`, but line 4 uses `$1` directly instead of `$PACKAGE_DIR`. This makes the variable assignment pointless. Apply this diff:

```diff
 PACKAGE_DIR=$1
-echo "DLRM GPU package size $(du -sh $1 | cut -f1)"
+echo "DLRM GPU package size $(du -sh "$PACKAGE_DIR" | cut -f1)"
```

Note: quotes were also added around the variable to handle paths with spaces.
.gitmodules-6-6 (1)
6-6: Update the documentation reference to match the new submodule URL

The new repository at https://github.com/McLavish/serverless-benchmarks-data-dphpc.git is accessible, and the change is already reflected in `install.py`. However, `docs/modularity.md` still references the old https://github.com/spcl/serverless-benchmarks-data URL, which will confuse contributors. Update the documentation to point to the new repository URL for consistency.

benchmarks/400.inference/412.language-bert/python/requirements.txt.3.11-1-3 (1)
1-3: Consider upgrading onnxruntime-gpu to a more recent stable version

Version 1.16.3 has known distribution and bug issues. While no specific CVE directly affects 1.16.3, related path-traversal vulnerabilities were patched in 1.16.2, and there are reported thread-safety and installation problems with 1.16.3. Upgrading to the latest stable release would address these maintenance and stability concerns.

The numpy 1.24.4 and tokenizers 0.13.3 versions have no published security advisories. The `.3.11` filename extension is a deliberate naming convention used consistently across this project for Python-version-specific requirements files.

benchmarks/500.scientific/504.dna-visualisation/input.py-9-20 (1)
9-20: Guard against missing `.fasta` files before using `data`

`data` is first assigned inside the `for file in glob.glob(...):` loop (line 13) but is used after the loop to populate `input_config["object"]["key"]` (line 17). If there are no `*.fasta` files in `data_dir`, `data` will be undefined and this function will crash.

Consider explicitly tracking whether any file was found, e.g.:

```diff
 def generate_input(
     data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
 ):
-
+    data = None
     for file in glob.glob(os.path.join(data_dir, "*.fasta")):
         data = os.path.relpath(file, data_dir)
         upload_func(0, data, file)
 
+    if data is None:
+        raise FileNotFoundError(f"No .fasta files found in {data_dir!r}")
+
     input_config = {"object": {}, "bucket": {}}
     input_config["object"]["key"] = data
```

This keeps current behavior when data exists and fails fast with a clearer error when the dataset is missing.
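The guard's failure mode can be verified in isolation; a stand-alone sketch using an empty temporary directory (the helper name is hypothetical):

```python
import glob
import os
import tempfile

def find_fasta(data_dir):
    # Track whether the glob matched anything; fail fast otherwise.
    data = None
    for file in glob.glob(os.path.join(data_dir, "*.fasta")):
        data = os.path.relpath(file, data_dir)
    if data is None:
        raise FileNotFoundError(f"No .fasta files found in {data_dir!r}")
    return data

empty_dir = tempfile.mkdtemp()
try:
    find_fasta(empty_dir)
    raised = False
except FileNotFoundError:
    raised = True
print(raised)
```

Without the `data is None` check the same call would instead die with an opaque `UnboundLocalError`.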
benchmarks/300.utilities/311.compression/input.py-17-31 (1)
17-31: Orphaned docstring not attached to the function

The docstring at lines 18-26 is floating as a module-level string literal and is not associated with `generate_input`. It should be placed immediately after the function definition.

```diff
-"""
-    Generate test, small and large workload for compression test.
-
-    :param data_dir: directory where benchmark data is placed
-    :param size: workload size
-    :param input_buckets: input storage containers for this benchmark
-    :param output_buckets:
-    :param upload_func: upload function taking three params(bucket_idx, key, filepath)
-"""
-
-
 def generate_input(
     data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
 ):
+    """Generate test, small and large workload for compression test.
+
+    :param data_dir: directory where benchmark data is placed
+    :param size: workload size
+    :param input_buckets: input storage containers for this benchmark
+    :param output_buckets:
+    :param upload_func: upload function taking three params(bucket_idx, key, filepath)
+    """
```

benchmarks/100.webapps/120.uploader/input.py-18-25 (1)
18-25: Align parameter names with the standard convention across benchmarks

This function uses `input_buckets` and `output_buckets`, while virtually all other benchmarks (compression, image-recognition, recommendation, thumbnailer, video-processing, linear-algebra, and others) use `input_paths` and `output_paths`. Additionally, `input_buckets` is not used in the function body. Consider renaming to match the standard convention for consistency, or document the intentional semantic difference if buckets are meant to convey different semantics than paths.

benchmarks/400.inference/413.image-classification/python/package.sh-28-32 (1)
28-32: Align torch packaging logic between the two benchmark scripts

The scripts use fundamentally different approaches for conditional torch packaging despite similar structure. Script 413 uses environment variables (`PLATFORM`, `PYTHON_VERSION`) while 412 checks directory existence (`[ -d "$1/${TORCH_DIR}" ]`). Additionally, they zip different paths (`$1/torch` vs. `python_packages/lib/site-packages/torch`), use different working directories, and report sizes differently:

- 413 does not `cd` into `$1` before zipping and reports size from the original working directory.
- 412 `cd`s into `$1` before zipping and reports size from the package directory.

This inconsistency in conditional logic and path handling makes the scripts harder to maintain. Standardize both scripts on a consistent approach (prefer directory-existence checks over environment variables, as 412 does; this is more robust) and ensure they handle paths and size reporting uniformly.
benchmarks/300.utilities/320.monte-carlo-pi/input.py-1-1 (1)
1-1: Stale/incorrect path in comment

The comment references `000.microbenchmarks/060.monte-carlo-pi`, but this file is located at `300.utilities/320.monte-carlo-pi`.

```diff
-# benchmarks/000.microbenchmarks/060.monte-carlo-pi/input.py
+# benchmarks/300.utilities/320.monte-carlo-pi/input.py
```

benchmarks/600.linearalgebra/601.matmul/python/function.py-56-62 (1)
56-62: Missing `gpu_time` in the return value, inconsistent with peer benchmarks

The `gpu_ms` timing from CUDA events is computed but not returned. The other benchmarks in `600.linearalgebra` (cholesky, lu, axpy, jacobi2d, fw) all include `gpu_time` in their measurement output.

```diff
     return {
-        # "result": result[0],
         "measurement": {
             "generating_time": matrix_generating_time,
             "compute_time": matmul_time,
+            "gpu_time": gpu_ms,
         },
     }
```

benchmarks/300.utilities/320.monte-carlo-pi/python/function.py-1-1 (1)
1-1: Incorrect file path in comment

The comment references `000.microbenchmarks/060.monte-carlo-pi`, but the actual file path is `300.utilities/320.monte-carlo-pi`.

```diff
-# benchmarks/000.microbenchmarks/060.monte-carlo-pi/python/function.py
+# benchmarks/300.utilities/320.monte-carlo-pi/python/function.py
```

benchmarks/400.inference/411.image-recognition/python/function.py-29-29 (1)
29-29: File handle not closed

Same issue as the other benchmark file.

```diff
-class_idx = json.load(open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r"))
+with open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r") as f:
+    class_idx = json.load(f)
```

benchmarks/400.inference/413.image-classification/python/function.py-18-18 (1)
18-18: File handle not closed

Using `open()` without a context manager leaves the file handle open. Use a `with` statement.

```diff
-class_idx = json.load(open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r"))
+with open(os.path.join(SCRIPT_DIR, "imagenet_class_index.json"), "r") as f:
+    class_idx = json.load(f)
```

benchmarks/400.inference/411.image-recognition/python/function.py-36-40 (1)
36-40: `AttributeError` if event keys are missing

Unlike the 413.image-classification handler, this uses `.get("bucket").get(...)` without a default, which will raise an `AttributeError` if the outer key is missing.

```diff
-    bucket = event.get("bucket").get("bucket")
-    input_prefix = event.get("bucket").get("input")
-    model_prefix = event.get("bucket").get("model")
-    key = event.get("object").get("input")
-    model_key = event.get("object").get("model")
+    bucket_info = event.get("bucket") or {}
+    object_info = event.get("object") or {}
+    bucket = bucket_info.get("bucket")
+    input_prefix = bucket_info.get("input")
+    model_prefix = bucket_info.get("model")
+    key = object_info.get("input")
+    model_key = object_info.get("model")
```

benchmarks/400.inference/413.image-classification/python/function.py-139-144 (1)
139-144: Potential `AttributeError` on missing event keys

Chained `.get().get()` will raise an `AttributeError` if the first `.get()` returns `None`. Use defensive access or validate the event structure.

```diff
-    bucket = event.get("bucket", {}).get("bucket")
-    input_prefix = event.get("bucket", {}).get("input")
-    model_prefix = event.get("bucket", {}).get("model")
-    key = event.get("object", {}).get("input")
-    model_key = event.get("object", {}).get("model")
+    bucket_info = event.get("bucket") or {}
+    object_info = event.get("object") or {}
+    bucket = bucket_info.get("bucket")
+    input_prefix = bucket_info.get("input")
+    model_prefix = bucket_info.get("model")
+    key = object_info.get("input")
+    model_key = object_info.get("model")
```
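The behavioral difference between the two access styles can be checked directly; the event payloads below are hypothetical and purely for illustration:

```python
def read_model_key(event):
    # A dict fallback at the outer level means a missing key yields
    # None instead of raising AttributeError on None.get(...).
    object_info = event.get("object") or {}
    return object_info.get("model")

ok = read_model_key({"object": {"model": "resnet50.tar.gz"}})
missing = read_model_key({})
print(ok, missing)
```

With the naive `event.get("object").get("model")` form, the second call would raise `AttributeError` instead of returning `None`.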
🧹 Nitpick comments (39)
benchmarks/000.microbenchmarks/010.sleep/python/function.py (1)
4-9: Formatting change looks good; consider validating the `sleep` input

The switch to double quotes is fine and behavior-preserving. However, `event.get("sleep")` can return `None` (or a non-numeric value), which will cause `sleep()` to raise a `TypeError` at runtime. If you want a clearer error and a stricter contract for this microbenchmark, you could validate the parameter before calling `sleep`:

```diff
 def handler(event):
     # start timing
-    sleep_time = event.get("sleep")
-    sleep(sleep_time)
-    return {"result": sleep_time}
+    sleep_time = event.get("sleep")
+    if sleep_time is None:
+        raise ValueError("Missing 'sleep' key in event")
+
+    try:
+        sleep_time = float(sleep_time)
+    except (TypeError, ValueError) as exc:
+        raise ValueError("'sleep' must be a number") from exc
+
+    sleep(sleep_time)
+    return {"result": sleep_time}
```

benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/input.py (1)
8-17: Handle Ruff ARG001 unused-argument warnings in `generate_input` (optional)

The full parameter list is needed to match the common benchmark input interface, so leaving these arguments unused is fine. If you want to keep Ruff quiet, you could either (a) prefix unused parameters with `_` or (b) add a `# noqa: ARG001` on the function definition; both are purely cosmetic and don't affect behavior.

benchmarks/500.scientific/5xx.deep_learning_resnet_jax_npbench/python/function.py (3)
13-36: Consider using JAX's built-in convolution operations.The manual implementation using nested
lax.scanis functional but less efficient than JAX's optimizedlax.conv_general_dilatedorjax.scipy.signal.correlate2d. If the goal is benchmarking convolution performance rather than the scan pattern itself, built-in ops would better represent real-world usage.
68-79: Minor: Prefer explicit numpy dtype for numpy operations.Using
jnp.float32with numpy's random generator works (since it resolves tonumpy.float32), but usingnp.float32or the string"float32"would be clearer and avoid confusion about library boundaries.def initialize(N, W, H, C1, C2): from numpy.random import default_rng + import numpy as np rng = default_rng(42) # Input - input = rng.random((N, H, W, C1), dtype=jnp.float32) + input = rng.random((N, H, W, C1), dtype=np.float32) # Weights - conv1 = rng.random((1, 1, C1, C2), dtype=jnp.float32) - conv2 = rng.random((3, 3, C2, C2), dtype=jnp.float32) - conv3 = rng.random((1, 1, C2, C1), dtype=jnp.float32) + conv1 = rng.random((1, 1, C1, C2), dtype=np.float32) + conv2 = rng.random((3, 3, C2, C2), dtype=np.float32) + conv3 = rng.random((1, 1, C2, C1), dtype=np.float32) return (input, conv1, conv2, conv3)
109-112: Broad exception handling silently swallows errors.

Catching bare `Exception` and passing silently can mask real issues. Consider logging the exception or catching a more specific exception type (e.g., `jax.errors.TracerArrayConversionError`).

     try:
         results = jax.device_get(results)
-    except Exception:
-        pass
+    except (AttributeError, TypeError):
+        # results may already be a host array
+        pass

benchmarks/400.inference/413.image-classification/python/requirements.txt.3.11 (1)
1-4: Consider pinning all versions for benchmark reproducibility.

The mixed versioning strategy (>= for numpy/pillow, == for torch/torchvision) may reduce benchmark reproducibility. Additionally, verify that torch 2.5.1 and torchvision 0.20.1 are compatible with each other and with Python 3.11.

Consider pinning specific versions for numpy and pillow:

-numpy>=2.0
-pillow>=10.0
+numpy==2.1.3
+pillow==11.0.0
 torch==2.5.1
 torchvision==0.20.1

To verify torch/torchvision compatibility:

Are PyTorch 2.5.1 and torchvision 0.20.1 compatible with each other?

benchmarks/400.inference/412.language-bert/python/requirements.txt.3.10 (1)
1-3: No security vulnerabilities found; consider updating to newer versions.

Verification confirms that numpy 1.24.4, onnxruntime-gpu 1.16.3, and tokenizers 0.13.3 have no known CVEs and support Python 3.10 (numpy requires >=3.8). However, significantly newer versions are available: numpy 2.3.5, onnxruntime-gpu 1.23.2, and tokenizers 0.22.1. While the current pinned versions are not vulnerable, updating to newer releases would improve compatibility and maintenance.
.pre-commit-config.yaml (1)
1-30: Pre-commit hook setup looks consistent and well-scoped

Local flake8/black hooks with pinned versions and `files: ^(sebs/|benchmarks/)` are a solid setup; nothing blocking here. Only minor nit: the `flake8 (project env)` name is slightly misleading given `language: python` (pre-commit venv), but that's purely cosmetic.

benchmarks/000.microbenchmarks/040.server-reply/input.py (1)
1-11: generate_input matches suite pattern; consider handling unused-arg linting

The `size_generators` mapping and `{"sleep": size_generators[size]}` shape are consistent with other microbenchmark inputs. All parameters except `size` are intentionally unused here, so tools like Ruff (ARG001) will complain. If you plan to enforce those checks, consider either prefixing unused parameters with `_` or adding a `# noqa: ARG001` on the function definition, rather than changing the signature shape expected by the harness.

benchmarks/400.inference/413.image-classification/python/package.sh (1)
16-17: Use null-delimited find/xargs for filenames with special characters.

Files with spaces or special characters will cause issues with the current approach.

-find -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
-find -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" | xargs strip
+find . -name "*.so" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 -r strip
+find . -name "*.so.*" -not -path "*/PIL/*" -not -path "*/Pillow.libs/*" -not -path "*libgfortran*" -print0 | xargs -0 -r strip

.github/workflows/lint.yml (1)
3-5: Consider adding branch filters to reduce unnecessary CI runs.

The workflow triggers on all pushes and pull requests. You may want to filter to specific branches (e.g., master, main) to avoid running on every feature branch push.

 on:
-  push:
-  pull_request:
+  push:
+    branches: [master, main]
+  pull_request:
+    branches: [master, main]

benchmarks/400.inference/412.language-bert/python/package.sh (1)
25-34: Quote variables and handle `cd` failures in the torch zipping section.

-cd ${CUR_DIR}
+cd "${CUR_DIR}" || exit 1
 echo "Stripped size $(du -sh $1 | cut -f1)"

 TORCH_DIR=".python_packages/lib/site-packages/torch"
 if [ -d "$1/${TORCH_DIR}" ]; then
-    cd $1
-    zip -qr torch.zip ${TORCH_DIR}
-    rm -rf ${TORCH_DIR}
-    cd ${CUR_DIR}
+    cd "$1" || exit 1
+    zip -qr torch.zip "${TORCH_DIR}"
+    rm -rf "${TORCH_DIR}"
+    cd "${CUR_DIR}" || exit 1
     echo "Torch-zipped size $(du -sh $1 | cut -f1)"
 fi

benchmarks/500.scientific/5xx.compute_jax_npbench/python/requirements.txt (1)
1-1: Consider pinning JAX and confirming CUDA12 is available where this runs.

Using `jax[cuda12]` without a version means future JAX releases (and CUDA/CuDNN constraints) can change behavior over time, which hurts reproducibility. It also assumes CUDA 12 drivers/libs are present on all runners that execute this benchmark. If that assumption doesn't always hold, you may want a CPU-only installation for those environments instead, or at least pin a tested JAX version in a constraints file.

benchmarks/400.inference/413.image-classification/python/requirements.txt.3.9 (1)
1-4: Consider tightening numpy/pillow versions and verify PyTorch stack compatibility

You pin `torch`/`torchvision` exactly but allow `numpy>=2.0` and `pillow>=10.0`. For a benchmark repo, you may want either exact pins or at least upper bounds (e.g., `<3.0`, `<11.0`) to avoid silent breakage from future major releases, and to keep results reproducible. Also please double‑check that `torch==2.5.1`/`torchvision==0.20.1` and the chosen NumPy/Pillow versions are available and supported for your target Python 3.9 + CUDA/runtime combo.

benchmarks/400.inference/413.image-classification/python/imagenet_class_index.json (1)
1-1: Consider formatting JSON for readability.

The entire 1000-entry JSON is on a single line, making it difficult to read, diff, or maintain. Consider pretty-printing this file with proper indentation.
You can reformat using:
#!/bin/bash
# Pretty-print the JSON file
python -m json.tool benchmarks/400.inference/413.image-classification/python/imagenet_class_index.json > temp.json && mv temp.json benchmarks/400.inference/413.image-classification/python/imagenet_class_index.json

benchmarks/500.scientific/503.graph-bfs/input.py (1)
7-7: Consider adding KeyError handling for invalid size values.

If an invalid `size` is passed, `size_generators[size]` will raise a `KeyError`. Adding validation or a default would improve robustness.

-    return {"size": size_generators[size], "seed": 42}
+    if size not in size_generators:
+        raise ValueError(f"Invalid size '{size}'. Valid options: {list(size_generators.keys())}")
+    return {"size": size_generators[size], "seed": 42}

benchmarks/600.linearalgebra/604.cholesky/input.py (1)
7-7: Consider adding KeyError handling for invalid size values.

If an invalid `size` is passed, `size_generators[size]` will raise a `KeyError`. Adding validation improves robustness.

-    return {"size": size_generators[size], "seed": 42}
+    if size not in size_generators:
+        raise ValueError(f"Invalid size '{size}'. Valid options: {list(size_generators.keys())}")
+    return {"size": size_generators[size], "seed": 42}

benchmarks/500.scientific/5xx.channel_flow_jax_npbench/input.py (1)
17-17: Consider adding KeyError handling for invalid size values.

If an invalid `size` is passed, `size_generators[size]` will raise a `KeyError`. Adding validation improves robustness.

-    return {"size": size_generators[size]}
+    if size not in size_generators:
+        raise ValueError(f"Invalid size '{size}'. Valid options: {list(size_generators.keys())}")
+    return {"size": size_generators[size]}

benchmarks/600.linearalgebra/601.matmul/input.py (1)
7-7: Consider adding KeyError handling for invalid size values.

If an invalid `size` is passed, `size_generators[size]` will raise a `KeyError`. Adding validation improves robustness.

-    return {"size": size_generators[size], "seed": 42}
+    if size not in size_generators:
+        raise ValueError(f"Invalid size '{size}'. Valid options: {list(size_generators.keys())}")
+    return {"size": size_generators[size], "seed": 42}

benchmarks/400.inference/413.image-classification/python/init.sh (1)
1-10: Consider adding error handling for missing files.

The script doesn't check if `imagenet_class_index.json` exists before attempting to copy it, which could lead to silent failures.

 #!/bin/bash
+set -e

 DIR=$1
 VERBOSE=$2
 SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
 path="${SCRIPT_DIR}/imagenet_class_index.json"
+
+if [ ! -f "${path}" ]; then
+    echo "Error: ${path} not found" >&2
+    exit 1
+fi
+
 if [ "$VERBOSE" = true ]; then
     echo "Update ${DIR} with json ${path}"
 fi
 cp "${path}" "${DIR}"

benchmarks/400.inference/411.image-recognition/input.py (1)
10-15: Tighten loop var naming and consider silencing unused-arg lints

- Line 10: `dirs` is never used; Ruff's B007 is valid. Renaming to `_dirs` (or `_`) keeps intent clear and silences the warning:

-    for root, dirs, files in os.walk(data_dir):
+    for root, _dirs, files in os.walk(data_dir):

- Lines 29-31: `size`, `output_paths`, and `nosql_func` are unused but are part of the common `generate_input` interface. If you want Ruff clean while keeping the signature, you can either prefix them with `_` or add a function-level noqa:

-def generate_input(
-    data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
-):
+def generate_input(  # noqa: ARG001
+    data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
+):

Optionally, if you intend to exercise more than the first image, you might later extend `["object"]["input"]` beyond `input_images[0][0]`, but that's non-blocking here.

benchmarks/600.linearalgebra/603.jacobi2d/input.py (1)
1-7: Interface looks good; consider addressing unused-arg warnings

The `size_generators` lookup and `{"size": ..., "seed": 42}` shape match other linear algebra benchmarks and look fine.

To keep Ruff happy without changing the public interface, you can mark the function as intentionally ignoring most parameters:

-def generate_input(
-    data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
-):
+def generate_input(  # noqa: ARG001
+    data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
+):

(or alternatively prefix unused parameters with `_`).

benchmarks/100.webapps/110.dynamic-html/input.py (1)
1-9: Input generator is fine; align unused-arg handling with the rest

Behavior here is simple and clear: `random_len` comes from `size_generators[size]` and a fixed `"username"` is set; that matches patterns in other webapp benchmarks.

As with other `generate_input` helpers, most parameters are unused. To avoid repeated Ruff ARG001 warnings while keeping the shared interface, consider:

-def generate_input(
-    data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
-):
+def generate_input(  # noqa: ARG001
+    data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
+):

benchmarks/600.linearalgebra/602.axpy/input.py (1)
1-7: AXPY input helper matches the common pattern; consider a shared approach to ARG001

This matches the other linear algebra inputs (`matmul`, `jacobi2d`, etc.): sizes via `size_generators` and a fixed seed, which looks correct.

Since this same `generate_input` signature is reused across many benchmarks and most arguments are intentionally unused, you might standardize how you silence Ruff, e.g.:

- Add `# noqa: ARG001` on each `generate_input` definition, or
- Configure Ruff to ignore `ARG001` for these files/patterns if you prefer to keep signatures untouched, or
- Prefix the unused parameters with `_`.

Any of these will keep the interface consistent while avoiding noisy lints.
1-17: Consistent NPBench sizing; same minor unused-arg considerationThe
"test"/"small"/"large"shapes forM/Nalign with the other JAX NPBench inputs, andgenerate_inputreturning{"size": ...}is consistent.As with other
generate_inputfunctions, the extra parameters are unused by design. If Ruff ARG001 is enforced here too, consider a function-level# noqa: ARG001or_-prefixing the unused parameters to keep lint noise down without changing the call contract.benchmarks/400.inference/412.language-bert/python/function.py (3)
62-66: Consider supporting CPU fallback for environments without CUDA.

Requiring `CUDAExecutionProvider` will cause the function to fail in CPU-only environments. If this is intentional for GPU benchmarking, consider documenting this constraint in the config or adding a more informative error message.
134-144: Add `strict=True` to `zip()` for safer iteration.

Using `strict=True` ensures `sentences` and `probabilities` have matching lengths, catching potential mismatches early.

-    for sentence, probs in zip(sentences, probabilities):
+    for sentence, probs in zip(sentences, probabilities, strict=True):
107-119: Missing validation for required event keys.

If `bucket`, `model_prefix`, `text_prefix`, or `text_key` are `None`, the function will fail with an unclear error during download. Consider adding explicit validation.

 def handler(event):
     bucket = event.get("bucket", {}).get("bucket")
     model_prefix = event.get("bucket", {}).get("model")
     text_prefix = event.get("bucket", {}).get("text")
     text_key = event.get("object", {}).get("input")
+
+    if not all([bucket, model_prefix, text_prefix, text_key]):
+        raise ValueError("Missing required event keys: bucket, model, text, or input")

     download_begin = datetime.datetime.now()

benchmarks/600.linearalgebra/602.axpy/python/function.py (1)
47-48: Prefix unused variable with underscore.

The `y_out` result is unused. Prefix it to signal intentional discard.

-    y_out, gpu_ms = kernel_axpy(alpha, x, y, reps=100)
+    _y_out, gpu_ms = kernel_axpy(alpha, x, y, reps=100)

benchmarks/600.linearalgebra/604.cholesky/python/function.py (1)
49-50: Prefix unused variable with underscore.

The `L` result from `kernel_cholesky` is unused. Prefix it to signal intentional discard.

-    L, gpu_ms = kernel_cholesky(A)
+    _L, gpu_ms = kernel_cholesky(A)

benchmarks/600.linearalgebra/603.jacobi2d/python/function.py (2)
56-63: Misleading variable names: `matmul_begin`/`matmul_end` for Jacobi iteration.

The variable names suggest matrix multiplication, but this is a Jacobi 2D stencil operation. Rename for clarity.

-    matmul_begin = datetime.datetime.now()
+    compute_begin = datetime.datetime.now()
     A_out, B_out, gpu_ms = kernel_jacobi2d(A, B, iters=50)
-    matmul_end = datetime.datetime.now()
+    compute_end = datetime.datetime.now()

     matrix_generating_time = (matrix_generating_end - matrix_generating_begin) / datetime.timedelta(
         microseconds=1
     )
-    matmul_time = (matmul_end - matmul_begin) / datetime.timedelta(microseconds=1)
+    compute_time = (compute_end - compute_begin) / datetime.timedelta(microseconds=1)
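The timing idiom used throughout these benchmarks, dividing one `timedelta` by `timedelta(microseconds=1)` to get a float microsecond count, can be sketched standalone; the variable names below mirror the suggestion but are illustrative, and `time.sleep` merely stands in for the measured kernel:

```python
import datetime
import time

compute_begin = datetime.datetime.now()
time.sleep(0.01)  # stand-in for the measured kernel
compute_end = datetime.datetime.now()

# Dividing two timedeltas yields a plain float:
# the elapsed time expressed in microseconds.
compute_time = (compute_end - compute_begin) / datetime.timedelta(microseconds=1)
```

This avoids manual unit conversion from `timedelta.total_seconds()` and keeps the result in the microsecond units the benchmark reports.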
57-58: Prefix unused variables with underscores.
`A_out` and `B_out` are unused. Prefix them to signal intentional discard.

-    A_out, B_out, gpu_ms = kernel_jacobi2d(A, B, iters=50)
+    _A_out, _B_out, gpu_ms = kernel_jacobi2d(A, B, iters=50)

benchmarks/400.inference/413.image-classification/input.py (2)
18-27: Misplaced docstring: move inside the function.

The docstring appears as a standalone string literal at module level rather than as the function's actual docstring. Move it inside `generate_input` immediately after the `def` line.

-"""
-    Generate test, small and large workload for compression test.
-
-    :param data_dir: directory where benchmark data is placed
-    :param size: workload size
-    :param input_buckets: input storage containers for this benchmark
-    :param output_buckets:
-    :param upload_func: upload function taking three params(bucket_idx, key, filepath)
-"""
-
-
 def generate_input(
     data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
 ):
-
+    """
+    Generate test, small and large workload for compression test.
+
+    :param data_dir: directory where benchmark data is placed
+    :param size: workload size
+    :param input_buckets: input storage containers for this benchmark
+    :param output_buckets:
+    :param upload_func: upload function taking three params(bucket_idx, key, filepath)
+    """
     # upload model
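The placement matters because Python only binds a string literal to `__doc__` when it is the first statement inside the `def`; a bare string sitting before the function is evaluated and discarded. A small sketch with invented function names:

```python
"""This module-level string is NOT attached to the function below."""


def with_docstring():
    """Generate workload (this one IS the function's docstring)."""
    return None


def without_docstring():
    # No leading string literal, so __doc__ stays None.
    return None
```

Tools like `help()`, Sphinx, and IDE tooltips all read `__doc__`, so the misplaced string is effectively invisible to them.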
8-15: Prefer `os.path.join` for path consistency.

Line 13 uses string concatenation (`prefix + "/" + file`) while line 14 correctly uses `os.path.join`. This inconsistency could cause issues on non-Unix systems and reduces readability.

 def upload_files(data_root, data_dir, upload_func):
-
-    for root, dirs, files in os.walk(data_dir):
+    for root, _, files in os.walk(data_dir):
         prefix = os.path.relpath(root, data_root)
         for file in files:
-            file_name = prefix + "/" + file
+            file_name = os.path.join(prefix, file)
             filepath = os.path.join(root, file)
             upload_func(0, file_name, filepath)

benchmarks/500.scientific/5xx.compute_jax_npbench/python/function.py (1)
48-51: Silent exception swallowing obscures failures.

The bare `except Exception: pass` hides potential issues. If `device_get` fails for reasons other than "already on host," the error is silently ignored. Consider catching a more specific exception or logging when an exception occurs.

     try:
         results = jax.device_get(results)
-    except Exception:
-        pass
+    except (TypeError, AttributeError):
+        # results already on host or not a device array
+        pass

benchmarks/600.linearalgebra/605.lu/python/function.py (1)
18-26: Python loops defeat GPU acceleration.
`_kernel_lu` uses explicit Python `for` loops with element-wise tensor indexing. This executes on CPU with GPU round-trips, negating any CUDA performance benefit. For GPU-accelerated LU, use `torch.linalg.lu_factor` or similar.

-def _kernel_lu(B: torch.Tensor) -> torch.Tensor:
-    n = B.shape[0]
-    for i in range(n):
-        for j in range(i):
-            B[i, j] = B[i, j] - (B[i, :j] @ B[:j, j])
-            B[i, j] = B[i, j] / B[j, j]
-        for j in range(i, n):
-            B[i, j] = B[i, j] - (B[i, :i] @ B[:i, j])
-    return B
+def _kernel_lu(A: torch.Tensor) -> torch.Tensor:
+    LU, pivots = torch.linalg.lu_factor(A)
+    return LU

If the custom algorithm is intentional for educational purposes, consider documenting this and using CPU tensors to avoid the GPU overhead.

benchmarks/400.inference/413.recommendation/python/function.py (1)
benchmarks/400.inference/413.recommendation/python/function.py (1)
121-131: Addstrict=Truetozipfor safety.If
payloadsandscoreshave mismatched lengths due to a bug, the current code silently truncates. Usingstrict=True(Python 3.10+) would catch this.- for req, score in zip(payloads, scores): + for req, score in zip(payloads, scores, strict=True):benchmarks/400.inference/411.image-recognition/python/function.py (1)
55-55: Deprecated `pretrained` parameter.

The `pretrained` parameter is deprecated in recent torchvision versions. Consider using `weights=None` instead.

-    model = resnet50(pretrained=False)
+    model = resnet50(weights=None)

benchmarks/500.scientific/5xx.channel_flow_jax_npbench/python/function.py (1)
265-268: Bare `except` clause suppresses all errors.

Catching `Exception` and silently passing can hide bugs. If `device_get` isn't needed for this result type, consider removing the try-except or logging the exception.

     try:
         results = jax.device_get(results)
-    except Exception:
-        pass
+    except TypeError:
+        # results is already a Python scalar, not a JAX array
+        pass
"""
    Generate test, small and large workload for thumbnailer.

    :param data_dir: directory where benchmark data is placed
    :param size: workload size
    :param input_buckets: input storage containers for this benchmark
    :param output_buckets:
    :param upload_func: upload function taking three params(bucket_idx, key, filepath)
'''
def generate_input(data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func):
    for file in glob.glob(os.path.join(data_dir, '*.mp4')):
"""
Fix docstring: incorrect benchmark name and outdated parameter documentation.
The docstring was not properly updated from the thumbnailer benchmark. It references "thumbnailer" and documents parameters as input_buckets / output_buckets, but the actual function uses input_paths and output_paths.
Apply this diff to correct the docstring:
-"""
- Generate test, small and large workload for thumbnailer.
-
- :param data_dir: directory where benchmark data is placed
- :param size: workload size
- :param input_buckets: input storage containers for this benchmark
- :param output_buckets:
- :param upload_func: upload function taking three params(bucket_idx, key, filepath)
-"""
+"""
+ Generate test, small and large workload for video processing.
+
+ :param data_dir: directory where benchmark data is placed
+ :param size: workload size
+ :param benchmarks_bucket: bucket name for benchmark data
+ :param input_paths: input storage paths for this benchmark
+ :param output_paths: output storage paths
+ :param upload_func: upload function taking three params(bucket_idx, key, filepath)
+ :param nosql_func: NoSQL function (currently unused)
+"""
🤖 Prompt for AI Agents
In benchmarks/200.multimedia/220.video-processing/input.py around lines 9 to 17,
the docstring still refers to the "thumbnailer" benchmark and lists parameters
input_buckets/output_buckets which are outdated; update the docstring to
reference the video-processing benchmark (or the correct benchmark name),
replace input_buckets/output_buckets with input_paths/output_paths, and ensure
the param descriptions match the actual function signature (data_dir, size,
input_paths, output_paths, upload_func) and their meanings.
with tarfile.open(archive_path, "r:gz") as tar:
    tar.extractall(MODEL_DIRECTORY)
os.remove(archive_path)
🧩 Analysis chain
🏁 Script executed:
find . -name "function.py" -path "*412.language-bert*" | head -5

Repository: spcl/serverless-benchmarks
Length of output: 196
🏁 Script executed:
cat -n "benchmarks/400.inference/412.language-bert/python/function.py" | sed -n '35,50p'

Repository: spcl/serverless-benchmarks
Length of output: 974
🌐 Web query:
Python tarfile extractall filter parameter security best practice
💡 Result:
Short answer — treat tarfile.extractall(filter=...) as unsafe for untrusted archives until patched. Do not rely on the built‑in "data"/"tar" filters alone; instead validate/sanitize members yourself, extract into a temporary directory, and enforce path/symlink checks. See Python security advisories and CVE notes. [2][3][0]
Recommended minimal best practices
- Avoid extracting untrusted tarballs when possible; prefer not accepting them at all. [3]
- If you must extract, extract to an empty temp directory and only move vetted files into place after checks. [3][0]
- Do not rely solely on the built‑in filter strings ("data"/"tar"); they have had bypasses (CVE advisories). [0][3][6]
- Skip/deny special members: absolute paths, path traversal ("../"), hardlinks, symlinks, device nodes, and members that would change metadata outside target. [2][8]
- Verify final extracted path stays inside intended directory using realpath/resolve checks before writing. [2]
Minimal safe extraction pattern (concept)
- Open tar file (prefer streaming).
- For each TarInfo member:
- compute dest_path = (target_dir / member.name).resolve()
- if not str(dest_path).startswith(str(target_dir.resolve()) + os.sep): skip/raise (prevents traversal).
- if member.issym() or member.isdev() or member.islnk(): skip/deny.
- for regular files: extract to a temp file, set safe permissions, then atomically rename into place.
- After all members vetted and extracted to temp dir, move accepted files into final destination.
(These steps implement the checks the filter parameter was intended to help with but must be done explicitly.) [2][8][0]
Example (short pseudo/Python sketch — adapt for your use):
- Use tarfile.open(..., "r:*") and iterate tar.getmembers(). For each member:
- target = (target_dir / member.name).resolve()
- if not str(target).startswith(str(target_dir.resolve()) + os.sep): continue # skip traversal
- if member.issym() or member.isdev() or member.islnk(): continue # skip dangerous types
- if member.isreg(): with tar.extractfile(member) as src, write to a temp file, fsync, then rename.
This pattern enforces path and type checks before creating filesystem entries. [2][8]
Patching and monitoring
- Track Python security releases and apply patches that fix tarfile filter bypasses; CVEs were published for these issues. If a Python release explicitly fixes the issue, upgrade. Meanwhile, implement the manual checks above. [3][0][5]
Sources
- Python tarfile docs (extraction filters, defaults). [2]
- NVD / CVE advisory for tarfile extraction filter bypass. [3]
- Public writeups/CVE summaries describing filter bypasses and recommended mitigations. [0][6]
- PEP 706 (design of filter parameter / how filters are intended to work). [8]
If you want, I can produce a ready-to-run safe-extract Python function (with tests) that implements the checks above and handles symlinks/permissions correctly.
Add comprehensive validation to tarfile.extractall() to prevent path traversal and other extraction attacks.
Using extractall() without member validation allows malicious archives to write files outside the target directory via absolute paths, .. sequences, symlinks, or device files. Simply adding filter="data" is insufficient—security advisories document bypasses of the built-in filters.
The recommended approach is to extract to an empty temporary directory, validate each member (rejecting absolute paths, .. traversal, symlinks, device nodes), use realpath() checks to confirm extracted paths stay within the target directory, and move only validated files into place.
import tarfile
import tempfile
import os
from pathlib import Path

def safe_extract_tar(archive_path, target_dir):
    with tempfile.TemporaryDirectory() as temp_dir:
        with tarfile.open(archive_path, "r:gz") as tar:
            for member in tar.getmembers():
                # Reject dangerous member types
                if member.issym() or member.isdev() or member.islnk():
                    continue
                # Validate member path stays within target
                target_path = (Path(target_dir) / member.name).resolve()
                if not str(target_path).startswith(str(Path(target_dir).resolve()) + os.sep):
                    continue  # Skip path traversal attempts
                # Extract to temp location first
                tar.extract(member, temp_dir)
        # Move validated files to target
        temp_path = Path(temp_dir)
        for item in temp_path.rglob("*"):
            rel_path = item.relative_to(temp_path)
            dest = Path(target_dir) / rel_path
            dest.parent.mkdir(parents=True, exist_ok=True)
            if item.is_file():
                item.rename(dest)

🧰 Tools
🪛 Ruff (0.14.8)
43-43: Uses of tarfile.extractall()
(S202)
🤖 Prompt for AI Agents
In benchmarks/400.inference/412.language-bert/python/function.py around lines
42-44, the current tarfile.extractall() usage is vulnerable to path traversal
and extraction attacks; replace it with a safe extraction flow: open the archive
into a temporary empty directory (use tempfile.TemporaryDirectory()), iterate
tar.getmembers(), skip/ignore dangerous members (absolute paths, members with
leading "..", symlinks, hard links, and device nodes), for each allowed member
compute the intended target Path(target_dir)/member.name and call resolve() to
ensure its realpath starts with the resolved target_dir path before extracting
that member into the temp dir, then after all validated members are extracted
move/rename only those validated files from the temp dir into MODEL_DIRECTORY
(creating parent dirs as needed); ensure proper cleanup and preserve file
permissions where appropriate.
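To see the containment check from the pattern above in action with only the standard library, here is a self-contained sketch; the helper name `is_within` and the file names are invented for illustration, not taken from the benchmark:

```python
import os
import tarfile
import tempfile
from pathlib import Path


def is_within(base: Path, name: str) -> bool:
    # True only if extracting `name` under `base` cannot escape `base`.
    target = (base / name).resolve()
    root = base.resolve()
    return target == root or str(target).startswith(str(root) + os.sep)


with tempfile.TemporaryDirectory() as tmp:
    payload = Path(tmp) / "model.bin"
    payload.write_bytes(b"weights")

    # Archive one benign member and one path-traversal member.
    archive = Path(tmp) / "demo.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(payload, arcname="model.bin")
        tar.add(payload, arcname="../escape.bin")

    target_dir = Path(tmp) / "out"
    target_dir.mkdir()
    with tarfile.open(archive, "r:gz") as tar:
        names = [m.name for m in tar.getmembers()]
    accepted = [n for n in names if is_within(target_dir, n)]
```

Only the benign member survives the filter; the `../escape.bin` member resolves outside `out/` and is rejected before any extraction happens.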
echo "Original size $(du -sh $1 | cut -f1)"

CUR_DIR=$(pwd)
cd $1
Handle cd failure to prevent destructive operations in wrong directory.
If cd $1 fails, subsequent rm -rf and find -delete commands will execute in the current directory, potentially deleting unintended files.
-cd $1
+cd "$1" || exit 1
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
🤖 Prompt for AI Agents
In benchmarks/400.inference/413.image-classification/python/package.sh around
line 7, the script unconditionally runs "cd $1" which, if it fails, allows
subsequent destructive rm -rf and find -delete commands to run in the wrong
directory; update the script to quote the argument and guard the cd by testing
its exit status (e.g., if ! cd "$1"; then print an error to stderr and exit with
non‑zero status) or enable safe failure (set -e) before the cd so the script
exits immediately on failure.
torch==2.5.1
torchvision==0.20.1
🧩 Analysis chain
🌐 Web query:
PyTorch 2.5.1 torchvision 0.20.1 compatibility version matrix
💡 Result:
Short answer: PyTorch 2.5.1 is compatible with torchvision 0.20.1 (and torchaudio 2.5.1). Installation commands and supported CUDA/ROCm/CPU wheels are listed on the PyTorch previous-versions page. [1][2]
Relevant references:
- PyTorch docs — Previous versions (install commands for torch==2.5.1 + torchvision==0.20.1). [1]
- PyTorch GitHub — PyTorch versions compatibility matrix (maps torch 2.5.x to torchvision 0.20.x). [2]
Would you like the exact pip/conda install commands for CUDA 11.8, 12.1, 12.4, ROCm, or CPU?
🌐 Web query:
PyTorch 2.5.1 ARM architecture support availability
💡 Result:
Short answer: Official PyTorch 2.5.1 builds exist for ARM64 in some vendor/distribution channels (e.g., AWS Deep Learning ARM64 AMIs include PyTorch 2.5.1), plus community/third‑party ARM builds; however, PyTorch upstream does not ship broad platform PyPI/Windows‑ARM wheels for every ARM target — support is mixed and often provided via vendor images, conda packages, or community builds. [1][2][3]
Sources:
- AWS Deep Learning AMI release notes showing PyTorch 2.5.1 on ARM64 AMIs. [1]
- PyTorch docs on Apple Silicon (M1/M2) support (macOS/MPS). [2]
- Community third‑party ARM wheel project (example: maxisoft/pytorch-arm). [3]
🌐 Web query:
torchvision 0.20.1 ARM wheel available
💡 Result:
Yes — torchvision 0.20.1 has ARM wheels available:
- manylinux (aarch64) and macOS ARM64 wheels listed for the 0.20.1 release. [1]
- piwheels provides Raspberry Pi / aarch64 builds for 0.20.1. [2]
- conda packages for linux-aarch64 and osx-arm64 are available from the PyTorch/Anaconda channels. [3]
References:
[1] Release file list for torchvision 0.20.1 (manylinux2014_aarch64, macosx_11_0_arm64). (simple-repository.app.cern.ch)
[2] piwheels entry showing torchvision 0.20.1 (2024-10-29). (piwheels.org)
[3] PyTorch/Anaconda package files including linux-aarch64 and osx-arm64 builds. (anaconda.org)
If you want, I can give the exact pip install command or a direct wheel URL for your platform.
Citations:
- 1: https://simple-repository.app.cern.ch/project/torchvision/0.20.1/files?utm_source=openai
- 2: https://www.piwheels.org/project/torchvision?utm_source=openai
- 3: https://anaconda.org/pytorch/torchvision/files?utm_source=openai
PyTorch 2.5.1 is not officially available for ARM64 on PyPI.
While torch 2.5.1 and torchvision 0.20.1 are compatible, PyTorch 2.5.1 lacks official ARM64 wheels on PyPI. The upstream PyTorch project only provides vendor/community builds (e.g., AWS Deep Learning AMIs). torchvision 0.20.1 does have official manylinux2014_aarch64 and osx-arm64 wheels, but the torch dependency cannot be satisfied via pip on generic ARM systems. Consider using an ARM-compatible PyTorch version with available wheels, or specify an alternative installation method (e.g., conda, vendor-specific builds).
🤖 Prompt for AI Agents
In
benchmarks/400.inference/413.image-classification/python/requirements.txt.arm.3.9
around lines 3-4, the pinned torch==2.5.1 has no official ARM64 wheels on PyPI
so pip installs will fail on generic ARM systems; update the requirement to an
ARM-compatible PyTorch release that provides aarch64 wheels (e.g., a supported
manylinux2014_aarch64 build) or replace the pip pin with installation guidance:
either switch to a conda/Miniforge package line that installs torch for aarch64,
point to a vendor/community wheel URL, or add a comments/instructions block in
this file explaining the alternative installation method and the exact command
to install a compatible torch build for ARM64.
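The check the prompt describes can also be automated. A hypothetical helper, not part of the benchmark suite, that flags a pinned torch requirement as uninstallable on generic ARM targets where PyPI ships no official wheel (torch 2.5.x and earlier, per the query results above):

```python
import platform

# Generic ARM machine identifiers without broad official torch wheels on PyPI
ARM_ARCHES = {"aarch64", "arm64"}

def torch_pin_is_installable(pin: str, machine: str = None) -> bool:
    """Return False when a `torch==X.Y.Z` pin has no official PyPI wheel
    for the given architecture; other packages pass through as True."""
    machine = machine or platform.machine().lower()
    if machine not in ARM_ARCHES:
        return True
    name, _, version = pin.partition("==")
    if name.strip() != "torch" or not version:
        return True
    major, minor, *_ = (int(p) for p in version.split("."))
    # torch <= 2.5.x lacks official aarch64 wheels on PyPI
    return (major, minor) > (2, 5)
```

For instance, `torch_pin_is_installable("torch==2.5.1", machine="aarch64")` returns False, while the same pin on `x86_64` returns True. A CI job could run this over the `requirements.txt.arm.*` variants before publishing them.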
def _select_device():
    if torch.cuda.is_available():
        return torch.device("cuda")
    raise RuntimeError("CUDA is not available")
    return torch.device("cpu")
Unreachable code after raise.
Line 51 (return torch.device("cpu")) is unreachable since line 50 unconditionally raises an exception. Either remove the dead code or restructure if CPU fallback was intended.
 def _select_device():
     if torch.cuda.is_available():
         return torch.device("cuda")
-    raise RuntimeError("CUDA is not available")
-    return torch.device("cpu")
+    # Fallback to CPU if CUDA unavailable
+    return torch.device("cpu")

Or if CUDA is truly required:
def _select_device():
    if torch.cuda.is_available():
        return torch.device("cuda")
    raise RuntimeError("CUDA is not available")

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:

def _select_device():
    if torch.cuda.is_available():
        return torch.device("cuda")
    raise RuntimeError("CUDA is not available")
    return torch.device("cpu")

Suggested:

def _select_device():
    if torch.cuda.is_available():
        return torch.device("cuda")
    # Fallback to CPU if CUDA unavailable
    return torch.device("cpu")
🧰 Tools
🪛 Ruff (0.14.8)
50-50: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In benchmarks/400.inference/413.recommendation/python/function.py around lines
47 to 51, the return torch.device("cpu") on line 51 is unreachable because the
preceding raise RuntimeError("CUDA is not available") always aborts execution;
either remove the dead return or change the control flow so CPU is a valid
fallback (e.g., only raise when you truly require CUDA, otherwise return
torch.device("cpu") instead of raising), and ensure the function returns a
torch.device in all code paths.
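The two valid control flows the prompt describes can be illustrated without torch installed. A pure-Python sketch where a boolean parameter stands in for torch.cuda.is_available(), showing that every path returns a value and no statement sits after a raise:

```python
# Sketch of the two intended behaviors; `cuda_available` is a stand-in
# for torch.cuda.is_available() so the example runs without torch.
def select_device(cuda_available: bool, require_cuda: bool = False) -> str:
    if cuda_available:
        return "cuda"
    if require_cuda:
        # CUDA-mandatory variant: fail loudly instead of silently degrading
        raise RuntimeError("CUDA is not available")
    # Reachable CPU fallback -- unlike the original, no code follows a raise
    return "cpu"
```

Here `select_device(False)` yields `"cpu"`, while `select_device(False, require_cuda=True)` raises; the original function conflated both behaviors, leaving the fallback dead.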
@@ -0,0 +1 @@
+torch==2.4.1
🧩 Analysis chain
🌐 Web query:
CVE-2025-32434 PyTorch torch.load vulnerability
💡 Result:
Summary — CVE-2025-32434
- What it is: A Remote Command Execution (RCE) deserialization vulnerability in PyTorch’s torch.load when called with weights_only=True that can execute attacker-supplied code from a malicious .pt file. [1][2][3]
- Affected versions: torch ≤ 2.5.1. Fixed in PyTorch 2.6.0. [1][2]
- Severity: Critical (CVSS ~9.8). No privileges or user interaction required; remote/network attack vector when loading untrusted model files. [1][2]
Mitigation / recommended actions
- Upgrade PyTorch to 2.6.0 or later as soon as possible. [1][2]
- Never call torch.load (even with weights_only=True) on model files from untrusted sources. Treat .pt files as untrusted binary input. [1][3]
- If you must inspect untrusted models, do so in an isolated environment (sandbox, container, VM) or use tools that extract tensors without executing pickled objects. Consider file-format checks/conversion to known-safe formats before loading. [1][3]
References
[1] NVD CVE-2025-32434 entry. (nvd.nist.gov)
[2] PyTorch / GHSA advisory and vendor notes referenced in Ubuntu advisory. (ubuntu.com)
[3] Writeups / technical analysis (example: CVE news writeup). (cve.news)
If you want, I can: (a) fetch the official PyTorch advisory link, (b) show a minimal example exploit and a safe tensor-only loader pattern, or (c) scan your requirements file for vulnerable torch versions. Which would you like?
Citations:
- 1: https://nvd.nist.gov/vuln/detail/CVE-2025-32434?utm_source=openai
- 2: https://ubuntu.com/security/CVE-2025-32434?utm_source=openai
- 3: https://www.cve.news/cve-2025-32434/?utm_source=openai
🌐 Web query:
PyTorch 2.4.1 security vulnerabilities CVE 2025
💡 Result:
Short answer — yes.
Known 2025 CVEs that affect PyTorch 2.4.1 (summary):
- CVE-2025-32434 — Remote command execution when loading untrusted model files with torch.load(weights_only=True). Affects PyTorch up to 2.5.1 (so includes 2.4.1); patched in 2.6.0. [1][2]
- CVE-2025-2953 — local Denial-of-Service in torch.mkldnn_max_pool2d; affects versions < 2.7.1-rc1 (so includes 2.4.1); patched in 2.7.1-rc1. [3]
Other 2025 CVEs reported against PyTorch (examples) target 2.6.0+ builds (e.g., CVE-2025-2148, CVE-2025-2149) and therefore do not apply to 2.4.1. [4][5]
Recommended action: upgrade to a patched PyTorch release (at minimum 2.6.0 to fix CVE-2025-32434 and 2.7.1-rc1 or later to fix CVE-2025-2953), and treat all model files from untrusted sources as hostile (validate/sandbox before loading). [1][3]
Sources:
[1] NVD / CVE-2025-32434 (PyTorch torch.load weights_only RCE).
[2] CVE news / writeups on CVE-2025-32434.
[3] GitHub Advisory / GHSA (CVE-2025-2953) and related NVD record.
[4] NVD CVE-2025-2148.
[5] NVD/CNA entries for other 2025 PyTorch CVEs.
🌐 Web query:
GHSA-53q9-r3pm-6pq6 PyTorch torch.load RCE
💡 Result:
Short summary
- Advisory GHSA-53q9-r3pm-6pq6 / CVE-2025-32434: torch.load(..., weights_only=True) in PyTorch ≤ 2.5.1 can still lead to remote code execution (deserialization RCE). The issue is fixed in PyTorch 2.6.0. [1][2]
Impact
- Loading a malicious .pt model with torch.load(weights_only=True) may execute arbitrary code; treat as critical. [1][2]
Mitigation / action
- Upgrade PyTorch to >= 2.6.0. [1][2]
- Do not load model files from untrusted sources; if you must, load them in an isolated/sandboxed environment (container/VM) or inspect/convert them offline. [1][3]
References
- PyTorch security advisory (GHSA-53q9-r3pm-6pq6). [1]
- OSV / PYSEC-2025-41 (CVE-2025-32434) vulnerability entry. [2]
- GitHub Advisory Database entry for CVE-2025-32434. [3]
Upgrade torch to version 2.6.0 or later to address critical RCE vulnerability.
PyTorch versions ≤2.5.1 are vulnerable to CVE-2025-32434 (GHSA-53q9-r3pm-6pq6), a critical remote code execution vulnerability in torch.load with weights_only=True that was fixed in version 2.6.0. The vulnerability carries a CVSS score of 9.8, posing severe risks to systems using this benchmark suite.
Apply this fix:
-torch==2.4.1
+torch==2.6.0

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-torch==2.4.1
+torch==2.6.0
🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/601.matmul/python/requirements.txt lines 1-1,
the pinned torch version is 2.4.1 which is vulnerable; update the requirement to
a non-vulnerable release by changing the spec to at least 2.6.0 (for example use
"torch>=2.6.0,<3.0" or pin to a specific safe patch like "torch==2.6.0" or
later), then run the benchmark's CI/tests to ensure compatibility and update any
code that relies on deprecated APIs if tests fail.
def handler(event):
    size = event.get("size")
    if "seed" in event:
        import random

        random.seed(event["seed"])

        seed = event.get("seed", 42)
    seed = int(seed)

    gen_begin = datetime.datetime.now()
    alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed)
Critical bug: seed variable is undefined when "seed" is not in event.
When "seed" is not present in the event, the code skips the if block but still references seed on line 43, causing an UnboundLocalError.
 def handler(event):
     size = event.get("size")
+    seed = 42  # default seed
     if "seed" in event:
         import random
-
         random.seed(event["seed"])
-
-        seed = event.get("seed", 42)
     seed = int(seed)
     gen_begin = datetime.datetime.now()
     alpha, x, y = initialize_torch(size, dtype=torch.float32, device="cuda", seed=seed)

Note: The random.seed() call appears unused since the function uses torch.manual_seed() in initialize_torch(). Consider removing it if Python's random module isn't used elsewhere.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/602.axpy/python/function.py around lines 32 to
43, the variable `seed` is only set inside the `if "seed" in event:` block but
is referenced later unconditionally, causing an UnboundLocalError when the event
has no seed; fix by ensuring `seed` is always defined (e.g., set seed =
int(event.get("seed", 42)) before or outside the if-block), remove the unused
`random.seed(event["seed"])` and `import random` if Python's random module is
not used elsewhere, or otherwise only call and import random when it is needed.
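The failure mode and its fix are easy to reproduce in isolation. A minimal sketch, with the torch/initialization details stubbed out, mirroring the suggested patch: `seed` gets a default before the conditional, so it is bound on every path and no UnboundLocalError can occur:

```python
# Minimal reproduction of the review's suggested fix; `handler_fixed` is a
# hypothetical name, and everything except the seed handling is omitted.
def handler_fixed(event: dict) -> int:
    seed = 42  # default seed, always bound
    if "seed" in event:
        seed = event["seed"]
    # Safe on every path: seed is defined whether or not the key was present
    return int(seed)
```

Calling `handler_fixed({})` falls back to the default, while `handler_fixed({"seed": "7"})` honors the caller's value; the original code would raise on the first call.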
@@ -0,0 +1 @@
+torch==2.4.1
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check PyTorch version and security advisories
echo "Checking latest PyTorch version from PyPI..."
curl -s https://pypi.org/pypi/torch/json | jq -r '.info.version' 2>/dev/null || echo "Failed to fetch from PyPI"
echo -e "\nChecking available torch versions (last 20)..."
curl -s https://pypi.org/pypi/torch/json | jq -r '.releases | keys[]' | tail -20

Repository: spcl/serverless-benchmarks
Length of output: 288
🌐 Web query:
PyTorch 2.4.1 security vulnerabilities CVE
💡 Result:
Short answer — yes.
Known CVEs affecting PyTorch 2.4.1 (or earlier) include:
- CVE-2024-48063 — RemoteModule deserialization RCE (affects torch <= 2.4.1). [1][3]
- CVE-2025-32434 — RCE when loading models with torch.load(..., weights_only=True) (affects <= 2.5.1; 2.4.1 is vulnerable). [2]
Recommendation: upgrade to a patched release (CVE-2024-48063 was addressed in 2.5.0+; CVE-2025-32434 was patched in 2.6.0). Verify changelogs and your deployment (especially any use of torch.distributed/RemoteModule or torch.load with untrusted inputs) before upgrading. [1][2][3]
Sources:
[1] NVD — CVE-2024-48063.
[2] NVD — CVE-2025-32434.
[3] CVE Details / vulnerability listings for CVE-2024-48063.
🌐 Web query:
PyTorch latest version 2025 current release
💡 Result:
The latest stable PyTorch release as of December 15, 2025 is PyTorch 2.9 (released October 27, 2025). [1]
Sources:
- PyTorch 2.9 release blog. [1] (pytorch.org)
Citations:
Upgrade PyTorch to 2.6.0 or later to address critical security vulnerabilities.
PyTorch 2.4.1 is outdated and contains two critical RCE vulnerabilities:
- CVE-2024-48063 (RemoteModule deserialization RCE) — patched in 2.5.0+
- CVE-2025-32434 (torch.load RCE with weights_only=True) — patched in 2.6.0
The latest version is 2.9.1. Update the pinned version to at least 2.6.0 (or later).
🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/602.axpy/python/requirements.txt lines 1–1, the
pinned PyTorch version torch==2.4.1 contains critical RCE vulnerabilities;
update the requirement to a secure release (at least 2.6.0). Replace the exact
pin with a safe constraint such as torch>=2.6.0,<3.0 or pin to a specific
patched release (e.g., torch==2.9.1), then run your project's dependency
resolver/build/tests to ensure compatibility and update any lockfiles or CI
images that rely on the old version.
@@ -0,0 +1 @@
+torch==2.4.1
🧩 Analysis chain
🏁 Script executed:
cat -n benchmarks/600.linearalgebra/604.cholesky/python/requirements.txt

Repository: spcl/serverless-benchmarks
Length of output: 89
🌐 Web query:
CVE-2025-32434 PyTorch torch.load vulnerability
💡 Result:
Short summary — CVE-2025-32434 (critical RCE)
- What it is: a remote code‑execution vulnerability in PyTorch’s torch.load when called with weights_only=True; legacy/alternate deserialization paths could still execute pickled Python objects. [1][2]
- Affected versions: PyTorch <= 2.5.1. Patched in 2.6.0. [1][2]
- Impact: loading a malicious .pt/.pth file with torch.load(..., weights_only=True) on a vulnerable install can execute arbitrary commands (full RCE). [1][2]
- Immediate mitigation: upgrade PyTorch to >= 2.6.0; do not load model artifacts from untrusted sources; if you must, inspect/convert files offline or load inside isolated/sandboxed environments (containers, VMs) and follow least privilege. [1][2]
Sources:
[1] PyTorch GitHub security advisory GHSA-53q9-r3pm-6pq6 (CVE-2025-32434).
[2] NVD entry for CVE-2025-32434.
🏁 Script executed:
# Find all requirements.txt files mentioning torch to check for consistency with other files
find . -name "requirements.txt" -type f | head -20 | xargs grep -l "torch" 2>/dev/null

Repository: spcl/serverless-benchmarks
Length of output: 556
🏁 Script executed:
grep "torch" ./benchmarks/300.utilities/320.monte-carlo-pi/python/requirements.txt ./benchmarks/400.inference/411.image-recognition/python/requirements.txt ./benchmarks/400.inference/413.image-classification/python/requirements.txt ./benchmarks/400.inference/413.recommendation/python/requirements.txt ./benchmarks/600.linearalgebra/601.matmul/python/requirements.txt ./benchmarks/600.linearalgebra/602.axpy/python/requirements.txt ./benchmarks/600.linearalgebra/603.jacobi2d/python/requirements.txt

Repository: spcl/serverless-benchmarks
Length of output: 1092
🌐 Web query:
CVE-2025-32434 CVSS score v4
💡 Result:
CVE-2025-32434 — CVSS v4.0 base score: 9.3 (CRITICAL). Vector: CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N. [1][2]
Sources: NVD and INCIBE. [1] NVD entry for CVE-2025-32434. [2] INCIBE advisory for CVE-2025-32434.
Upgrade torch to version 2.6.0 or later to address critical RCE vulnerability.
PyTorch versions ≤2.5.1 are vulnerable to CVE-2025-32434, a critical remote code execution vulnerability in torch.load that was fixed in version 2.6.0. The vulnerability carries a CVSS v4.0 score of 9.3, posing severe risks to systems using this benchmark suite.
-torch==2.4.1
+torch==2.6.0

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-torch==2.4.1
+torch==2.6.0
🧰 Tools
🪛 OSV Scanner (2.3.0)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2024-259)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2025-41)
[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service
[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution
[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability
🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/604.cholesky/python/requirements.txt around line
1, the project pins torch to 2.4.1 which is vulnerable (CVE-2025-32434); update
the requirement to torch>=2.6.0 (or a specific 2.6.x/2.7.x release) to ensure
the patched version is used, run your dependency install/test matrix to confirm
compatibility, and update any CI/environment constraints or lockfiles so the new
minimum version is enforced.
@@ -0,0 +1 @@
+torch==2.4.1
🧩 Analysis chain
🏁 Script executed:
# Check if the file exists and read its content
cat benchmarks/600.linearalgebra/605.lu/python/requirements.txt

Repository: spcl/serverless-benchmarks
Length of output: 82
🌐 Web query:
CVE-2025-32434 PyTorch torch.load RCE vulnerability
💡 Result:
Short summary
- CVE-2025-32434 is a critical RCE in PyTorch’s torch.load when called with weights_only=True; an attacker-supplied .pt/.pth (pickle/tar) file can execute arbitrary code during load. Published Apr 17–18, 2025. [1][2]
Affected versions
- torch / PyTorch <= 2.5.1. Patched in 2.6.0. [1][3]
Impact
- Remote code execution on any system that loads an attacker-controlled model with torch.load(weights_only=True). Exploitation requires only loading the crafted file (no privileges or user interaction). [1][2]
Mitigation / recommended actions
- Upgrade to PyTorch 2.6.0 or later immediately (pip/conda upgrade your torch package). [1][3]
- pip: pip install --upgrade torch
- Avoid loading model files from untrusted sources. Treat .pt/.pth/.tar files as untrusted pickles. [1][2]
- If you must inspect untrusted models, do so in isolated/sandboxed environments (containers, VMs) or convert/inspect files offline before loading. [2][3]
- Where feasible, validate model files or prefer secure, non-pickle formats and signature/verification workflows for third‑party models.
References
- PyTorch GitHub security advisory (GHSA-53q9-r3pm-6pq6). [1]
- NVD / CVE-2025-32434 entry. [2]
- Ubuntu / INCIBE advisories summarizing affected versions and CVSS. [3][4]
🌐 Web query:
PyTorch 2.6.0 security vulnerability fix torch.load
💡 Result:
Short answer: Yes — a critical RCE in torch.load(weights_only=True) (CVE-2025-32434) affecting torch < 2.6.0 was disclosed in April 2025 and was patched in PyTorch 2.6.0. Upgrade to >=2.6.0 (or avoid loading untrusted .pt/.pth files) to mitigate. [1][2][3]
Details and links:
- Vulnerability: torch.load with weights_only=True can be exploited for remote code execution (affects <=2.5.1). Disclosed Apr 17–18, 2025. [2][1]
- Patched version: 2.6.0. Upgrade (pip): pip install --upgrade torch. [2][3]
- Mitigations if you cannot upgrade immediately: do not load untrusted model files; use safer formats (e.g., safetensors) or isolate loading in a sandbox. [4][5]
Sources:
[1] GitHub Advisory / PyTorch security advisory (GHSA-53q9-r3pm-6pq6).
[2] CVE summary pages for CVE-2025-32434.
[3] IBM security bulletin noting the patch in 2.6.0.
[4] News coverage & vulnerability trackers recommending safetensors and upgrades.
[5] Vendor advisory / PyTorch security page.
Upgrade torch to version 2.6.0 or later to address critical RCE vulnerability.
PyTorch versions ≤2.5.1 are vulnerable to CVE-2025-32434, a critical remote code execution flaw in torch.load with weights_only=True that was patched in version 2.6.0. The vulnerability allows arbitrary code execution when loading attacker-controlled .pt/.pth files.
-torch==2.4.1
+torch==2.6.0

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-torch==2.4.1
+torch==2.6.0
🧰 Tools
🪛 OSV Scanner (2.3.0)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2024-259)
[CRITICAL] 1-1: torch 2.4.1: undefined
(PYSEC-2025-41)
[CRITICAL] 1-1: torch 2.4.1: PyTorch susceptible to local Denial of Service
[CRITICAL] 1-1: torch 2.4.1: PyTorch: torch.load with weights_only=True leads to remote code execution
[CRITICAL] 1-1: torch 2.4.1: PyTorch Improper Resource Shutdown or Release vulnerability
🤖 Prompt for AI Agents
In benchmarks/600.linearalgebra/605.lu/python/requirements.txt around lines 1 to
1, the pinned dependency torch==2.4.1 is vulnerable; update the requirement to
torch>=2.6.0 (or a specific safe release like 2.6.0 or later) to address
CVE-2025-32434, then regenerate any dependency lockfiles/virtualenvs, run the
test suite to ensure compatibility, and update any CI images or docs referencing
the old version.
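The per-file fixes the agent prompts describe could be batched. A hypothetical scanner, written for illustration only, that finds requirements lines pinning torch below the patched 2.6.0 release so all affected benchmark directories can be updated in one pass:

```python
import re

# Matches exact pins like "torch==2.4.1" (optionally without a patch digit)
PIN_RE = re.compile(r"^torch==(\d+)\.(\d+)(?:\.(\d+))?\s*$")

def vulnerable_pins(lines):
    """Return the requirements lines that pin torch below 2.6."""
    hits = []
    for line in lines:
        m = PIN_RE.match(line.strip())
        if m and (int(m.group(1)), int(m.group(2))) < (2, 6):
            hits.append(line.strip())
    return hits
```

Run over the contents of each `requirements.txt` in the repository, this would surface every `torch==2.4.1` pin flagged above while leaving `torch==2.6.0` and unrelated packages untouched. Range specifiers (`>=`, `~=`) are out of scope for this sketch.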
Summary by CodeRabbit
New Features
Chores