2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.md
Original file line number Diff line number Diff line change
@@ -19,7 +19,7 @@ If the issue occurs while running a Python workload or involves a simulator cras

For example:
```
python3 tests/test_add.py
python3 tests/ops/elementwise/test_add.py
...
[SpikeSimulator] cmd> spike --isa rv64gcv --varch=vlen:256,elen:64 --vectorlane-size=128 \
-m0x80000000:0x1900000000,0x2000000000:0x1000000 \
90 changes: 45 additions & 45 deletions .github/workflows/pytorchsim_test.yml

Large diffs are not rendered by default.

18 changes: 9 additions & 9 deletions CLAUDE.md
Original file line number Diff line number Diff line change
@@ -23,7 +23,7 @@ The pipeline runs in that order on every `torch.compile` invocation; you'll see
| `TOGSim/` | C++ TOGSim source. `src/Simulator.cc`, `Core.cc`, `Dram.cc`, `Interconnect.cc`, `L2Cache.cc`, `Tile.cc`, `TileGraph.cc` are the core models. Externals: ramulator2, booksim, stonneCore, onnx, protobuf, spdlog, yaml-cpp |
| `AsmParser/` | `tog_generator.py`, `onnx_utility.py` — TOG generation from ONNX/ASM |
| `configs/` | TOGSim hardware configs (YAML). The default is `systolic_ws_128x128_c1_simple_noc_tpuv3.yml`. Naming pattern: `systolic_ws_<size>_c<cores>_<noc>_<target>.yml` |
| `tests/` | ~36 op- and model-level tests. Subdirs `DeepSeek/`, `Diffusion/`, `Llama/`, `MLP/`, `Mixtral_8x7B/`, `MoE/`, `Yolov5/`, `Fusion/` for whole-model workloads |
| `tests/` | Op- and model-level tests organized under `ops/<family>/` (elementwise, reduce, gemm, conv, attention, view, sort, sparsity, misc, fusion), `models/<name>/` (Llama, Mixtral8x7B, DeepSeek, Diffusion, MoE, MLP, MobileNet, Yolov5) plus single-file model tests (test_resnet, test_transformer, test_vit, test_mlp, test_single_perceptron), and `system/` (scheduler, eager, hetro, stonne, vectorops). Shared helper: `tests/_pytorchsim_utils.py` |
| `experiments/artifact/` | Paper reproduction scripts (`cycle_validation/run_cycle.sh`, `speedup/run_speedup.sh`) |
| `scripts/` | One-off experiment runners (CompilerOpt, ILS, batch, chiplet, sparsity, stonne, end2end). `build_from_source.sh` builds gem5/llvm/spike |
| `gem5_script/` | gem5 wrapper scripts called by `CycleSimulator` |
@@ -36,16 +36,16 @@ The pipeline runs in that order on every `torch.compile` invocation; you'll see
Most tests follow the same pattern: build CPU reference, compile via `torch.compile` on `npu:0`, compare with `torch.allclose` (rtol=atol=1e-4). They all have `if __name__ == "__main__"` blocks.

```bash
python tests/test_add.py # vector add (smoke test, fastest)
python tests/test_matmul.py # GEMM
python tests/test_mlp.py # MLP forward + backward (training path)
python tests/test_scheduler.py # multi-tenant launch_model
python tests/test_eager.py # eager-fallback registration
python tests/ops/elementwise/test_add.py # vector add (smoke test, fastest)
python tests/ops/gemm/test_matmul.py # GEMM
python tests/models/test_mlp.py # MLP forward + backward (training path)
python tests/system/test_scheduler.py # multi-tenant launch_model
python tests/system/test_eager.py # eager-fallback registration
```

Run a model from `tests/Llama/`, `tests/DeepSeek/`, etc. similarly.
Run a model from `tests/models/Llama/`, `tests/models/DeepSeek/`, etc. similarly.

**CI coverage:** the GitHub Actions workflow `.github/workflows/pytorchsim_test.yml` runs an **explicit allowlist** of `tests/*.py` files (~40 jobs, one Docker container per test). Adding a new file under `tests/` does *not* automatically gate PRs — register it in `pytorchsim_test.yml` if you want CI to exercise it. Conversely, files like `tests/test_gqa.py`, `tests/test_gqa_decode.py`, and `tests/test_eager.py` exist in the repo but are *not* in CI, so local validation is the only safety net for them.
**CI coverage:** the GitHub Actions workflow `.github/workflows/pytorchsim_test.yml` runs an **explicit allowlist** of `tests/*.py` files (~40 jobs, one Docker container per test). Adding a new file under `tests/` does *not* automatically gate PRs — register it in `pytorchsim_test.yml` if you want CI to exercise it. Conversely, files like `tests/ops/attention/test_gqa.py`, `tests/ops/attention/test_gqa_decode.py`, and `tests/system/test_eager.py` exist in the repo but are *not* in CI, so local validation is the only safety net for them.
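
Because the allowlist is maintained by hand, a small script can flag test files that the workflow never mentions. The following is a minimal sketch, not part of the repository: `unlisted_tests` is a hypothetical helper, and it assumes that the workflow refers to each test by its repo-relative path (for example `tests/system/test_eager.py`) and that test files are named `test_*.py`.

```python
from pathlib import Path


def unlisted_tests(repo_root, workflow_rel=".github/workflows/pytorchsim_test.yml"):
    """Return repo-relative test paths that the workflow text never mentions.

    Hypothetical helper: assumes the workflow names each test by its
    repo-relative POSIX path and that test files match ``test_*.py``.
    """
    root = Path(repo_root)
    workflow_text = (root / workflow_rel).read_text(encoding="utf-8")

    missing = []
    # Walk every test file under tests/, at any depth, in a stable order.
    for path in sorted((root / "tests").rglob("test_*.py")):
        rel = path.relative_to(root).as_posix()
        # A plain substring match is enough for a path-based allowlist.
        if rel not in workflow_text:
            missing.append(rel)
    return missing
```

Calling `unlisted_tests(".")` from the repository root would list candidates to register in the workflow or to validate locally. The substring match is deliberately simple; a workflow that builds paths from matrix variables would need a different check.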

**For fast iteration** (skip functional check):
```bash
@@ -123,7 +123,7 @@ Conan deps for TOGSim: `boost/1.79.0`, `robin-hood-hashing/3.11.5`, `spdlog/1.11
- **Adding a PyTorch device op:** `PyTorchSimDevice/csrc/aten/native/*` (Minimal/Extra split mirrors `torch_openreg`).
- **TOGSim hardware model changes:** `TOGSim/src/{Core,Dram,Interconnect,L2Cache,Tile,TileGraph}.cc` + matching `include/*.h`.
- **TOG generation:** `AsmParser/tog_generator.py` builds the raw graph and serializes it via `AsmParser/onnx_utility.py` to **ONNX, which is the on-disk TOG format** consumed by TOGSim.
- **Eager fallback registration:** `torch.npu.register_eager_to_compile([...])` — see `tests/test_eager.py`.
- **Eager fallback registration:** `torch.npu.register_eager_to_compile([...])` — see `tests/system/test_eager.py`.
- **Per-run results:** `togsim_results/<YYYYMMDD_HHMMSS_<hash>>.log` (stats) and `.trace` (instruction trace). The path is also printed at the end of every run.
- **Wrapper codegen path:** printed as `Wrapper Codegen Path = /tmp/torchinductor_<user>/<hash>/...py` — useful for inspecting generated kernel code and tensor names for `SRAM_BUFFER_PLAN_PATH`.
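
Since the per-run result names start with a `YYYYMMDD_HHMMSS` timestamp, the newest log can be picked by sorting file names. The sketch below illustrates that idea under assumptions: `latest_result_log` is a hypothetical helper, not a repository API, and the results directory is passed in by the caller rather than discovered.

```python
from pathlib import Path


def latest_result_log(results_dir):
    """Return the newest ``.log`` file in ``results_dir``, or None if there is none.

    Hypothetical helper: relies on the YYYYMMDD_HHMMSS name prefix, which
    makes lexicographic order match chronological order.
    """
    logs = sorted(Path(results_dir).glob("*.log"), key=lambda p: p.name)
    return logs[-1] if logs else None
```

The matching `.trace` file, if one exists, would then be `latest_result_log(d).with_suffix(".trace")`.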

16 changes: 8 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
@@ -40,15 +40,15 @@ PyTorchSim **supports**:
|---|:-:|:-:|---|
| ResNet-18 | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | channel last format |
| ResNet-50 | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | channel last format |
| MobileNet-v2 | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | `tests/MobileNet/` (torchvision) |
| YOLOv5 | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | `tests/Yolov5/` |
| MobileNet-v2 | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | `tests/models/MobileNet/` (torchvision) |
| YOLOv5 | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | `tests/models/Yolov5/` |
| BERT | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | |
| GPT-2 | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | |
| ViT | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | `tests/test_vit.py` |
| ViT | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | `tests/models/test_vit.py` |
| Mistral | <img src="https://avatars.githubusercontent.com/u/21003710?s=48&v=4" width="20"/> | ✅ | |
| Stable-diffusion v1 | 🤗 | ✅ | |
| Llama 2/3 | 🤗 | ✅ | `tests/Llama/` (blocks & decode-style paths) |
| DeepSeek-V3 (base) | 🤗 | ✅ | `tests/DeepSeek/` — several ops(e.g., gate ops) are not cycle-modeled |
| Llama 2/3 | 🤗 | ✅ | `tests/models/Llama/` (blocks & decode-style paths) |
| DeepSeek-V3 (base) | 🤗 | ✅ | `tests/models/DeepSeek/` — several ops(e.g., gate ops) are not cycle-modeled |
| Llama-4 | 🤗 | ⏳ | In development |
| Broader model support | — | ⏳ | In development |
<!-- ## Requirements
@@ -104,7 +104,7 @@ The script clones each dep at the tag pinned in [`thirdparty/github-releases.jso
### Run Examples
The `tests` directory contains several AI workload examples.
```bash
python tests/test_matmul.py
python tests/ops/gemm/test_matmul.py
```
The result is written to `${TORCHSIM_LOG_PATH}/togsim_result/XXX.log`. The log file contains detailed core, memory, and interconnect stats.

@@ -201,7 +201,7 @@ optimizer.zero_grad()
loss.backward()
compiled_step()
```
`tests/test_mlp.py` provides an example of MLP training.
`tests/models/test_mlp.py` provides an example of MLP training.

## One TOGSim session, one continuous log

@@ -243,7 +243,7 @@ with TOGSimulator(config_path=config):
Here `synchronize()` acts as a barrier: it does not return until every `launch_model` issued **above** it has finished in the simulator. The later pair of `launch_model` calls therefore runs only after those earlier models have fully completed—so the sync is the point in the timeline where **all preceding launches are done**.

```bash
python tests/test_scheduler.py
python tests/system/test_scheduler.py
```

Use a TOGSim config(`.yml`) that defines **partitions** when mapping queues to cores, for example:
60 changes: 30 additions & 30 deletions scripts/sparsity_experiment/run.sh
Original file line number Diff line number Diff line change
@@ -6,48 +6,48 @@ export TORCHSIM_FORCE_TIME_N=8

OUTPUT_DIR="12GB"
export TOGSIM_CONFIG="/workspace/PyTorchSim/configs/systolic_ws_8x8_c1_12G_simple_noc.yml"
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8

OUTPUT_DIR="24GB"
export TOGSIM_CONFIG="/workspace/PyTorchSim/configs/systolic_ws_8x8_c1_24G_simple_noc.yml"
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8

OUTPUT_DIR="48GB"
export TOGSIM_CONFIG="/workspace/PyTorchSim/configs/systolic_ws_8x8_c1_48G_simple_noc.yml"
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8

OUTPUT_DIR="12GB_2core"
export TOGSIM_CONFIG="/workspace/PyTorchSim/configs/systolic_ws_8x8_c2_12G_simple_noc.yml"
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8

OUTPUT_DIR="24GB_2core"
export TOGSIM_CONFIG="/workspace/PyTorchSim/configs/systolic_ws_8x8_c2_24G_simple_noc.yml"
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8

OUTPUT_DIR="48GB_2core"
export TOGSIM_CONFIG="/workspace/PyTorchSim/configs/systolic_ws_8x8_c2_48G_simple_noc.yml"
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.0 > ${OUTPUT_DIR}/0.0
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.2 > ${OUTPUT_DIR}/0.2
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.4 > ${OUTPUT_DIR}/0.4
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.6 > ${OUTPUT_DIR}/0.6
python3 /workspace/PyTorchSim/tests/ops/sparsity/test_sparsity.py --sparsity 0.8 > ${OUTPUT_DIR}/0.8
6 changes: 3 additions & 3 deletions scripts/stonne_experiment/run.sh
Original file line number Diff line number Diff line change
@@ -2,8 +2,8 @@
export TORCHSIM_FORCE_TIME_M=1024
export TORCHSIM_FORCE_TIME_K=1024
export TORCHSIM_FORCE_TIME_N=1024
python3 ../../tests/test_hetro.py --M 1024 --N 1024 --K 1024 --sparsity 0.9 --config stonne_big_c1_simple_noc.yml --mode 0 > hetero/big_sparse.log
python3 ../../tests/test_hetro.py --M 1024 --N 1024 --K 1024 --sparsity 0.9 --config systolic_ws_128x128_c1_simple_noc_tpuv3_half.yml --mode 1 > hetero/big.log
python3 ../../tests/test_hetro.py --M 1024 --N 1024 --K 1024 --sparsity 0.9 --config heterogeneous_c2_simple_noc.yml --mode 2 > hetero/hetero.log
python3 ../../tests/system/test_hetro.py --M 1024 --N 1024 --K 1024 --sparsity 0.9 --config stonne_big_c1_simple_noc.yml --mode 0 > hetero/big_sparse.log
python3 ../../tests/system/test_hetro.py --M 1024 --N 1024 --K 1024 --sparsity 0.9 --config systolic_ws_128x128_c1_simple_noc_tpuv3_half.yml --mode 1 > hetero/big.log
python3 ../../tests/system/test_hetro.py --M 1024 --N 1024 --K 1024 --sparsity 0.9 --config heterogeneous_c2_simple_noc.yml --mode 2 > hetero/hetero.log

echo "All processes completed!"
2 changes: 1 addition & 1 deletion scripts/stonne_experiment/run_trace.sh
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#!/bin/bash

SCRIPT="/workspace/PyTorchSim/tests/test_stonne.py"
SCRIPT="/workspace/PyTorchSim/tests/system/test_stonne.py"

SIZES=(32 64 128)
SPARSITIES=(0.0 0.2 0.4 0.6 0.8)
44 changes: 44 additions & 0 deletions tests/_pytorchsim_utils.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
"""Shared helpers for PyTorchSim test files.

Module name is unique (not ``tests._utils``) because ``ultralytics``
ships a top-level ``tests`` package in site-packages that would shadow it.

Import with:

    import os, sys
    sys.path.insert(0, os.path.join(
        os.environ.get("TORCHSIM_DIR", default="/workspace/PyTorchSim"), "tests"))
    from _pytorchsim_utils import test_result
"""

import sys

import torch


def test_result(name, out, expected, rtol=1e-4, atol=1e-4):
    """Compare ``out`` to ``expected``; exit 1 on mismatch."""
    out_cpu = out.cpu() if hasattr(out, "cpu") else out
    expected_cpu = expected.cpu() if hasattr(expected, "cpu") else expected

    if torch.allclose(out_cpu, expected_cpu, rtol=rtol, atol=atol):
        msg = f"|{name} Test Passed|"
        bar = "-" * len(msg)
        print(bar)
        print(msg)
        print(bar)
        return

    msg = f"|{name} Test Failed|"
    bar = "-" * len(msg)
    print(bar)
    print(msg)
    print(bar)
    print("custom out: ", out_cpu)
    print("cpu out: ", expected_cpu)
    try:
        max_diff = (out_cpu - expected_cpu).abs().max().item()
        print(f"Max abs diff: {max_diff}")
    except Exception:
        pass
    sys.exit(1)
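
For readers unfamiliar with the pass criterion, `torch.allclose` accepts a pair of values when `|out - expected| <= atol + rtol * |expected|` holds for every element. The sketch below restates that rule on plain Python floats so the effect of `rtol=atol=1e-4` is easy to see. `is_close` and `all_close` are illustrative names only, and the sketch ignores broadcasting and NaN handling.

```python
def is_close(out, expected, rtol=1e-4, atol=1e-4):
    """Scalar form of the check: |out - expected| <= atol + rtol * |expected|."""
    return abs(out - expected) <= atol + rtol * abs(expected)


def all_close(outs, expecteds, rtol=1e-4, atol=1e-4):
    """True when every pair passes; both sequences must have the same length."""
    if len(outs) != len(expecteds):
        raise ValueError("length mismatch")
    return all(is_close(o, e, rtol, atol) for o, e in zip(outs, expecteds))
```

Note that the relative term scales with the expected value, so large-magnitude outputs tolerate proportionally larger absolute errors than values near zero, where `atol` dominates.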
Original file line number Diff line number Diff line change
@@ -4,6 +4,8 @@
import copy
from pathlib import Path
import torch
sys.path.insert(0, os.path.join(os.environ.get("TORCHSIM_DIR", default="/workspace/PyTorchSim"), "tests"))
from _pytorchsim_utils import test_result

# recursive compile for some ops that are caused by graph break
torch.npu.register_eager_to_compile([
@@ -18,28 +20,6 @@
])


def test_result(name, out, cpu_out, rtol=1e-4, atol=1e-4):
    out_cpu = out.cpu()
    max_diff = (out_cpu - cpu_out).abs().max().item()
    mean_diff = (out_cpu - cpu_out).abs().mean().item()
    if torch.allclose(out_cpu, cpu_out, rtol=rtol, atol=atol):
        message = f"|{name} Test Passed|"
        print("-" * len(message))
        print(message)
        print("-" * len(message))
        print(f"Max absolute difference: {max_diff:.6f}")
        print(f"Mean absolute difference: {mean_diff:.6f}")
    else:
        message = f"|{name} Test Failed|"
        print("-" * len(message))
        print(message)
        print("-" * len(message))
        print("NPU out: ", out_cpu)
        print("CPU out: ", cpu_out)
        print(f"Max absolute difference: {max_diff:.6f}")
        print(f"Mean absolute difference: {mean_diff:.6f}")
        exit(1)


def _extract_logits(output):
    if isinstance(output, torch.Tensor):
Original file line number Diff line number Diff line change
@@ -9,23 +9,8 @@
from diffusers.models.upsampling import Upsample2D
from diffusers.models.resnet import ResnetBlock2D
from diffusers.models.embeddings import Timesteps

def test_result(name, out, cpu_out, rtol=1e-4, atol=1e-4):
    if torch.allclose(out.cpu(), cpu_out, rtol=rtol, atol=atol):
        message = f"|{name} Test Passed|"
        print("-" * len(message))
        print(message)
        print("-" * len(message))
    else:
        message = f"|{name} Test Failed|"
        print("-" * len(message))
        print(message)
        print("-" * len(message))
        print("custom out: ", out.cpu())
        print("cpu out: ", cpu_out)
        diff = torch.max(torch.abs(out.cpu() - cpu_out)).item()
        print(f"Max abs diff: {diff}")
        exit(1)
sys.path.insert(0, os.path.join(os.environ.get("TORCHSIM_DIR", default="/workspace/PyTorchSim"), "tests"))
from _pytorchsim_utils import test_result

@torch.no_grad()
def test_unet_conditional(
@@ -636,7 +621,6 @@ def test_timesteps(
parser.add_argument("--prompt", type=str, default="a cat in a hat")
args = parser.parse_args()

sys.path.append(os.environ.get("TORCHSIM_DIR", "/workspace/PyTorchSim"))
device = torch.device("npu:0")

#test_upsample2d(device)