Draft
44 commits
51ed880
Cover all branches in SoilBiogeochemCompetition baseline by default
samsrabin Apr 28, 2026
bf72976
100 iterations
samsrabin May 5, 2026
502cb98
Make --fast canonical config MIMICS-off; split baseline files
samsrabin May 6, 2026
9cd407b
Add baseline_checksum_fast.txt for canonical --fast config
samsrabin May 6, 2026
7a351b0
Update SoilBiogeochemCompetition README for new canonical --fast
samsrabin May 6, 2026
7c69527
Replace mimics_decomp - 1 with named constant non_mimics_decomp
samsrabin May 6, 2026
fcc0ff9
Polish comments in SoilBiogeochemCompetition
samsrabin May 6, 2026
9b4fa8a
Add verify.sh helper for SoilBiogeochemCompetition perf testing
samsrabin May 6, 2026
4ad11d0
Add perf_timers_mod + INNER_TIMING build plumbing
samsrabin May 6, 2026
d9d4b40
Wrap canonical-path loops in SoilBiogeochemCompetition with timers
samsrabin May 6, 2026
f250de0
Wire perf_timer_print + dump_csv into driver
samsrabin May 6, 2026
801a3e0
Document INNER_TIMING in SoilBiogeochemCompetition README
samsrabin May 6, 2026
36f1286
Extract accum_sminn_tot helper from Loop 15
samsrabin May 6, 2026
f029e05
Extract compute_nuptake_prof helper from Loop 16
samsrabin May 6, 2026
8f35204
Extract compete_nh4 helper from Loop 17 (Step 3c-i)
samsrabin May 6, 2026
f10d10d
Extract compete_no3 helper from Loop 17 (Step 3c-ii)
samsrabin May 6, 2026
8119b2d
Extract compute_n2o_emissions helper from Loop 17 (Step 3c-iii)
samsrabin May 6, 2026
afce646
Extract apply_carbon_only_adjustment helper from Loop 17 (Step 3c-iv)
samsrabin May 6, 2026
79cea8c
Extract compute_competition_summary helper from Loop 17 (Step 3c-v)
samsrabin May 6, 2026
c453a38
Extract accum_sminn_to_plant helper from Loop 18 (Step 3d)
samsrabin May 6, 2026
67770a5
Extract residual-uptake math helpers from Loop NH4 (Step 3e-i)
samsrabin May 6, 2026
85246ee
Reuse residual-uptake math helpers in NO3 second pass (Step 3e-ii)
samsrabin May 6, 2026
b4a2588
Extract Loop 23 immobilization sum; generalize dz-weighted helper (St…
samsrabin May 6, 2026
48d22f7
Extract compute_fraction_or_one helper from Loop 24 (Step 3g)
samsrabin May 6, 2026
0200b7b
Document Step 3 helper structure in SoilBiogeochemCompetition README
samsrabin May 6, 2026
be71daf
Add GPU run plumbing + CPU/GPU measurement docs (Step 4)
samsrabin May 6, 2026
4180ec4
Fix run_gpu.sh quoting; preserve --fast inner-timing CSV
samsrabin May 6, 2026
d58cebe
Add OpenACC directives + local data region to accum_sminn_tot (Step 5a)
samsrabin May 6, 2026
91eaedf
Hoist accum_sminn_tot inputs to driver-level !$acc data region (Step 5a)
samsrabin May 6, 2026
28f93ff
Relocate accum_sminn_tot data hoist into the routine (Step 5a)
samsrabin May 6, 2026
4d498f6
GPU-ify compute_nuptake_prof; widen inner data region (Step 5b)
samsrabin May 6, 2026
aa01961
Switch CPU-parallel baseline from -acc=multicore to OpenMP (-mp)
samsrabin May 6, 2026
cd865f2
Switch PBS queue from tutorial to develop; add build.sh + debug_gpu.sh
samsrabin May 7, 2026
b1249ea
GPU-ify main_competition; collapse data regions; track GPU baselines …
samsrabin May 7, 2026
9804f18
Fix Step 5c: sync nlimit_*/sum_*_demand_scaled to host; restore CPU b…
samsrabin May 7, 2026
5d2a0da
GPU-ify sum_sminn_to_plant; relocate update self past it (Step 5d)
samsrabin May 7, 2026
f780c1e
GPU-ify residual_uptake_nh4 + residual_uptake_no3 (Step 5e)
samsrabin May 7, 2026
df045d0
GPU-ify sum_immobilization (Step 5f)
samsrabin May 7, 2026
c8258ff
GPU-ify compute_fpg_fpi; canonical path now fully on GPU (Step 5g)
samsrabin May 7, 2026
80e823a
Update README with Step 5 results and OpenACC data-clause discipline
samsrabin May 7, 2026
544a0de
README: drop misleading data-region-hoist suggestion
samsrabin May 7, 2026
9b1eccc
Gate update self(sum_*_demand_scaled) on the mimics_decomp branch
samsrabin May 7, 2026
6808663
debug_gpu.sh: Add commented-out command for nsys call.
samsrabin May 7, 2026
cae0d7d
Wrap perf_timer call sites and use lines in #ifdef INNER_TIMING
samsrabin May 7, 2026
6 changes: 6 additions & 0 deletions perf_testing/.gitignore
@@ -2,3 +2,9 @@
 *.mod
 driver
 last_run.txt
+last_run_timings.csv
+last_run_timings_fast.csv
+sbgc_gpu.o*
+sbgc_gpu.e*
+sbgc_dbg.o*
+sbgc_dbg.e*
8 changes: 8 additions & 0 deletions perf_testing/Makefile.common
@@ -37,6 +37,14 @@ ifeq ($(TIMING),1)
 FFLAGS += -DPERF_TIMING
 endif
 
+# INNER_TIMING=1 turns on the per-loop perf_timers_mod instrumentation.
+# Independent of TIMING/PERF_TIMING (which controls the driver-level
+# system_clock around the whole iteration loop).
+INNER_TIMING ?= 0
+ifeq ($(INNER_TIMING),1)
+FFLAGS += -DINNER_TIMING
+endif
+
 driver: $(OBJ)
 	$(FC) $(FFLAGS) $(LDFLAGS) -o $@ $(OBJ)
 
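Because `TIMING` and `INNER_TIMING` are tested in separate conditionals, all four on/off combinations are valid. A minimal shell sketch that mirrors those two conditionals follows; the function name `compose_fflags` is hypothetical and exists only for this illustration, and the real logic stays in Makefile.common.

```shell
#!/bin/bash
# Illustration only: mirrors the two independent conditionals in
# Makefile.common. compose_fflags is a hypothetical helper, not part of the PR.
compose_fflags() {
    local timing="${1:-0}" inner_timing="${2:-0}" flags=""
    # TIMING=1 adds the driver-level timing define.
    if [ "$timing" = "1" ]; then flags="$flags -DPERF_TIMING"; fi
    # INNER_TIMING=1 adds the per-loop timer define, regardless of TIMING.
    if [ "$inner_timing" = "1" ]; then flags="$flags -DINNER_TIMING"; fi
    # Strip the single leading space left by the first append.
    printf '%s\n' "${flags# }"
}

compose_fflags 0 1   # prints -DINNER_TIMING
compose_fflags 1 1   # prints -DPERF_TIMING -DINNER_TIMING
```

On the command line the equivalent would be `make INNER_TIMING=1` or `make TIMING=1 INNER_TIMING=1`.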
12 changes: 9 additions & 3 deletions perf_testing/SoilBiogeochemCompetition/Makefile
@@ -1,6 +1,12 @@
-OBJ := SoilBiogeochemCompetition.o driver.o
+# Pick up the shared perf_timers_mod from the parent perf_testing/ dir.
+VPATH := ..
 
-# driver.F90 uses the SoilBiogeochemCompetition_mod module
-driver.o: SoilBiogeochemCompetition.o
+OBJ := SoilBiogeochemCompetition.o perf_timers_mod.o driver.o
+
+# Module-use ordering. driver.F90 uses SoilBiogeochemCompetition_mod;
+# both driver.F90 and SoilBiogeochemCompetition.F90 use perf_timers_mod
+# (start/stop calls inside the routine; print/dump calls in the driver).
+driver.o: SoilBiogeochemCompetition.o perf_timers_mod.o
+SoilBiogeochemCompetition.o: perf_timers_mod.o
 
 include ../Makefile.common
298 changes: 258 additions & 40 deletions perf_testing/SoilBiogeochemCompetition/README.md

Large diffs are not rendered by default.

778 changes: 591 additions & 187 deletions perf_testing/SoilBiogeochemCompetition/SoilBiogeochemCompetition.F90

Large diffs are not rendered by default.

8 changes: 3 additions & 5 deletions perf_testing/SoilBiogeochemCompetition/baseline_checksum.txt
@@ -1,9 +1,7 @@
+mode all
 ncol 8000
 nlevdecomp 10
 ndct 8
 numfc 8000
-niters 1
-use_nitrif_denitrif T
-carbon_only F
-decomp_method 2
-checksum 9.5970435393765438E+04
+niters 100
+checksum 7.6772246368780300E+07
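7 changes: 7 additions & 0 deletions (new file; header missing from this capture, presumably perf_testing/SoilBiogeochemCompetition/baseline_checksum_fast.txt per commit 9cd407b)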
@@ -0,0 +1,7 @@
mode fast
ncol 8000
nlevdecomp 10
ndct 8
numfc 8000
niters 100
checksum 9.5857105051752981E+06
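Both baseline files use a plain `key value` layout, so the checksum can be pulled out with a one-line `awk` filter. The sketch below is not the PR's verify.sh, whose contents are not shown here. The helper names `get_checksum` and `checksums_match` are hypothetical, and it assumes the run output (for example last_run.txt) uses the same layout as the baseline.

```shell
#!/bin/bash
# Hypothetical helpers, not the PR's verify.sh.

# Print the value of the "checksum" key from a "key value" file.
get_checksum() {
    awk '$1 == "checksum" { print $2 }' "$1"
}

# Return 0 when both files carry the same non-empty checksum string.
# Assumption: the run output has the same layout as the baseline file.
checksums_match() {
    local expected actual
    expected=$(get_checksum "$1")
    actual=$(get_checksum "$2")
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A call such as `checksums_match baseline_checksum_fast.txt last_run.txt && echo PASS` would then report a match. The string comparison is deliberately strict; whether the real helper tolerates last-digit differences is not visible in this diff.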
21 changes: 21 additions & 0 deletions perf_testing/SoilBiogeochemCompetition/build.sh
@@ -0,0 +1,21 @@
#!/bin/bash
# Source the project env file, then `make clean && make "$@"`.
# Use this instead of inlining `. ../env.sh && make ...` in shell commands —
# scripted entry points stay whitelistable across runs.
#
# Usage:
# ./build.sh # serial
# ./build.sh EXTRA_FFLAGS="-mp" # OpenMP
# ./build.sh EXTRA_FFLAGS="-acc=gpu -gpu=cc80" # GPU
# ./build.sh EXTRA_FFLAGS="-acc=gpu -gpu=cc80" INNER_TIMING=1
#
# Filter output (grep, tail, head, etc.) at the call site, not in here.

set -euo pipefail
cd "$(dirname "$0")"

# shellcheck disable=SC1091
. ../env.sh >/dev/null 2>&1

make clean
make "$@"
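The quoted `"$@"` in the last line is what keeps an argument such as `EXTRA_FFLAGS="-acc=gpu -gpu=cc80"` intact as a single word when it reaches `make`. A small sketch of the difference, where `count_args` stands in for `make` and both wrapper functions are hypothetical:

```shell
#!/bin/bash
# Illustration of why build.sh forwards its arguments with a quoted "$@".
# count_args stands in for `make`; none of these functions are in the PR.
count_args() { printf '%s\n' "$#"; }

# Quoted: each original argument is passed through unchanged.
forward_quoted() { count_args "$@"; }

# Unquoted: every argument is word-split again on whitespace.
# shellcheck disable=SC2068
forward_unquoted() { count_args $@; }

forward_quoted   'EXTRA_FFLAGS=-acc=gpu -gpu=cc80' INNER_TIMING=1   # prints 2
forward_unquoted 'EXTRA_FFLAGS=-acc=gpu -gpu=cc80' INNER_TIMING=1   # prints 3
```

In the unquoted case `-gpu=cc80` would arrive at `make` as a separate argument rather than as part of `EXTRA_FFLAGS`.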
49 changes: 49 additions & 0 deletions perf_testing/SoilBiogeochemCompetition/debug_gpu.sh
@@ -0,0 +1,49 @@
#!/bin/bash
# Debug helper: submit a PBS job that runs ./driver directly (no
# verify.sh grep filter) so any GPU runtime error is fully visible.
#
# Assumes ./driver was already built on the login node (e.g. via:
# make clean && make EXTRA_FFLAGS="-acc=gpu -gpu=cc80"
# ).
#
# Usage:
# ./debug_gpu.sh # runs ./driver --fast
# ./debug_gpu.sh --all # runs ./driver --all
# ./debug_gpu.sh --fast # explicit
#
# Output is written to ./sbgc_dbg.o<jobid> (gitignored) and cat'd here.

set -euo pipefail
cd "$(dirname "$0")"

driver_args="${*:---fast}"

job_id=$(qsub -W block=true \
    -A ucsg0003 -q develop \
    -l select=1:ncpus=1:ngpus=1 \
    -l walltime=00:05:00 \
    -N sbgc_dbg \
    -j oe \
    <<EOF
#!/bin/bash
cd "\$PBS_O_WORKDIR"
. ../env.sh
echo "=== nvidia-smi ==="
nvidia-smi || echo "(nvidia-smi failed)"
echo "=== ./driver $driver_args ==="
./driver $driver_args
# nsys profile -t cuda -o sbgcc_profile_report ./driver $driver_args
echo "=== exit status: \$? ==="
EOF
)

job_num=${job_id%%.*}
out_file="sbgc_dbg.o${job_num}"

echo "=== job: $job_id (output: $out_file) ==="
if [ -f "$out_file" ]; then
    cat "$out_file"
else
    echo "(output file $out_file not found)"
    exit 1
fi
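Two parameter expansions carry most of the logic in this script: `${*:---fast}` supplies a default argument, and `${job_id%%.*}` trims the server suffix from the id that `qsub` prints. The sketch below restates them as standalone functions. The function names and the sample job id `1234567.pbsserver` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers restating two expansions used in debug_gpu.sh.

# Fall back to --fast when no arguments are given, as in "${*:---fast}":
# the operator is ":-" and the default word is "--fast".
default_driver_args() {
    printf '%s\n' "${*:---fast}"
}

# With -j oe, PBS writes the joined output to <job name>.o<numeric id>.
# "%%.*" removes the longest suffix starting at the first dot, leaving
# only the numeric part of the job id.
pbs_output_file() {
    local job_name="$1" job_id="$2"
    printf '%s\n' "${job_name}.o${job_id%%.*}"
}

default_driver_args                          # prints --fast
default_driver_args --all                    # prints --all
pbs_output_file sbgc_dbg 1234567.pbsserver   # prints sbgc_dbg.o1234567
```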