tests: support multi-op perf groups in test-backend-ops #22934

Open
zzzzwc wants to merge 1 commit into ggml-org:master from zzzzwc:zwc/group-perf

zzzzwc (Contributor) commented May 11, 2026

Overview

  • Refactor eval_perf() to support fused multi-op benchmarking: perf_group_size() declares how many trailing nodes form the effective operation; only those are duplicated per run, setup nodes run once.
  • Add --perf-duration and --threads CLI options.
  • Add test_rms_norm_mul perf test cases for broadcast and non-broadcast configs.
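The grouping rule in the first bullet can be sketched as follows. This is a minimal illustration, not the code in this PR: `Node` and `build_perf_sequence` are hypothetical names, and the real implementation operates on ggml graph nodes rather than strings. It only shows the idea that the last `group_size` nodes are repeated for every run while everything before them appears once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node; the real code works on ggml tensors.
struct Node {
    std::string op;
};

// Build the node sequence for one timed graph:
// setup nodes appear once, the trailing `group_size` nodes are repeated `n_runs` times.
std::vector<Node> build_perf_sequence(const std::vector<Node> & nodes,
                                      std::size_t group_size,
                                      std::size_t n_runs) {
    assert(group_size >= 1 && group_size <= nodes.size());
    const auto n_setup = static_cast<std::ptrdiff_t>(nodes.size() - group_size);

    // Setup nodes: everything before the trailing group, emitted once.
    std::vector<Node> out(nodes.begin(), nodes.begin() + n_setup);

    // Effective operation: the trailing group, emitted once per run.
    for (std::size_t r = 0; r < n_runs; ++r) {
        out.insert(out.end(), nodes.begin() + n_setup, nodes.end());
    }
    return out;
}
```

With a group size of 1 this reduces to the single-op case, where only the last node is duplicated; a fused RMS_NORM + MUL case would use a group size of 2.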

Additional information

Verified that existing single-op perf tests show no regression.
The ADD operator shows the same runs-per-loop and memory-per-run on both branches, with a negligible performance difference.

master (389ff61):

ADD(type=f32,ne=[4096,1,1,1],nr=[1,1,1,1],nf=1,perm1=0,src_overlap=0):              343980 runs -     2.95 us/run -       48 kB/run -   15.50 GB/s
ADD(type=f32,ne=[4096,1,1,1],nr=[1,512,1,1],nf=1,perm1=0,src_overlap=0):              5464 runs -   194.47 us/run -    24576 kB/run -  120.52 GB/s

this PR:

ADD(type=f32,ne=[4096,1,1,1],nr=[1,1,1,1],nf=1,perm1=0,src_overlap=0):              335790 runs -     2.98 us/run -       48 kB/run -   15.37 GB/s
ADD(type=f32,ne=[4096,1,1,1],nr=[1,512,1,1],nf=1,perm1=0,src_overlap=0):              5464 runs -   195.81 us/run -    24576 kB/run -  119.70 GB/s
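As a sanity check on the figures above, the reported bandwidth can be recomputed from the other two columns. The sketch below assumes that "kB" means 1024 bytes and "GB" means 1024^3 bytes; that reading is an assumption, chosen because it reproduces the printed values, and the helper names are hypothetical.

```cpp
#include <cassert>

// Bandwidth from kB moved per run and microseconds per run.
// Assumes 1 kB = 1024 bytes and 1 GB = 1024^3 bytes.
double bandwidth_gb_per_s(double kb_per_run, double us_per_run) {
    assert(us_per_run > 0.0);
    const double bytes   = kb_per_run * 1024.0;
    const double seconds = us_per_run * 1e-6;
    return bytes / seconds / (1024.0 * 1024.0 * 1024.0);
}

// Relative slowdown of the PR timing against the baseline timing, in percent.
double slowdown_percent(double base_us, double pr_us) {
    assert(base_us > 0.0);
    return (pr_us - base_us) / base_us * 100.0;
}
```

For the broadcast case, 24576 kB at 194.47 us/run gives about 120.52 GB/s, matching the master output. The per-run times differ by roughly 1% (2.95 vs 2.98 us) and 0.7% (194.47 vs 195.81 us) between the two branches.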

Requirements

zzzzwc requested a review from ggerganov as a code owner May 11, 2026 07:09
zzzzwc changed the title from "tests: support multi-op perf groups and add RMS_NORM+MUL test" to "tests: support multi-op perf groups in test-backend-ops" May 11, 2026
github-actions bot added the "testing" (Everything test related) label May 11, 2026