Update the benchmark to launch one instance per (dataset, synthesizer) pair by R-Palazzo · Pull Request #611 · sdv-dev/SDGym

R-Palazzo · 2026-05-26T13:32:08Z

Resolve #605
86b9zz525

Tested for single-table here, it successfully launched 81 instances (9 synthesizers and 9 datasets).
Tested for multi-table here, it created 162 instances (54 datasets, 3 synthesizers).
We need to store the list of datasets and synthesizers to run the benchmarks on. I stored them in a Python file, but I can move them to YAML if we prefer. To include a new synthesizer/dataset, we would just need to update these lists.

codecov · 2026-05-26T13:39:09Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.79%. Comparing base (e3c2e90) to head (86d11d6).
⚠️ Report is 1 commits behind head on issue-604-2-private-bucket.

Additional details and impacted files

@@                      Coverage Diff                       @@
##           issue-604-2-private-bucket     #611      +/-   ##
==============================================================
- Coverage                       85.89%   85.79%   -0.11%     
==============================================================
  Files                              40       40              
  Lines                            3750     3723      -27     
==============================================================
- Hits                             3221     3194      -27     
  Misses                            529      529

Flag	Coverage Δ
integration	`44.18% <0.00%> (+0.31%)`	⬆️
unit	`81.57% <100.00%> (-0.14%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

fealho

This runs the UniformSynthesizer in every instance. So for example, for the 81 single table runs, it ran 81 times. Is this intended?

Also, do we want to change this? I don't think it affects the monthly run, because the config returns before reaching that line, but if you pass datasets/synthesizers, it still creates only 1 instance.

And could you double check the CLI path still works? I think maybe it depended on the yaml files you deleted. Something like this:

python sdgym/_benchmark_launcher/script.py --modality single_table --output-destination s3://bucket/test --synthesizers CTGANSynthesizer

amontanez24

This looks good. I just want to confirm something. Does the instance that runs the workflow to launch the jobs download all the datasets for all jobs, or only the one for the job it launches?

I understand we are doing one job = 1 synthesizer and dataset, but I want to make sure the instance that launches the job isn't downloading too much

R-Palazzo · 2026-05-27T16:02:47Z

This looks good. I just want to confirm something. Does the instance that runs the workflow to launch the jobs download all the datasets for all jobs, or only the one for the job it launches?

I understand we are doing one job = 1 synthesizer and dataset, but I want to make sure the instance that launches the job isn't downloading too much

@amontanez24, yes, that was the case. The instance that runs the workflow was downloading all the datasets. I tried adding the rel-bench dataset and the workflow crashed when trying to download the data:

https://github.com/sdv-dev/SDGym/actions/runs/26508750022/job/78067938428

To avoid this in this PR I moved the dataset loading during execution so it's only the gcp instances that will load the data. I'm running some test and check that it works end-to-end with the rel-bench datasets:

R-Palazzo · 2026-05-28T10:55:01Z

This runs the UniformSynthesizer in every instance. So for example, for the 81 single table runs, it ran 81 times. Is this intended?

Also, do we want to change this? I don't think it affects the monthly run, because the config returns before reaching that line, but if you pass datasets/synthesizers, it still creates only 1 instance.

And could you double check the CLI path still works? I think maybe it depended on the yaml files you deleted. Something like this:
python sdgym/_benchmark_launcher/script.py --modality single_table --output-destination s3://bucket/test --synthesizers CTGANSynthesizer

Hi @fealho thanks for the review!

Yes, it is intended for the UniformSynthesizer. However, when we aggregate the results to generate the monthly leaderboard, only one UniformSynthesizer per dataset is retained and used to compute Adjusted_Total_Time and Adjusted_Quality_Score. So it’s consistent between all synthesizers and datasets.
Good catch for the scripty.py, I updated so it's always one instance per job
Tested the CLI here, running:

python sdgym/_benchmark_launcher/script.py --modality single_table --output-destination s3://sdgym-benchmark/Debug/Issue_605_CLI --synthesizers GaussianCopulaSynthesizer

R-Palazzo requested review from amontanez24 and fealho May 26, 2026 13:32

R-Palazzo requested a review from a team as a code owner May 26, 2026 13:32

R-Palazzo removed the request for review from a team May 26, 2026 13:32

fealho requested changes May 26, 2026

View reviewed changes

amontanez24 reviewed May 26, 2026

View reviewed changes

R-Palazzo force-pushed the issue-605-single-job-per-instance branch from ea2f9bc to c378d46 Compare May 27, 2026 14:49

R-Palazzo force-pushed the issue-604-2-private-bucket branch from 95be93a to 9a91033 Compare May 27, 2026 14:53

R-Palazzo force-pushed the issue-605-single-job-per-instance branch 2 times, most recently from ba22fa7 to fdb0a93 Compare May 27, 2026 15:01

R-Palazzo force-pushed the issue-604-2-private-bucket branch from 725c22c to 5555441 Compare May 28, 2026 09:27

R-Palazzo added 6 commits May 28, 2026 11:00

def 605

fba2cf7

unit tests

e42ee3f

test on rel-bench dataset

e4126fe

pip install from the current branch

bd41677

fix script.py: always launch 1 instance per job in CLI

764c106

undo changes for testing

5cb84b0

R-Palazzo force-pushed the issue-605-single-job-per-instance branch from 19c7c43 to 5cb84b0 Compare May 28, 2026 10:30

undo workflow change

86d11d6

R-Palazzo requested review from amontanez24 and fealho May 28, 2026 11:00

fealho approved these changes May 28, 2026

View reviewed changes

amontanez24 approved these changes May 28, 2026

View reviewed changes

R-Palazzo requested a review from sarahmish May 29, 2026 09:11

R-Palazzo changed the base branch from issue-604-2-private-bucket to main May 29, 2026 09:14

R-Palazzo changed the base branch from main to issue-604-2-private-bucket May 29, 2026 09:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update the benchmark to launch one instance per (dataset, synthesizer) pair#611

Update the benchmark to launch one instance per (dataset, synthesizer) pair#611
R-Palazzo wants to merge 7 commits into
issue-604-2-private-bucketfrom
issue-605-single-job-per-instance

R-Palazzo commented May 26, 2026 •

edited

Loading

Uh oh!

codecov Bot commented May 26, 2026 •

edited

Loading

Uh oh!

fealho left a comment

Uh oh!

amontanez24 left a comment

Uh oh!

R-Palazzo commented May 27, 2026 •

edited

Loading

Uh oh!

R-Palazzo commented May 28, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

R-Palazzo commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

fealho left a comment

Choose a reason for hiding this comment

Uh oh!

amontanez24 left a comment

Choose a reason for hiding this comment

Uh oh!

R-Palazzo commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

R-Palazzo commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

R-Palazzo commented May 26, 2026 •

edited

Loading

codecov Bot commented May 26, 2026 •

edited

Loading

R-Palazzo commented May 27, 2026 •

edited

Loading

R-Palazzo commented May 28, 2026 •

edited

Loading