Dask improvements #38

Merged
JamesMcClung merged 11 commits into main from dask-improvements on May 20, 2026

Conversation

@JamesMcClung
Owner

Add more dask config options. Also add a hamscan test and related bug fix.

JamesMcClung and others added 11 commits May 20, 2026 15:06
Now that PSC_PLOT_DASK_SCHEDULER=distributed is supported in cli.py,
distributed should be a real dependency, not pip-install-it-yourself.
The import in cli.py stays lazy so the threads-only default doesn't
pay the tornado/zict import cost on every startup.

Co-Authored-By: Claude <noreply@anthropic.com>
Extract the preprocess lambda to a module-level function so it survives
pickling for dask's processes scheduler.
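The pickling constraint behind this commit can be sketched with the standard-library `pickle` module. This is only an illustration: `preprocess`, `inline_preprocess`, and `survives_pickling` are hypothetical names rather than the project's code, and whether a given dask scheduler hits this limitation depends on the serializer it uses.

```python
import pickle


def preprocess(record):
    """Hypothetical stand-in for the real preprocess step."""
    return {**record, "seen": True}


# The same logic as an anonymous function. Standard pickle stores functions
# by qualified name, and "<lambda>" cannot be looked up again on the module.
inline_preprocess = lambda record: {**record, "seen": True}  # noqa: E731


def survives_pickling(func):
    """Return True if func round-trips through standard-library pickle."""
    try:
        pickle.loads(pickle.dumps(func))
        return True
    except (pickle.PicklingError, AttributeError):
        return False
```

When run as a script, `survives_pickling(preprocess)` should succeed because the function is importable by name, while `survives_pickling(inline_preprocess)` should not.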
New PSC_PLOT_DASK_SCHEDULER env var, threaded through CONFIG so cli.py
can call dask.config.set(scheduler=...) when set. Empty/unset leaves
dask's default in place.

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
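A minimal sketch of how the `PSC_PLOT_DASK_SCHEDULER` handling described above might look. The helper names `scheduler_from_env` and `apply_scheduler` are assumptions for illustration, and the sketch reads the environment directly instead of going through the project's CONFIG object.

```python
import os

ENV_VAR = "PSC_PLOT_DASK_SCHEDULER"


def scheduler_from_env(environ=os.environ):
    """Return the requested scheduler name, or None if empty or unset."""
    value = environ.get(ENV_VAR, "").strip()
    return value or None


def apply_scheduler(scheduler):
    """Set dask's scheduler only when one was explicitly requested."""
    if scheduler is None:
        return  # leave dask's default in place
    import dask  # imported here so the sketch loads without dask installed

    dask.config.set(scheduler=scheduler)
```

A caller would then run something like `apply_scheduler(scheduler_from_env())` once at startup.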
When PSC_PLOT_DASK_SCHEDULER=distributed, spin up a LocalCluster with
n_workers=num_workers, one thread per worker, real processes. Workers
are persistent across the run so the per-task spawn cost amortizes.

Requires 'distributed' to be pip-installed; lazy import so the import
error only fires when the scheduler is explicitly requested.

Co-Authored-By: Claude <noreply@anthropic.com>
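The distributed setup described above could look roughly like the following. This is a sketch under assumptions, not the actual cli.py code: `local_cluster_kwargs` and `start_distributed_client` are hypothetical helpers, while `LocalCluster` and `Client` are the real classes from the `distributed` package.

```python
def local_cluster_kwargs(num_workers):
    """Build LocalCluster arguments: one single-threaded process per worker."""
    return {
        "n_workers": num_workers,
        "threads_per_worker": 1,
        "processes": True,
    }


def start_distributed_client(num_workers):
    """Start a persistent local cluster and return a connected client."""
    try:
        # Lazy import: only paid when the distributed scheduler is requested.
        from distributed import Client, LocalCluster
    except ImportError as exc:
        raise RuntimeError(
            "the 'distributed' package is required for this scheduler"
        ) from exc

    cluster = LocalCluster(**local_cluster_kwargs(num_workers))
    return Client(cluster)
```

Because the workers are real processes, a script using this would normally call `start_distributed_client` under an `if __name__ == "__main__":` guard and close the client when the run finishes.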
prt-bin-time benches show ~2.5x wall speedup from 1 -> cpu_count threads
on the particle binning workload (the case that's actually CPU-bound).
No regression on field scenarios. Drops the nag-warning since most users
want the parallel default.

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
@JamesMcClung added the labels bug (Something isn't working), enhancement (New feature or request), and optimization (Improves performance) on May 20, 2026
@JamesMcClung merged commit c0846e4 into main on May 20, 2026
2 checks passed
@JamesMcClung deleted the dask-improvements branch on May 20, 2026 at 19:11