feat(config): SLURM job-array submission for per-sample fan-out stages #503
Open
tpall wants to merge 1 commit into WrightonLabCSU:dev from
Adds two params and a withName directive so multi-sample runs on SLURM
collapse N sbatches into one job array per stage, sparing fair-share
priority on shared clusters:
params.array_size (default 0 — disabled; local-executor safe)
params.queue_size (already existed; default 10)
process {
    withName: 'DRAM:ANNOTATE:CALL:.*|DRAM:ANNOTATE:DB_SEARCH:.*|DRAM:ANNOTATE:QC:COLLECT.*' {
        array = params.slurm ? params.array_size : 0
    }
}
The directive is gated on params.slurm so the local / standard executor
always sees array = 0 (a no-op) — only --slurm runs honour the user's
array_size override. The selector covers every per-sample fan-out stage
(CALL, DB_SEARCH, and any current or future QC:COLLECT_* subworkflow).
The same Nextflow #6108 caveat applies: an intermittent
ConcurrentModificationException can occur when arrays are combined with
Singularity; it is recoverable via -resume.
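To see which processes the selector would touch, the pattern can be tried against a few names in plain Groovy. This is a minimal sketch: it assumes Nextflow applies a withName pattern as a whole-name regular-expression match (mirrored here by Groovy's `==~` operator), and the process names after each prefix are hypothetical, chosen only for illustration.

```groovy
// Selector copied from the withName block above.
def selector = 'DRAM:ANNOTATE:CALL:.*|DRAM:ANNOTATE:DB_SEARCH:.*|DRAM:ANNOTATE:QC:COLLECT.*'

// Hypothetical fully qualified process names, for illustration only.
def candidates = [
    'DRAM:ANNOTATE:CALL:CALL_GENES',
    'DRAM:ANNOTATE:DB_SEARCH:KEGG_SEARCH',
    'DRAM:ANNOTATE:QC:COLLECT_RNA:SCAN',
    'DRAM:ANNOTATE:QC:OTHER_STEP',
    'DRAM:DISTILL:SUMMARIZE',
]

// ==~ is a full match: the whole name must fit one of the alternatives.
def matched = candidates.findAll { it ==~ selector }
matched.each { println "array applies to: ${it}" }
```

Because `COLLECT.*` has no trailing colon, it matches both a process whose own name starts with COLLECT directly under QC and any process nested inside a COLLECT_* subworkflow.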
This was referenced May 8, 2026
Summary
Adds optional SLURM job-array submission for the heavy per-sample fan-out stages of ANNOTATE (CALL, DB_SEARCH, QC:COLLECT_*). With many input fastas, this collapses N individual sbatch calls into one job array per stage, sparing fair-share priority on shared clusters.
Supersedes #472. This PR is the same author's earlier work, narrowed to only the array-support change, with @madeline-scyphers' three review comments addressed in the design from the start. Other improvements that were bundled into #472 (Channel→channel cleanup, rRNA/tRNA distill fix, fasta gzip support, …) will come up as separate focused PRs.
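How far the sbatch count drops depends on array_size relative to the number of samples. The sketch below assumes one task per sample per stage and that Nextflow groups a stage's tasks into arrays of at most array_size tasks, with a smaller final array for any remainder. Under those assumptions a stage becomes a single array only when the sample count does not exceed array_size. The helper name and the sample counts are illustrative and not part of the PR.

```groovy
// Estimate sbatch submissions for one fan-out stage (illustrative helper).
int submissionsPerStage(int samples, int arraySize) {
    if (arraySize <= 0) {
        // array_size = 0 disables arrays: one sbatch per sample.
        return samples
    }
    // Integer ceiling division: full arrays plus one partial array if needed.
    return (samples + arraySize - 1).intdiv(arraySize)
}

println submissionsPerStage(100, 0)    // arrays disabled: 100
println submissionsPerStage(100, 20)   // arrays of 20: 5
println submissionsPerStage(15, 20)    // fewer samples than array_size: 1
```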
What changed
+19 lines, 3 files:
- nextflow.config: new params.array_size = 0 (default disabled, local-executor safe).
- nextflow_schema.json: schema entry for array_size so the param validates.
- conf/modules.config: single withName block, gated on params.slurm (the block quoted in the commit message above).
How this addresses @madeline-scyphers' review comments on #472
- array = params.slurm ? params.array_size : 0. Without --slurm the directive evaluates to 0, a no-op. Default array_size = 0 further protects fresh users.
- "… DRAM:ANNOTATE:QC:COLLECT.* could also have a job array": … COLLECT.* so any current or future QC:COLLECT_* subworkflow picks it up.
- "… base.config for the job array should probably be in modules.config": … modules.config; base.config is untouched by this PR.
Relationship to feature/condense-job-submission-using-collate
I noticed there's a feature/condense-job-submission-using-collate branch that takes a different (more invasive) approach to the same problem: concatenating per-sample fastas and running them as one task per group, reducing sbatch count by batching inputs rather than arraying tasks. That work has been dormant since 2025-07-22 (~10 months). The two approaches are orthogonal: array submits N tasks as a single sbatch; collation runs N inputs as a single task. They could coexist if the collation work resumes.
This PR is small enough (+19 lines) that it shouldn't conflict with future collation work; if collation does land, the array block in modules.config can be deleted in one commit.
Caveat
There's an open Nextflow issue (nextflow-io/nextflow#6108): an intermittent ConcurrentModificationException when arrays combine with Singularity. It's recoverable via -resume. Default array_size = 0 keeps users out of the race unless they opt in.
Test plan
- nextflow inspect main.nf parses cleanly.
- --slurm --array_size 20 --queue_size 200: array jobs appear in squeue and complete normally.
🤖 Generated with Claude Code