Conversation
Oxford Nanopore dorado basecaller module with automatic model selection, modified base calling (5mCG_5hmCG via combined model syntax), and optional alignment to a reference genome. Uses local SIF for Track 1 (lab use); Track 2 upstream blocked pending dorado bioconda package. Tests: 2 stub (CPU) + 2 real GPU tests (NVIDIA L40S, componc_gpu_batch). Uses --profile singularity,gpu to expose GPU via --nv flag. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a stub test using the public HG002 GIAB pod5 from nf-core/test-datasets (PR nf-core/test-datasets#1968). Test references the file via modules_testdata_base_path so it resolves once that PR merges. Also commits the 2.2 MB 10-read pod5 subset locally for development. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Thin wrapper over nanoporetech/dorado:sha... adding version labels. Used by `wave freeze --dockerfile Dockerfile` to obtain a stable community.wave.seqera.io URI for nf-core upstream submission. See sandbox/TRACK2_wave_freeze.md for the full runbook. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
nf-core/parabricks modules use nvcr.io/nvidia/... directly without bioconda. Following same pattern: use nanoporetech/dorado:sha... from Docker Hub. Removes the singularity/docker ternary — single container field only. Local SIF override kept in tests/nextflow.config for MSKCC HPC testing. Tracking semantic version tags: nanoporetech/dorado#1584. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Container: sahuno/dorado:1.4.0 (wraps nanoporetech/dorado v1.4.0 + samtools)
- meta.yml: document that dorado outputs SO:unknown; advise SAMTOOLS_SORT + SAMTOOLS_INDEX downstream (confirmed with GIAB HG002 10-read test on A100)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
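The downstream step recommended in the meta.yml note can be sketched in shell. This is a minimal illustration, not part of the module: `calls.bam` and `calls.sorted.bam` are placeholder file names, `sam_sort_order` and `sort_and_index` are hypothetical helpers, and `samtools` is assumed to be on `PATH`.

```shell
#!/bin/sh
# Print the SO (sort order) value from a SAM header read on stdin.
# Prints nothing if the @HD line or its SO tag is missing.
sam_sort_order() {
    awk -F'\t' '/^@HD/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SO:/) { print substr($i, 4); exit }
    }'
}

# Coordinate-sort a BAM and build its index (hypothetical helper).
# $1 = input BAM, $2 = sorted output BAM.
sort_and_index() {
    samtools sort -o "$2" "$1" && samtools index "$2"
}

# Example (placeholder file names; requires samtools):
#   samtools view -H calls.bam | sam_sort_order
#   sort_and_index calls.bam calls.sorted.bam
```

Checking the header first makes it easy to confirm that a basecalled BAM really reports `SO:unknown` before deciding to sort it.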
- Correct model input YAML structure (remove extra nesting level)
- Add topics: section for version broadcast
- Fix licence list format
- Add edam ontology for summary TSV output

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch from sahuno/dorado:1.4.0 wrapper to the upstream nanoporetech/dorado:shac8f356489fa8b44b31beba841b84d2879de2088e (v1.4.0). Same approach as parabricks modules using nvcr.io/nvidia directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- tests/nextflow.config: remove hardcoded local SIF container override, local models_dir, and /data1/greenbab reference paths
- tests/main.nf.test: stub test 2 uses [[], [], []] for reference (stub block doesn't use reference; real aligned test is GPU-tagged); GPU test with reference uses params.modules_testdata_base_path
- Remove GIAB Track 2 stub test (pending nf-core/test-datasets#1968)
- Update snapshots

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ext.device is not an allowed ext key in nf-core modules (only ext.args, ext.prefix, ext.when are permitted). Hardcode --device cuda:all directly in the script block. Users needing a different device string can pass it via ext.args (e.g. --device cpu for testing). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        dorado: 1.4.0
    END_VERSIONS
Suggested change:

    - cat <<-END_VERSIONS > versions.yml
    - "${task.process}":
    -     dorado: 1.4.0
    - END_VERSIONS
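As an alternative to a hardcoded `dorado: 1.4.0`, the version string could be derived from the tool at run time. The sketch below is only an illustration: `extract_semver` is a hypothetical helper, and it assumes that `dorado --version` prints text containing a MAJOR.MINOR.PATCH triple, which this thread does not confirm.

```shell
#!/bin/sh
# Extract the first MAJOR.MINOR.PATCH triple from text on stdin.
# Prints nothing if no such triple is present.
extract_semver() {
    grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Example (assumes dorado is installed; 2>&1 captures the version
# whether it is printed on stdout or stderr):
#   dorado --version 2>&1 | extract_semver
```

A one-line command of this shape is the kind of expression that could be passed to a version-reporting `eval()` output, so the reported version follows the container instead of a literal in the script.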
    output:
    tuple val(meta), path("*.bam")         , emit: bam
    tuple val(meta), path("*_summary.tsv"), emit: summary , optional: true
    tuple val(meta), path("*.log")         , emit: log     , optional: true
dorado basecaller doesn't have any option to write the log to a file, so it's not possible to get this output.
    // Docker Hub image directly — same pattern as nf-core/parabricks modules
    // (nvcr.io/nvidia/...). SHA tag pins to v1.4.0; a semver tag is tracked in
Suggested change:

    - // Docker Hub image directly — same pattern as nf-core/parabricks modules
    - // (nvcr.io/nvidia/...). SHA tag pins to v1.4.0; a semver tag is tracked in
    + // Docker Hub image directly. SHA tag pins to v1.4.0; a semver tag is tracked in
This config is not used. You're not loading it in main.nf.test.
So, if it's not necessary, then I would remove it.
    // (nvcr.io/nvidia/...). SHA tag pins to v1.4.0; a semver tag is tracked in
    // nanoporetech/dorado#1584.
    conda null
    container "nanoporetech/dorado:shac8f356489fa8b44b31beba841b84d2879de2088e"
Suggested change:

    - container "nanoporetech/dorado:shac8f356489fa8b44b31beba841b84d2879de2088e"
    + container "docker.io/nanoporetech/dorado:shac8f356489fa8b44b31beba841b84d2879de2088e"
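The suggestion above makes the registry explicit. As a rough illustration of the distinction, the sketch below prefixes `docker.io/` only when an image reference carries no registry host. It relies on the common Docker convention that the first path component is a registry when it contains a dot or a colon or equals `localhost`. `qualify_image` is a hypothetical helper, not part of nf-core tooling, and it does not add the `library/` namespace that Docker applies to single-name official images.

```shell
#!/bin/sh
# Return an image reference with an explicit registry host.
# $1 = image reference, e.g. "org/image:tag" or "host.tld/org/image:tag".
qualify_image() {
    ref="$1"
    first="${ref%%/*}"   # text before the first slash
    if [ "$first" = "$ref" ]; then
        # No slash at all: a bare name such as "image:tag".
        printf 'docker.io/%s\n' "$ref"
        return
    fi
    case "$first" in
        # First component looks like a registry host: leave unchanged.
        *.*|*:*|localhost) printf '%s\n' "$ref" ;;
        # Otherwise it is an organisation on Docker Hub.
        *) printf 'docker.io/%s\n' "$ref" ;;
    esac
}
```

With this rule, `nanoporetech/dorado:<tag>` gains the `docker.io/` prefix, while a reference that already starts with `nvcr.io/` is returned unchanged.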
Don't modify this. It shouldn't be necessary.
Thanks for the review @dialvarezs! All points addressed in 8779494:
Regarding the container upload to the nf-core org: happy to pursue that. Should I open a request with ONT to mirror [...]

Lint: 51 passed, 0 failures (4 expected warnings for Docker Hub / SHA tag / process_gpu label).
It seems that you forgot to push the changes.
Summary
- `DORADO_BASECALLER` process for Oxford Nanopore basecalling with dorado

Key design decisions

Container: no bioconda/BioContainers (ONTPL licence)
dorado is not on bioconda due to its Oxford Nanopore commercial licence. The module uses the official `nanoporetech/dorado` Docker Hub image directly (SHA-pinned to v1.4.0), the same pattern used by parabricks modules (nvcr.io/nvidia/...). `conda null` is set with a comment. ONT does not publish semver Docker tags yet (tracked in nanoporetech/dorado#1584); the SHA will be replaced with a version tag when available.

GPU label: `process_gpu`
The module requires a CUDA GPU and uses `label 'process_gpu'`. This label is defined in `nf-core/configs` but is flagged as non-standard by lint. Happy to discuss the right approach on the PR.

CI tests: stub only (GPU not available on GitHub Actions)
- Tests tagged `gpu` are excluded from CI but pass locally on a SLURM GPU node

Test data
- `tests/data/test.pod5`: minimal synthetic pod5 for stub tests
- `tests/data/HG002_PAW70337_giab_10reads.pod5`: 10-read GIAB HG002 public ONT data (pending "Add pod5 test data: HG002 ONT 10-read subset (GIAB PAW70337, 5kHz R10.4.1)", test-datasets#1968, for upstream hosting; included locally for GPU test validation)

Lint results

PR checklist
- `topic: versions` using `eval()` pattern
- [...] (`DORADO_BASECALLER`)
- [...] (`process_gpu`)
- `conda null`: bioconda not possible (ONTPL licence), documented
- `nf-core modules test dorado/basecaller --profile docker`: pending (requires Docker daemon + GPU)
- `nf-core modules test dorado/basecaller --profile singularity`: stub tests pass locally; real GPU tests require compute node

🤖 Generated with Claude Code