HISAT2.wdl: replace output command substitutions with explicit fifo/wait #233
The hisat2 tasks stream output into samtools to avoid materializing a giant text SAM file on the scratch disk. This is a good idea, but it's implemented in a slightly risky way, using an output command substitution (bash process substitution) like

    hisat2 ... -S >(samtools view -o output.bam ...)

In this construct samtools is spawned as a background process, and bash does not wait for it before proceeding to the next command or exiting at the end of the script. Furthermore, according to this Q&A, it does not even provide a way to wait for it! This creates a race condition where the next step is liable to start reading a partial BAM file, and the runtime system may even output a truncated file (cf. chanzuckerberg/miniwdl#211).
Here we replace the output command substitutions with a less elegant but hopefully reliable construct: samtools reads from a named pipe (fifo) as a background job, which allows us to explicitly wait for it before proceeding.
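For readers unfamiliar with the fifo/wait pattern named in the title, here is a minimal sketch of its general shape. It is not the code from this PR's diff: `seq` stands in for hisat2 and `gzip` for samtools so the snippet runs anywhere, and the variable names (`workdir`, `fifo`, `out`, `consumer_pid`) are hypothetical.

```shell
#!/bin/sh
set -eu

# Assumption: stand-in commands only. `seq` plays the producer (hisat2 writing
# to its -S target) and `gzip` plays the consumer (samtools view).
workdir="$(mktemp -d)"
fifo="$workdir/stream.fifo"
out="$workdir/output.gz"

# Create the named pipe that the producer will treat as its output file.
mkfifo "$fifo"

# Start the consumer in the background, reading from the fifo, and record its
# PID so it can be waited on explicitly.
gzip -c < "$fifo" > "$out" &
consumer_pid=$!

# Run the producer in the foreground, writing into the fifo.
seq 1 100000 > "$fifo"

# Block until the consumer has drained the pipe and exited. `wait` returns the
# consumer's exit status, so under `set -e` a consumer failure fails the script.
wait "$consumer_pid"

rm -f "$fifo"

# Only now is the output guaranteed to be complete.
lines="$(gzip -dc "$out" | wc -l | tr -d ' ')"
echo "lines=$lines"
```

In the actual tasks the producer would be the hisat2 command with `-S` pointed at the fifo and the consumer would be `samtools view -o output.bam` reading from it; the remaining arguments are elided in this description and are not reproduced here.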