███████╗███╗ ███╗██████╗ ██╗██████╗ ██████╗ ███████╗██████╗
██╔════╝████╗ ████║██╔══██╗██║██╔══██╗██╔══██╗██╔════╝██╔══██╗
█████╗ ██╔████╔██║██████╔╝██║██████╔╝██████╔╝█████╗ ██████╔╝
██╔══╝ ██║╚██╔╝██║██╔══██╗██║██╔═══╝ ██╔══██╗██╔══╝ ██╔═══╝
██║ ██║ ╚═╝ ██║██║ ██║██║██║ ██║ ██║███████╗██║
╚═╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═╝╚═╝ ╚═╝ ╚═╝╚══════╝╚═╝
██╗ ██╗ ██████╗ ██████╗ ██╗ ██╗██████╗ ███████╗███╗ ██╗ ██████╗██╗ ██╗
██║ ██║██╔═══██╗██╔══██╗██║ ██╔╝██╔══██╗██╔════╝████╗ ██║██╔════╝██║ ██║
██║ █╗ ██║██║ ██║██████╔╝█████╔╝ ██████╔╝█████╗ ██╔██╗ ██║██║ ███████║
██║███╗██║██║ ██║██╔══██╗██╔═██╗ ██╔══██╗██╔══╝ ██║╚██╗██║██║ ██╔══██║
╚███╔███╔╝╚██████╔╝██║ ██║██║ ██╗██████╔╝███████╗██║ ╚████║╚██████╗██║ ██║
╚══╝╚══╝ ╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝ ╚══════╝╚═╝ ╚═══╝ ╚═════╝╚═╝ ╚═╝
This work-in-progress (WIP) repo transforms the Stanford Memory Lab's (SML) internal fMRI preprocessing scripts into a generalizable toolbox that promotes consistency within and across lab projects.
As such, this repo is intended to be used as a GitHub template for setting up fMRI preprocessing pipelines that handle:
- 1. FlyWheel → Server: Automated transfer of scanner acquisitions from FlyWheel to server
- 2. DICOM → BIDS: `dcm2niix` conversion (converts raw DICOMs to BIDS format via heudiconv)
- 3. Prep for fMRIPrep: Dummy scan removal + fieldmap susceptibility distortion correction setup
- 4. QC Metadata: Verify DICOM → NIfTI → BIDS metadata conversion
- 5. QC Volumes: Verify number of volumes per scan file matches expected counts
- 6. fMRIPrep Anat-Only: Run fMRIPrep anatomical workflows only (for manual FreeSurfer editing)
- 7. Download FreeSurfer: Download FreeSurfer outputs for manual surface editing
- 8. Upload FreeSurfer: Upload edited FreeSurfer outputs back to server (with automatic backup)
- 9. fMRIPrep Full: Run full fMRIPrep workflows (anatomical + functional)
- 10. FSL GLM Setup: Setup statistical model for FSL FEAT analysis
- 11. FSL Level 1: Run Level 1 GLM analysis (individual runs)
- 12. FSL Level 2: Run Level 2 GLM analysis (subject-level)
- 13. FSL Level 3: Run Level 3 GLM analysis (group-level)
- 14. Tarball Utility: Optimize inode usage by archiving sourcedata directories
- Future: Automated HDF5 file management and compression
Note
- indicates workflows that have been finished and validated
- indicates workflows that are still under active development
Full documentation is available on ReadTheDocs.
For quick reference, see:
- Installation Guide
- Configuration Guide
- Usage Guide
- Workflows Guide
- Changelog
- Release Process
- Contributing Guidelines
- Click the "Use this template" button at the top of this repository
- Select "Create a new repository"
- Choose a name for your repository
- Select whether you want it to be public or private
- Click "Create repository from template"
This will create a new repository with all the files from this template, allowing you to customize it for your specific preprocessing needs while maintaining the core functionality for handling:
- Fieldmap-based distortion correction
- Dummy scan removal
- BIDS-compliance
- JSON metadata management
- Quality control checks
- FSL FEAT statistical analysis
- FreeSurfer manual editing workflows
The template provides a standardized structure and validated scripts that you can build upon, while keeping your study-specific parameters and paths separate in configuration files. Specifically, the template includes:
- Preprocessing scripts for handling fieldmaps and dummy scans
- Configuration templates and examples
- Documentation and usage guides
- Quality control utilities
- BIDS metadata management tools
- FSL FEAT statistical analysis pipeline (Level 1, 2, 3 GLM)
- FreeSurfer manual editing utilities (download/upload with safety features)
- An interactive terminal user interface (TUI) launcher for triggering pipeline steps
After creating your repository from this template:
- Clone your new repository
- Copy `config.template.yaml` to `config.yaml` and customize parameters
- Modify paths and scan parameters for your study
- Copy `all-subjects.template.txt` to `all-subjects.txt` and add your subject IDs
- Follow the Configuration Guide for detailed setup instructions (the first few commands are sketched below)
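For reference, the initial setup commands might look like the following; the repository URL and file locations are placeholders, so adjust them to match your clone:

```bash
# Placeholder URL/paths; adapt to wherever the template files live in your copy
git clone git@github.com:<your-org>/<your-new-repo>.git
cd <your-new-repo>
cp config.template.yaml config.yaml             # then edit paths + scan parameters
cp all-subjects.template.txt all-subjects.txt   # then add your subject IDs
```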
The preprocessing pipeline requires proper configuration of several parameters to handle your study's specific requirements. This guide explains how to set up the config.yaml file that controls the pipeline's behavior.
Important
There are two ways to trigger each preprocessing step once your config.yaml is properly configured:
- Use the provided TUI `launcher` executable, which opens an interactive popup window with more context and explanations, plus interactive parameter setting (as needed) for any given step.
- Manually run each step's sidecar executable: for each core step directory (e.g., `01-prepare`), there is an associated sidecar executable (e.g., `01-run.sbatch`).
Note: The launcher from the first option simply calls these sidecar executables; its added context and interactivity may be more comfortable for users less familiar with running commands in the terminal.
Thus, from the root of your project scripts directory, you can either call:
./launch

or invoke each step's sidecar executable directly:

# Step 1: FlyWheel download
./01-run.sbatch <fw_subject_id> <fw_session_id> <new_bids_subject_id>
# Step 2: dcm2niix BIDS conversion
./02-run.sbatch <fw_session_id> <new_bids_subject_id> [--skip-tar]
# Step 3: Prep for fMRIPrep
./03-run.sbatch
# Step 4: QC - verify metadata
./04-run.sbatch
# Step 5: QC - verify volume counts
./05-run.sbatch
# Step 6: fMRIPrep anatomical workflows only
./06-run.sbatch
# Step 7: Download FreeSurfer outputs for manual editing
./toolbox/download_freesurfer.sh --server <server> --user <user> --remote-dir <dir> --subjects <list>
# Step 8: Upload edited FreeSurfer outputs back to server
./toolbox/upload_freesurfer.sh --server <server> --user <user> --remote-dir <dir> --subjects <list>
# Step 9: fMRIPrep full workflows (anatomical + functional)
./07-run.sbatch
# Step 10: FSL GLM - Setup new statistical model
./10-fsl-glm/setup_glm.sh
# Step 11: FSL GLM - Run Level 1 analysis (individual runs)
./08-run.sbatch <model-name> [--no-feat]
# Step 12: FSL GLM - Run Level 2 analysis (subject-level)
./09-run.sbatch <model-name> [--no-feat]
# Step 13: FSL GLM - Run Level 3 analysis (group-level)
./10-run.sbatch <model-name> [--no-feat]
# Step 14: Tarball/Untar utility for sourcedata directories
./toolbox/tarball_sourcedata.sh [--tar-all|--tar-subjects|--untar-all|--untar-subjects] --sourcedata-dir <dir>

First, copy the configuration template:

cp config.template.yaml config.yaml

Then work through the following settings:

- Set `BASE_DIR` to your study's root directory
- Ensure `RAW_DIR` points to your BIDS-formatted data
- Verify the `TRIM_DIR` location for trimmed BIDS-compliant outputs that will later be used for fMRIPrep
- Set `WORKFLOW_LOG_DIR` for fMRIPrep workflow logs
- Set `TEMPLATEFLOW_HOST_HOME` for the TemplateFlow local cache
- Set `FMRIPREP_HOST_CACHE` for the fMRIPrep local cache
- Set `FREESURFER_LICENSE` to the location of your FreeSurfer license
- Update `task_id` to match your BIDS task name
- Set `new_task_id` if task renaming is needed
- Modify `run_numbers` to match your scan sequence / number of task runs
- Adjust `n_dummy` based on your scanning protocol (see the sketch just below)
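Dummy scan removal itself is handled by the prep step (Step 3), but for intuition, trimming the first `n_dummy` volumes of a run amounts to something like this hedged FSL one-liner (the filename is only an example):

```bash
# Drop the first 5 volumes (tmin=5) and keep everything after (tsize=-1)
fslroi sub-101_task-cleanname_run-01_bold.nii.gz \
       sub-101_task-cleanname_run-01_bold_trimmed.nii.gz 5 -1
```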
- Set `EXPECTED_FMAP_VOLS` to match your fieldmap acquisition
- Set `EXPECTED_BOLD_VOLS` to match your BOLD acquisition (a quick manual check is sketched below)
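If you are unsure what those counts should be, a quick manual check on one file is easy (assumes FSL is on your PATH; the filename is an example):

```bash
fslnvols sub-101_task-cleanname_run-01_bold.nii.gz   # prints the number of volumes
```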
- Update `fmap_mapping` to reflect your fieldmap/BOLD correspondence (see the illustration below)
- Ensure each BOLD run has a corresponding fieldmap entry
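For context, fMRIPrep associates fieldmaps with BOLD runs via the `IntendedFor` field in each fieldmap's JSON sidecar, which is presumably what this mapping feeds during Step 3. A purely hypothetical illustration of writing that field (the file names and the use of `jq` are assumptions, not the template's actual mechanism):

```bash
sub="sub-101"
bold="func/${sub}_task-cleanname_run-01_bold.nii.gz"
fmap_json="${sub}/fmap/${sub}_run-01_epi.json"
# Append the BOLD run to the fieldmap's IntendedFor list (deduplicated)
jq --arg bold "$bold" '.IntendedFor = ((.IntendedFor // []) + [$bold] | unique)' \
   "$fmap_json" > tmp.json && mv tmp.json "$fmap_json"
```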
- Copy `all-subjects.template.txt` to `all-subjects.txt` and list all subject IDs (just the numbers, not the "sub-" part)
- Adjust `DIR_PERMISSIONS` and `FILE_PERMISSIONS` based on your system requirements
- Enable `DEBUG` mode (for testing)
# ============================================================================
# (1) SETUP DIRECTORIES
# ============================================================================
directories:
base_dir: '/my/project/dir'
scripts_dir: '${BASE_DIR}/scripts'
raw_dir: '${BASE_DIR}/bids'
trim_dir: '${BASE_DIR}/bids_trimmed'
workflow_log_dir: '${BASE_DIR}/logs/workflows'
templateflow_host_home: '${HOME}/.cache/templateflow'
fmriprep_host_cache: '${HOME}/.cache/fmriprep'
freesurfer_license: '${HOME}/freesurfer.txt'
# ============================================================================
# (2) USER CONFIGURATION
# ============================================================================
user:
email: 'hello@stanford.edu'
username: 'johndoe'
fw_group_id: 'pi'
fw_project_id: 'amass'
# ============================================================================
# (3) TASK/SCAN PARAMETERS
# ============================================================================
scan:
task_id: 'SomeTaskName'
new_task_id: 'cleanname'
n_dummy: 5
run_numbers:
- '01'
- '02'
- '03'
- '04'
- '05'
- '06'
- '07'
- '08'
# ============================================================================
# (4) DATA VALIDATION VALUES FOR UNIT TESTS
# ============================================================================
validation:
expected_fmap_vols: 12
expected_bold_vols: 220
expected_bold_vols_after_trimming: 210
# ============================================================================
# (5) FIELDMAP <-> TASK BOLD MAPPING
# ============================================================================
# Each key represents a BOLD run number, and its value is the fieldmap number
# Example: here, each fmap covers two runs
fmap_mapping:
'01': '01' # TASK BOLD RUN 01 USES FMAP 01
'02': '01' # TASK BOLD RUN 02 USES FMAP 01
'03': '02' # TASK BOLD RUN 03 USES FMAP 02
'04': '02' # TASK BOLD RUN 04 USES FMAP 02
'05': '03'
'06': '03'
'07': '04'
'08': '04'
# ============================================================================
# (6) SUBJECT IDS <-> PER PREPROC STEP MAPPING (OPTIONAL)
# ============================================================================
# By default, subjects will be pulled from the master 'all-subjects.txt' file
# However, if you want to specify different subject lists per pipeline step,
# you may do so here by uncommenting and configuring the mapping below:
#
# subjects_mapping:
# '01-fw2server': '01-subjects.txt'
# '02-raw2bids': '02-subjects.txt'
#
# Note: keep in mind that we've built in checks at the beginning of each pipeline
# step that skip a subject if there's already a record of them being preprocessed;
# thus, you shouldn't necessarily need separate 0x-subjects.txt files per step
# unless this extra layer of control is useful for your needs.

Subject list files now support suffix modifiers for granular per-subject control. This allows you to maintain a single subject list while specifying different behavior for each subject.
Syntax: subject_id:modifier1:modifier2:...
Supported Modifiers:
- `step1`, `step2`, `step3`, `step4`, `step5`, `step6` - Only run the specified step(s) for this subject
- `force` - Force a rerun even if the subject was already processed
- `skip` - Skip this subject entirely
Examples:
101 # Standard subject ID, runs all steps normally
102:step4 # Only run step 4 (prep-fmriprep) for this subject
103:step4:step5 # Only run steps 4 and 5 for this subject
104:force # Force rerun all steps for this subject
105:step5:force # Only run step 5, force rerun
106:skip # Skip this subject entirely
Example Subject List File (e.g., 04-subjects.txt):
101
102:step4
103:step4:force
104
105:skip
This feature allows the template to maintain a single subject list file while providing extensible, fine-grained control over how the pipeline handles different subjects.
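To make the modifier behavior concrete, here is a hedged bash sketch of how a step script might interpret these entries. The variable names, `--force` flag, and precedence rules are illustrative assumptions, not the template's actual implementation:

```bash
#!/usr/bin/env bash
STEP="step4"   # hypothetical: the step this script implements
while IFS= read -r entry; do
  [[ -z "$entry" || "$entry" == \#* ]] && continue   # skip blanks and comments
  subject="${entry%%:*}"                             # text before the first ':'
  mods=":${entry#"$subject"}:"                       # "101:step4:force" -> "::step4:force:"
  [[ "$mods" == *:skip:* ]] && { echo "skipping sub-${subject}"; continue; }
  # if any stepN modifier is present, only process entries naming this step
  if [[ "$mods" == *:step[0-9]* && "$mods" != *:${STEP}:* ]]; then
    continue
  fi
  force=""
  [[ "$mods" == *:force:* ]] && force="--force"
  echo "would process sub-${subject} ${force}"
done < all-subjects.txt
```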
# ============================================================================
# (7) DEFAULT PERMISSIONS
# ============================================================================
permissions:
dir_permissions: '775'
file_permissions: '775'
# ============================================================================
# (8) SLURM JOB HEADER CONFIGURATOR (FOR GENERAL TASKS)
# ============================================================================
slurm:
email: '${USER_EMAIL}'
time: '2:00:00'
dcmniix_time: '6:00:00'
mem: '8G'
cpus: '8'
array_throttle: '10'
log_dir: '${BASE_DIR}/logs/slurm'
partition: 'hns,normal'
# ============================================================================
# (9) PIPELINE SETTINGS
# ============================================================================
pipeline:
fmriprep_version: '24.0.1'
derivs_dir: '${TRIM_DIR}/derivatives/fmriprep-${FMRIPREP_VERSION}'
singularity_image_dir: '${BASE_DIR}/singularity_images'
singularity_image: 'fmriprep-${FMRIPREP_VERSION}.simg'
heudiconv_image: 'heudiconv_latest.sif'
# ============================================================================
# (10) FMRIPREP SPECIFIC SLURM SETTINGS
# ============================================================================
fmriprep_slurm:
job_name: 'fmriprep${FMRIPREP_VERSION//.}_${new_task_id}'
array_size: '1'
time: '48:00:00'
cpus_per_task: '16'
mem_per_cpu: '4G'
# ============================================================================
# (11) FMRIPREP SETTINGS
# ============================================================================
fmriprep:
omp_threads: 8
nthreads: 12
mem_mb: 30000
fd_spike_threshold: 0.9
dvars_spike_threshold: 3.0
output_spaces: 'MNI152NLin2009cAsym:res-2 anat fsnative fsaverage5'
# ============================================================================
# (12) MISC SETTINGS
# ============================================================================
misc:
debug: 0

Tip
- Verify all paths exist and are accessible
- Confirm volume counts match your acquisition protocol
- Test the configuration on a single subject
- Review logs for any configuration warnings
Caution
Watch out for these common configuration errors:
- Incorrect path specifications
- Mismatched volume counts
- Incorrect fieldmap mappings
- Permission issues
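As a final sanity check, it can help to see how the fMRIPrep settings in section (11) typically surface as command-line flags. The following is only a sketch of a plausible Singularity invocation; the bind paths, participant label, and exact flag set used by the actual sbatch scripts are assumptions:

```bash
singularity run --cleanenv \
  -B "${TRIM_DIR}:/data" -B "${DERIVS_DIR}:/out" \
  "${SINGULARITY_IMAGE_DIR}/fmriprep-24.0.1.simg" \
  /data /out participant \
  --participant-label 101 \
  --omp-nthreads 8 --nthreads 12 --mem-mb 30000 \
  --fd-spike-threshold 0.9 --dvars-spike-threshold 3.0 \
  --output-spaces MNI152NLin2009cAsym:res-2 anat fsnative fsaverage5 \
  --fs-license-file "${FREESURFER_LICENSE}"
```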
The toolbox/ directory contains helpful utilities for managing your fMRI data:
The tarball_sourcedata.sh script helps optimize inode usage on supercompute environments by archiving subject sourcedata directories into tar files.
Features:
- Tarball individual or all subject directories
- Extract tarballs back to sourcedata directories
- Support for comma-separated subject lists or subject list files
- Optional separate output directory for tar archives
- Automatic cleanup of original directories (with option to keep)
- Progress indicators and error handling
Usage Examples:
# Tarball all subjects in sourcedata directory
./toolbox/tarball_sourcedata.sh --tar-all --sourcedata-dir /path/to/sourcedata
# Tarball specific subjects (removes original directories by default)
./toolbox/tarball_sourcedata.sh --tar-subjects "001,002,003" --sourcedata-dir /path/to/sourcedata
# Tarball subjects from a file
./toolbox/tarball_sourcedata.sh --tar-subjects all-subjects.txt --sourcedata-dir /path/to/sourcedata
# Tarball but keep original directories
./toolbox/tarball_sourcedata.sh --tar-all --sourcedata-dir /path/to/sourcedata --keep-original
# Store tar files in a separate directory
./toolbox/tarball_sourcedata.sh --tar-all --sourcedata-dir /path/to/sourcedata --output-dir /path/to/tarballs
# Extract all tar files
./toolbox/tarball_sourcedata.sh --untar-all --sourcedata-dir /path/to/sourcedata
# Extract specific subjects
./toolbox/tarball_sourcedata.sh --untar-subjects "001,002" --sourcedata-dir /path/to/sourcedata
# Get help
./toolbox/tarball_sourcedata.sh --help

Why use this utility?
- Reduces inode usage significantly on shared supercompute environments
- Each subject's sourcedata directory may contain thousands of DICOM files
- Archiving into a single tar file per subject drastically reduces inode consumption (e.g., a directory tree with 5000 files using 5000+ inodes becomes a single tar file using 1 inode)
- Easy to extract subjects back when needed for reprocessing or analysis
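Conceptually, the per-subject archiving boils down to something like the following (illustrative only; the script adds safety checks, options, and progress output, and your subject directory names may differ):

```bash
cd /path/to/sourcedata
find 101 -type f | wc -l            # roughly how many inodes this subject uses now
tar -cf 101.tar 101 && rm -rf 101   # archive the subject, then reclaim the inodes
tar -xf 101.tar                     # later: extract it back for reprocessing
```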
The download_freesurfer.sh and upload_freesurfer.sh scripts enable a complete workflow for manually editing FreeSurfer surface reconstructions.
Features:
- Download FreeSurfer outputs from remote server via rsync
- Upload edited surfaces back to server with automatic backups
- Interactive and non-interactive modes
- Support for individual subjects or batch downloads/uploads
- Multiple safety confirmations before destructive operations
- Automatic timestamped backups of original surfaces
Usage Examples:
# Download FreeSurfer outputs interactively
./toolbox/download_freesurfer.sh
# Download specific subjects non-interactively
./toolbox/download_freesurfer.sh \
--server login.sherlock.stanford.edu \
--user mysunetid \
--remote-dir /oak/stanford/groups/mylab/projects/mystudy \
--subjects sub-001,sub-002
# Upload edited outputs with automatic backup
./toolbox/upload_freesurfer.sh
# Upload specific subjects non-interactively
./toolbox/upload_freesurfer.sh \
--server login.sherlock.stanford.edu \
--user mysunetid \
--remote-dir /oak/stanford/groups/mylab/projects/mystudy \
--subjects sub-001,sub-002

Complete Workflow:
- Run fMRIPrep anatomical workflows only (Step 6): `./06-run.sbatch`
- Download FreeSurfer outputs: `./toolbox/download_freesurfer.sh`
- Edit surfaces locally using Freeview or other tools
- Upload edited surfaces: `./toolbox/upload_freesurfer.sh`
- Run full fMRIPrep workflows (Step 7): `./07-run.sbatch`
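Both helper scripts move data with rsync (see the feature list above). Roughly, a download amounts to something like the command below; the remote FreeSurfer derivatives path is an assumption about your server layout, so check where your fMRIPrep/FreeSurfer outputs actually live:

```bash
rsync -avz \
  mysunetid@login.sherlock.stanford.edu:/oak/stanford/groups/mylab/projects/mystudy/derivatives/freesurfer/sub-001/ \
  ./freesurfer_edits/sub-001/
```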
See toolbox/FREESURFER_EDITING.md for complete documentation including:
- When to perform manual edits
- Freeview editing instructions
- Common editing tasks (brainmask, white matter, surfaces)
- Troubleshooting guide
- Best practices
Other utilities included in `toolbox/`:
- `verify_nii_metadata.py` - Quality control for converted NIfTI metadata
- `dir_checksum_compare.py` - Compare directories using checksums
- `pull_fmriprep_reports.sh` - Download fMRIPrep HTML reports from server
- `summarize_bold_scan_volume_counts.sh` - Validate scan volumes match expected counts
Note
Please use the issues tab (https://github.com/shawntz/fmriprep-workbench/issues) to report any bugs, comments, suggestions, or other feedback; all are welcome and appreciated, thanks!
-Shawn
See our Contributing Guidelines for how to get involved.


