Workshop materials for the Methods for Multi-Omics Data Analysis Short Course at The Jackson Laboratory.
This course teaches a transferable analytical strategy for integrating public multi-omics data to generate mechanistic hypotheses. The anchor example is leukemia inhibitory factor (LIF) and its relationship to cachexia — but the approach applies to any disease-focused multi-omics question.
- Cross-study integration demo — a complete pipeline from SRA download through RNA-seq GSEA and metabolomics enrichment, using the KPC pancreatic cancer mouse model (PRJNA773714 RNA-seq + ST003927 metabolomics)
- GTEx live-coded session — hands-on Spearman correlation, GSEA, and biological interpretation using pre-processed GTEx expression data
- Interpretation guides — GSEA, GO enrichment, Spearman correlation, and validation methodology
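
The Spearman step named in the GTEx session can be illustrated with a minimal sketch. This is not the workshop's own code (the course material is written in R); it is a plain-Python illustration using made-up expression values and hypothetical gene names (`gene_a`, `gene_b`). It computes Spearman's rho as the Pearson correlation of ranks and sorts genes by their correlation with LIF.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy expression values across five samples (invented for illustration only).
lif = [1.2, 2.5, 0.8, 3.1, 2.0]
toy_genes = {
    "gene_a": [10.0, 21.0, 7.5, 30.0, 15.0],  # rises with LIF
    "gene_b": [9.0, 4.0, 11.0, 2.0, 6.0],     # falls as LIF rises
}

# Rank genes from most positively to most negatively correlated with LIF.
ranked = sorted(
    ((name, spearman_rho(lif, values)) for name, values in toy_genes.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, rho in ranked:
    print(f"{name}\t{rho:+.2f}")
```

A correlation-ordered gene list of this kind is the sort of input that pre-ranked enrichment tools such as fgsea accept; whether the session ranks genes exactly this way is an assumption here.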
```
vignettes/            # Quarto source files (renders to docs/ for GitHub Pages)
  introduction/       # Workshop overview and learning path
  setup/              # Environment setup guide
  gtex/               # GTEx expression access, QC, and LIF analysis
  kpc-cross-study/    # KPC RNA-seq + metabolomics integration pipeline
  integration/        # Cross-dataset NES concordance and validation
  guides/             # GSEA, GO enrichment, Spearman, validation guides
  reference/          # Helper function reference
scripts/
  gtex/               # GTEx metadata fetch, expression QC, GO enrichment
  sra/                # SRA prefetch, tximport aggregation, Slurm submission
  install/            # Package installation (gtexr, MetaboAnalystR)
  dev/                # Helper reference generation
config/
  gtex/               # Gene list template and tissue preferences
  sra/                # SRA accession lists and run tables
  analysis/           # GO keyword filters for pathway analysis
workshopr/            # R helper package shared across scripts and vignettes
```
```bash
# R packages (renv, first time only, ~20 min)
Rscript scripts/install/setup-renv.R
# Python packages
pip install -r requirements.txt
# Preview documentation
quarto preview vignettes/
```

On subsequent clones or after renv.lock changes:

```r
renv::restore()
# Then re-run the two patch scripts (qs + MetaboAnalystR cannot auto-restore):
source("scripts/install/install-qs-shim.R")
source("scripts/install/install-metaboanalystr.R")
```

Preview the documentation site locally:

```bash
quarto preview vignettes/
```

- Graduate students and postdocs new to multi-omics data analysis
- Computational biologists wanting practical GSEA + metabolomics integration skills
- Wet-lab researchers interested in generating mechanistic hypotheses from public data
Prerequisites: intermediate R (data frames, functions, pipes); basic gene expression concepts (counts, TPM, log transformation).
- GTEx v10 — Genotype-Tissue Expression project, healthy human transcriptome reference
- BioProject PRJNA773714 — KPC mouse model RNA-seq (Dasgupta et al., J Exp Med 2025)
- ST003927 — Metabolomics Workbench LC-MS/MS data, same experimental design
- GSE133523 — Human skeletal muscle cachexia (GEO, pancreatic cancer vs. controls)
- fgsea / msigdbr / MetaboAnalystR — open-source enrichment analysis tools
All files under data/ are gitignored — populate them by running the vignettes in order or the fetch scripts in scripts/.
```
data/
  gtex/
    expression-raw/         # .gct files downloaded from the GTEx portal
    metadata/               # Sample-level metadata (fetched via gtexr)
    expression-qc/          # QC tables and filtered expression matrices
    correlation/            # Spearman ρ outputs (LIF co-expression)
    enrichment/             # GO:BP GSEA results per tissue
    stats/                  # Summary statistics
    intermediary/           # Intermediate cached objects (.rds)
  geo/
    GSE133523/              # Human skeletal muscle cachexia raw data (GEO)
  metabolomics-workbench/
    mwtab_txt/              # Raw mwtab-format files (ST003927)
    intermediary/           # Parsed and normalised metabolomics objects
    plots/                  # QC and MSEA figures
  sra/
    PRJNA773714/            # KPC mouse RNA-seq (SRA prefetch output)
    salmon_quant/           # Per-sample Salmon quantification directories
  integrated/
    stats/                  # Cross-dataset NES concordance tables (.rds)
    intermediary/           # Merged multi-omics objects
    plots/                  # Integration figures
  reference/
    salmon_index_grcm39/    # Salmon index for GRCm39 (built once, reused)
  metaboanalystr/           # MetaboAnalystR session temp files (auto-generated)
```
To populate from scratch:
| Data | How to fetch |
|---|---|
| GTEx metadata | scripts/gtex/gtex-metadata-fetch.R |
| GTEx expression .gct | Download from gtexportal.org → data/gtex/expression-raw/ |
| PRJNA773714 RNA-seq | scripts/sra/sra-prefetch.R → scripts/sra/slurm-sra-prefetch.sh (HPC) → scripts/sra/sra-tximport.R |
| ST003927 metabolomics | Fetched automatically by the metabolomics QC vignette |
| GSE133523 | Fetched automatically by the GSE133523 GSEA vignette |
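
Before running the later vignettes, it can help to see which inputs from the table are still missing. The following is a minimal sketch, not part of the repository: `missing_inputs` and `EXPECTED_INPUTS` are hypothetical names, and the relative paths are assumptions read off the data/ layout described above, so adjust them if your checkout nests these directories differently.

```python
from pathlib import Path

# Assumed locations under data/, inferred from the layout described above.
EXPECTED_INPUTS = {
    "GTEx metadata": "gtex/metadata",
    "GTEx expression .gct": "gtex/expression-raw",
    "PRJNA773714 RNA-seq": "sra/PRJNA773714",
    "ST003927 metabolomics": "metabolomics-workbench/mwtab_txt",
    "GSE133523": "geo/GSE133523",
}


def missing_inputs(data_root="data"):
    """Return the dataset labels whose directory is absent or empty."""
    root = Path(data_root)
    missing = []
    for label, relative in EXPECTED_INPUTS.items():
        folder = root / relative
        # A directory counts as populated once it holds at least one entry.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(label)
    return missing


if __name__ == "__main__":
    for label in missing_inputs():
        print(f"not yet populated: {label}")
```

Run from the repository root, this prints one line per dataset that still needs fetching. It only checks that each directory is non-empty, not that the files inside are complete.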
The Jackson Laboratory, as part of the NIH Common Fund Data Ecosystem (CFDE).