Code for tracking subsolid lung nodules across serial CT examinations and predicting their growth using 3D temporal models.
| Step | Script | Purpose |
|---|---|---|
| 1 | preprocessing/nifti.py | Convert source images and nodule masks to NIfTI |
| 2 | preprocessing/lung_masks.py | Generate binary lung-field masks |
| 3 | preprocessing/match.py | Deformable registration + cross-timepoint nodule matching |
| 4 | preprocessing/growth.py | Compute VDT and growth labels for all nodule pairs |
| 5 | preprocessing/precompute.py | Extract 64x64x64 crops for model input |
| 6 | models/discriminative.py | Train 3D growth classifier |
| 6 | models/generative.py | Train 3D future mask predictor |

    conda env create -f environment.yml
    conda activate nodule-growth

Edit paths.yaml to set data_root and verify that all entries point to valid locations on your system. All scripts read paths exclusively from this file.
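
Before running the pipeline, it can help to check the configured paths programmatically. The sketch below is illustrative only: it assumes paths.yaml has already been parsed into a flat dict of name to path (for example with PyYAML's `yaml.safe_load`), and the helper name `find_missing_paths` is hypothetical, not part of this repository.

```python
from pathlib import Path


def find_missing_paths(cfg: dict) -> list:
    """Return the keys of `cfg` whose values do not exist on disk.

    `cfg` is assumed to be a flat mapping of name -> path string, such as
    the parsed contents of paths.yaml (the flat layout is an assumption).
    """
    missing = []
    for key, value in cfg.items():
        if not Path(str(value)).expanduser().exists():
            missing.append(key)
    return missing


if __name__ == "__main__":
    # Hypothetical entries, for illustration only.
    example = {"data_root": ".", "scan_metadata_csv": "does/not/exist.csv"}
    print("Missing entries:", find_missing_paths(example))
```

An empty list means every entry resolved to an existing file or directory.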
The pipeline is agnostic to the upstream detection/segmentation algorithm. Any system that produces the following can be used as input:
Scan Metadata (scan_metadata_csv in paths.yaml)
A CSV with one row per CT scan. Required columns:
| Column | Description |
|---|---|
| PatientID | Unique patient identifier |
| ScanNumber | Scan index within patient |
| StudyDate | Acquisition date |
| Series | Series identifier |
| SeriesInstanceUID | DICOM SeriesInstanceUID |
| imagePath | Path to source CT image (DICOM or MHD) |
| maskPath | Path to nodule segmentation mask |
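
A quick header check can catch a malformed scan metadata CSV before preprocessing starts. This is a minimal sketch using only the standard library; the function name `missing_columns` is hypothetical, and the same check could be applied to the other input CSVs by passing their required columns.

```python
import csv
import io

# Required columns for the scan metadata CSV, as listed in the table above.
SCAN_METADATA_COLUMNS = [
    "PatientID", "ScanNumber", "StudyDate", "Series",
    "SeriesInstanceUID", "imagePath", "maskPath",
]


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> every column is missing
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


if __name__ == "__main__":
    # In-memory stand-in for a real file opened with open(path, newline="").
    sample = io.StringIO("PatientID,ScanNumber,StudyDate\n")
    print("Missing:", missing_columns(sample, SCAN_METADATA_COLUMNS))
```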
Nodule Annotations (nodule_annotations_csv)
Per-nodule measurements. Required columns:
| Column | Description |
|---|---|
| FindingUID | Unique nodule identifier |
| PatientID | Patient identifier |
| SeriesInstanceUID | Which scan this measurement belongs to |
| StudyDate | Date of measurement |
| Long | Long-axis diameter (mm) |
| Short | Short-axis diameter (mm) |
Nodule Segmentations (nodule_segmentations_csv)
Per-nodule segmentation metadata. Required columns:
| Column | Description |
|---|---|
| PatientID | Patient identifier |
| SeriesInstanceUID | Scan identifier |
| mask_label | Integer label of this nodule within the mask volume |
| FindingUID | Links to annotation table |
| NoduleTypeStr | Nodule type (e.g., "ground glass", "Partial solid") |
| segmentation | JSON containing sitk_volume_mm3 |
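
To illustrate how the segmentation column might be consumed, the sketch below reads per-nodule volumes keyed by nodule and scan. It assumes sitk_volume_mm3 is a top-level key of the JSON object in each row; the actual JSON layout may differ, and `volumes_by_finding` is a hypothetical helper.

```python
import csv
import io
import json


def volumes_by_finding(csv_file):
    """Map (FindingUID, SeriesInstanceUID) -> nodule volume in mm^3.

    Assumes each row's `segmentation` cell is a JSON object with a
    top-level "sitk_volume_mm3" key (an assumption about the layout).
    """
    volumes = {}
    for row in csv.DictReader(csv_file):
        payload = json.loads(row["segmentation"])
        key = (row["FindingUID"], row["SeriesInstanceUID"])
        volumes[key] = float(payload["sitk_volume_mm3"])
    return volumes


if __name__ == "__main__":
    # Build a one-row example in memory; all values are made up.
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["PatientID", "SeriesInstanceUID", "mask_label",
                     "FindingUID", "NoduleTypeStr", "segmentation"])
    writer.writerow(["P1", "S1", "1", "F1", "ground glass",
                     json.dumps({"sitk_volume_mm3": 523.6})])
    buf.seek(0)
    print(volumes_by_finding(buf))
```

Keying on both identifiers keeps one volume per nodule per scan, which is the shape needed to compare the same FindingUID across timepoints.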
Registration Patient List (registration_pids_csv)
A single-column CSV (PatientID) listing which patients to process.
Patient Splits (train/val/test)
Three single-column CSVs (PatientID) defining patient-level data splits. Used by the models to prevent data leakage.
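
Because leakage is prevented only if the three patient lists are disjoint, a small overlap check is worth running once after the splits are created. The helper below is a hypothetical sketch that takes the three lists of PatientID values already read from the split CSVs.

```python
def split_overlaps(train, val, test):
    """Return the patient IDs shared between each pair of splits."""
    train, val, test = set(train), set(val), set(test)
    return {
        "train_val": train & val,
        "train_test": train & test,
        "val_test": val & test,
    }


if __name__ == "__main__":
    # Made-up IDs; "P3" is deliberately placed in two splits.
    overlaps = split_overlaps(["P1", "P2", "P3"], ["P3", "P4"], ["P5"])
    leaked = {name: ids for name, ids in overlaps.items() if ids}
    print("Overlapping patients:", leaked)
```

All three sets should be empty for a leakage-free patient-level split.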
3D ResNet predicting whether a nodule is growing (>= 1.5 mm diameter increase on any axis). Takes a 2-channel 64^3 input (CT + mask) conditioned on elapsed time. Trained with class-weight balancing and early stopping on validation AUC.
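
The exact class-weighting scheme is not specified here. One common choice, shown as a sketch, is inverse-frequency weighting (the same formula scikit-learn uses for `class_weight="balanced"`): each class gets weight n_samples / (n_classes * class_count). Whether the training script uses this exact formula is an assumption.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency class weights: n_samples / (n_classes * count).

    Rare classes receive larger weights so they contribute as much to
    the loss as frequent ones. Assumed scheme, for illustration only.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


if __name__ == "__main__":
    # Made-up labels: 1 = growing, 0 = not growing.
    print(balanced_class_weights([0, 0, 0, 1]))
```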
3D U-Net predicting the nodule segmentation mask at a future timepoint. Takes a 3-channel 64^3 input (CT + mask + time map) with learned time embedding at the bottleneck. Trained with a combined Dice + BCE + volume loss, early stopping on validation Dice.
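
To make the combined objective concrete, the sketch below computes a Dice + BCE + volume loss over flattened voxel lists in plain Python. The equal default weights and the form of the volume term (relative error in summed foreground probability) are assumptions; the training script may define them differently and would operate on tensors rather than lists.

```python
import math


def combined_loss(probs, target, w_dice=1.0, w_bce=1.0, w_vol=1.0, eps=1e-7):
    """Dice + BCE + volume loss over flattened voxel lists.

    probs  : predicted foreground probabilities in [0, 1]
    target : ground-truth labels (0 or 1)
    Weights and the volume term are assumptions, for illustration only.
    """
    inter = sum(p * t for p, t in zip(probs, target))
    p_sum, t_sum = sum(probs), sum(target)

    # Soft Dice loss: 1 - 2|P.T| / (|P| + |T|)
    dice = 1.0 - (2.0 * inter + eps) / (p_sum + t_sum + eps)

    # Mean binary cross-entropy, with probabilities clamped away from 0 and 1
    bce = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)
        bce -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    bce /= len(probs)

    # Relative error between predicted and true (soft) volume
    vol = abs(p_sum - t_sum) / (t_sum + eps)

    return w_dice * dice + w_bce * bce + w_vol * vol


if __name__ == "__main__":
    target = [1, 0, 1, 0]
    print("perfect :", combined_loss([1.0, 0.0, 1.0, 0.0], target))
    print("inverted:", combined_loss([0.0, 1.0, 0.0, 1.0], target))
```

The Dice term rewards overlap, the BCE term penalizes per-voxel errors, and the volume term discourages predictions whose total size drifts from the target even when overlap is reasonable.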
| Criterion | Growing | Stable | Shrinking |
|---|---|---|---|
| Volume change | > 25% increase | +/- 25% | > 25% decrease |
| Diameter change | >= 1.5 mm increase (any axis) | < 1.5 mm | >= 1.5 mm decrease |
| VDT | < 400 days | >= 400 days | — |
Biologically implausible VDT values (< 20 or > 5000 days) are excluded.
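
The thresholds above can be sketched in code. VDT is computed with the standard exponential-growth formula, VDT = dt * ln(2) / ln(V2 / V1). The function names are hypothetical, and returning no VDT for non-growing nodules is an assumption based on the empty "Shrinking" cell in the table; how preprocessing/growth.py combines the individual criteria is not shown here.

```python
import math


def volume_doubling_time(v1_mm3, v2_mm3, days):
    """VDT in days, or None when the nodule did not grow.

    Assumption: no VDT is reported for stable-or-shrinking volumes.
    """
    if v1_mm3 <= 0 or days <= 0 or v2_mm3 <= v1_mm3:
        return None
    return days * math.log(2) / math.log(v2_mm3 / v1_mm3)


def vdt_label(vdt_days):
    """Apply the VDT thresholds; implausible values are excluded."""
    if vdt_days is None:
        return None
    if vdt_days < 20 or vdt_days > 5000:
        return "excluded"
    return "growing" if vdt_days < 400 else "stable"


def volume_label(v1_mm3, v2_mm3):
    """Apply the +/- 25% volume-change thresholds."""
    change = (v2_mm3 - v1_mm3) / v1_mm3
    if change > 0.25:
        return "growing"
    if change < -0.25:
        return "shrinking"
    return "stable"


if __name__ == "__main__":
    # Made-up volumes: a nodule that doubles over one year.
    vdt = volume_doubling_time(100.0, 200.0, 365)
    print(vdt, vdt_label(vdt), volume_label(100.0, 200.0))
```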
MIT License.