Official code release for IDSplat, built on top of Nerfstudio.
IDSplat decomposes autonomous driving scenes into a static background and per-instance dynamic Gaussians, enabling high-quality novel view synthesis for camera and lidar without requiring annotations.
We recommend using uv for dependency management.
The training scripts use uv run, which automatically creates and syncs the virtual environment on first use, so no separate install step is required.
For non-50-series GPUs, edit pyproject.toml to switch to the CUDA 12.4 PyTorch index (swap the commented/uncommented [tool.uv] blocks) before syncing.
The gsplat dependency is installed from a custom fork that adds rolling-shutter camera and lidar rendering support.
IDSplat supports the Waymo Open Dataset (WOD) v2 and PandaSet.
Data paths are always passed explicitly via --data and --masks_dir in the training scripts, so datasets can live anywhere on the filesystem.
Download the Waymo Open Dataset v2 (parquet format) from the official website.
The expected directory structure is:
<wod_root>/
    training/
        camera_box/
        camera_image/
        lidar/
        lidar_box/
        ...
    validation/
        ...
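
As a quick sanity check before training, the layout above can be verified with a few lines of Python. This is a minimal sketch: the function name `missing_wod_components` is hypothetical and not part of the IDSplat code, and it only checks the four component folders named above, not everything the data loader may read.

```python
from pathlib import Path

# Component folders named in the layout above; the real dataset contains more.
EXPECTED_COMPONENTS = ("camera_box", "camera_image", "lidar", "lidar_box")


def missing_wod_components(wod_root, split="training"):
    """Return the expected component folders absent under <wod_root>/<split>."""
    split_dir = Path(wod_root) / split
    return [name for name in EXPECTED_COMPONENTS if not (split_dir / name).is_dir()]
```

An empty list from `missing_wod_components("/path/to/wod")` means all four folders were found under `training/`; pass `split="validation"` to check the other split.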
On first run, images are extracted from the parquet files and cached to disk. The cache location is controlled by --output_folder (default: /data/wod/images).
Three evaluation splits are provided in scripts/arrays/:
| Split | File | Cameras | Parquet dir |
|---|---|---|---|
| Dynamic NOTR | wod_id_to_seq_notr_dynamic_50frames.txt | FRONT, FRONT_LEFT, FRONT_RIGHT | training |
| PVG | wod_id_to_seq_pvg.txt | FRONT, FRONT_LEFT, FRONT_RIGHT | training |
| StreetGS | wod_id_to_seq_adgs_split.txt | FRONT | validation |
Each sequence file has the format:
# scene_id, seg_name, start_timestep, end_timestep, scene_type
16,seg102319,0,50,dynamic
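
To make the row format concrete, here is a minimal parsing sketch. The names `SequenceEntry`, `parse_sequence_lines`, and `frame_range` are hypothetical and not taken from the IDSplat scripts. The lookup also assumes that a `--sequence` value such as 102319 corresponds to the `seg_name` `seg102319`; the example row suggests this, but the training scripts may resolve it differently.

```python
from typing import NamedTuple


class SequenceEntry(NamedTuple):
    scene_id: int
    seg_name: str
    start_timestep: int
    end_timestep: int
    scene_type: str


def parse_sequence_lines(lines):
    """Parse 'scene_id,seg_name,start,end,scene_type' rows, skipping comments and blanks."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        scene_id, seg_name, start, end, scene_type = [f.strip() for f in line.split(",")]
        entries.append(SequenceEntry(int(scene_id), seg_name, int(start), int(end), scene_type))
    return entries


def frame_range(entries, sequence):
    """Look up (start, end) for a sequence number, assuming seg_name is 'seg' + sequence."""
    for entry in entries:
        if entry.seg_name == f"seg{sequence}":
            return entry.start_timestep, entry.end_timestep
    raise KeyError(f"sequence {sequence} not found")
```

For a file on disk, `parse_sequence_lines(open(path))` works because iterating over a file object yields its lines.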
Download PandaSet from Hugging Face or the official website.
The expected directory structure is:
<pandaset_root>/
    001/
        camera/
            back_camera/
            front_camera/
            front_left_camera/
            ...
        lidar/
        meta/
        ...
    011/
        ...
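
To see which sequences and cameras a local copy actually contains, the tree above can be walked with a short script. This is a sketch only: `list_pandaset_cameras` is a hypothetical helper, not part of the repository, and it treats any top-level folder that contains a `camera/` subfolder as a sequence.

```python
from pathlib import Path


def list_pandaset_cameras(pandaset_root):
    """Map each sequence folder (e.g. '001') to its sorted camera folder names."""
    layout = {}
    for seq_dir in sorted(Path(pandaset_root).iterdir()):
        camera_dir = seq_dir / "camera"
        if not camera_dir.is_dir():
            continue  # not a sequence folder in the sense of the layout above
        layout[seq_dir.name] = sorted(p.name for p in camera_dir.iterdir() if p.is_dir())
    return layout
```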
The default config expects PandaSet at data/pandaset relative to the repo root.
Sequence IDs used for evaluation are listed in scripts/arrays/pandaset_id_to_seq.txt.
IDSplat uses per-frame tracked instance segmentation masks generated by Grounded-SAM-2.
Masks are organized per sequence and camera:
<masks_dir>/
    <sequence_id>/
        <camera_id>/
            json_data/
                mask_000_<camera_id>.json
                mask_001_<camera_id>.json
                ...
            mask_data/
                mask_000.npy
                mask_001.npy
                ...
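
The naming scheme above can be expressed as a small path helper, which may be useful when writing masks from a custom pipeline. The function `mask_paths` is hypothetical, and the three-digit zero padding is inferred from the `mask_000` and `mask_001` examples; how frame indices above 999 are named is not specified here.

```python
from pathlib import Path


def mask_paths(masks_dir, sequence_id, camera_id, frame_idx):
    """Return (json_path, npy_path) for one frame, following the layout above."""
    camera_dir = Path(masks_dir) / str(sequence_id) / str(camera_id)
    stem = f"mask_{frame_idx:03d}"  # assumed: zero-padded to three digits
    json_path = camera_dir / "json_data" / f"{stem}_{camera_id}.json"
    npy_path = camera_dir / "mask_data" / f"{stem}.npy"
    return json_path, npy_path
```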
Each .json file has the structure:
{
    "mask_name": "mask_000.npy",
    "labels": {
        "1": {"instance_id": 1, "class_name": "car"},
        "2": {"instance_id": 2, "class_name": "truck"}
    }
}

The corresponding .npy file is a 2D integer array of shape (H, W) where each pixel value is the instance ID (0 = background). Instance IDs must be consistent across frames (i.e., the same physical object must have the same ID in every frame it appears).
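
The per-frame requirements above (2D shape, integer values, 0 as background) can be checked with NumPy. The sketch below is not part of IDSplat, and `check_mask_pair` is a hypothetical name. It additionally assumes that every non-zero ID in the array should have an entry under `labels`, which the description suggests but does not state outright.

```python
import json
from pathlib import Path

import numpy as np


def check_mask_pair(json_path, mask_data_dir):
    """Load one frame's JSON and .npy mask; return a list of problems (empty if none)."""
    meta = json.loads(Path(json_path).read_text())
    mask = np.load(Path(mask_data_dir) / meta["mask_name"])
    problems = []
    if mask.ndim != 2:
        problems.append(f"expected a 2D (H, W) array, got shape {mask.shape}")
    if not np.issubdtype(mask.dtype, np.integer):
        problems.append(f"expected an integer dtype, got {mask.dtype}")
    labelled = {int(v["instance_id"]) for v in meta["labels"].values()}
    present = {int(i) for i in np.unique(mask)} - {0}  # 0 is background
    unlabelled = sorted(present - labelled)
    if unlabelled:  # assumption: every non-zero ID should be listed in "labels"
        problems.append(f"instance IDs in mask but not in labels: {unlabelled}")
    return problems
```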
The results in the paper were obtained using Grounded-SAM-2 with tracking to generate the tracked instance masks. Similar models (e.g. SAM-3) should work in principle but have not been tested.
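
Whichever model produces the masks, a cheap check on the JSON metadata can catch some tracking errors. The sketch below uses a hypothetical helper, `find_class_conflicts`, that flags instance IDs labelled with different class names in different frames of one `json_data/` folder. This is a necessary condition for consistent IDs, not a sufficient one: an ID that switches between two cars would go unnoticed.

```python
import json
from pathlib import Path


def find_class_conflicts(json_dir):
    """Return {instance_id: sorted class names} for IDs labelled with more than one class."""
    classes_by_id = {}
    for json_path in sorted(Path(json_dir).glob("mask_*.json")):
        labels = json.loads(json_path.read_text())["labels"]
        for entry in labels.values():
            classes_by_id.setdefault(entry["instance_id"], set()).add(entry["class_name"])
    return {i: sorted(c) for i, c in classes_by_id.items() if len(c) > 1}
```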
bash scripts/train_waymo_notr.sh \
    --sequence 102319 \
    --data /path/to/wod \
    --masks_dir /path/to/masks

The start/end frames are looked up automatically from scripts/arrays/wod_id_to_seq_notr_dynamic_50frames.txt. Pass --start_frame and --end_frame explicitly to override.
bash scripts/train_waymo_pvg.sh \
    --sequence 102353 \
    --data /path/to/wod \
    --masks_dir /path/to/masks

Frame ranges are looked up from scripts/arrays/wod_id_to_seq_pvg.txt.
bash scripts/train_waymo_streetgs.sh \
    --sequence 104481 \
    --data /path/to/wod \
    --masks_dir /path/to/masks

Frame ranges are looked up from scripts/arrays/wod_id_to_seq_adgs_split.txt.
bash scripts/train_pandaset.sh \
    --sequence 001 \
    --data /path/to/pandaset \
    --masks_dir /path/to/masks

Sequence IDs are listed in scripts/arrays/pandaset_id_to_seq.txt.
All training scripts accept additional arguments that are forwarded. For example, to use the browser viewer instead of W&B:
bash scripts/train_pandaset.sh \
    --sequence 001 --data /path/to/pandaset --masks_dir /path/to/masks \
    --vis viewer

To evaluate a trained model:

bash scripts/eval.sh --config outputs/<experiment>/idsplat/<timestamp>/config.yml

To open the interactive viewer for a trained model:
uv run ns-viewer --load-config outputs/<experiment>/idsplat/<timestamp>/config.yml

If you use IDSplat in your research, please cite:
@article{lindstrom2025idsplat,
  title={IDSplat: Instance-Decomposed 3D Gaussian Splatting for Driving Scenes},
  author={Lindstr{\"o}m, Carl and Rafidashti, Mahan and Fatemi, Maryam and Hammarstrand, Lars and Oswald, Martin R and Svensson, Lennart},
  journal={arXiv preprint arXiv:2511.19235},
  year={2025}
}