Commit 9f3ea68

DanielDauner authored and mh0797 committed
devkit-v1.0
1 parent 800cbd1 · commit 9f3ea68

63 files changed, +121004 −211 lines changed

.gitignore — 2 additions, 0 deletions

```diff
@@ -13,3 +13,5 @@
 # files
 *.log
 *.pkl
+*.jpg
+*.pcd
```

README.md — 13 additions, 4 deletions

```diff
@@ -25,8 +25,11 @@
 > - Large-scale publicly available test split for internal benchmarking
 > - Continually-maintained devkit
 
-🏁 **NAVSIM** will serve as a main track in the **`CVPR 2024 Autonomous Grand Challenge`**. The warm-up phase of the challenge has begun! For further details, please [check the challenge website](https://opendrivelab.com/challenge2024/)!
+🏁 **NAVSIM** will serve as a main track in the **`CVPR 2024 Autonomous Grand Challenge`**. The leaderboard for the challenge is open! For further details, please [check the challenge website](https://opendrivelab.com/challenge2024/)!
 
+<p align="center">
+  <img src="assets/navsim_cameras.gif" width="800">
+</p>
 
 ## Table of Contents
 1. [Highlights](#highlight)
@@ -41,15 +44,21 @@
 - [Download and installation](docs/install.md)
 - [Understanding and creating agents](docs/agents.md)
 - [Understanding the data format and classes](docs/cache.md)
+- [Dataset splits vs. filtered training / test splits](docs/splits.md)
 - [Understanding the PDM Score](docs/metrics.md)
 - [Submitting to the Leaderboard](docs/submission.md)
 
 <p align="right">(<a href="#top">back to top</a>)</p>
 
 
 ## Changelog <a name="changelog"></a>
-- **`[2024/04/05]`** **IMPORTANT NOTE**: Please re-download the `competition_test` split.
-  - There has been an issue with the `competition_test` split, so that the Ego-Status information was incorrect. Please download the updated files. For details see [installation](docs/install.md).
+- **`[2024/04/21]`** NAVSIM v1.0 release (official devkit version for [AGC 2024](https://opendrivelab.com/challenge2024/))
+  - **IMPORTANT NOTE**: The name of the data split `competition_test` was changed to `private_test_e2e`. Please adapt your directory name accordingly. For details see [installation](docs/install.md).
+  - Parallelization of metric caching / evaluation
+  - Adds [Transfuser](https://arxiv.org/abs/2205.15997) baseline (see [agents](docs/agents.md#Baselines))
+  - Adds standardized training and test filtered splits (see [splits](docs/splits.md))
+  - Visualization tools (see [tutorial_visualization.ipynb](tutorial/tutorial_visualization.ipynb))
+  - Refactoring
 - **`[2024/04/03]`** NAVSIM v0.4 release
   - Support for test phase frames of competition
   - Download script for trainval
@@ -61,7 +70,7 @@
   - Major refactoring of dataloading and configs
 - **`[2024/03/11]`** NAVSIM v0.2 release
   - Easier installation and download
-  - mini and test split integration
+  - mini and test data split integration
   - Privileged `Human` agent
 - **`[2024/02/20]`** NAVSIM v0.1 release (initial demo)
   - OpenScene-mini sensor blobs and annotation logs
```

assets/navsim_cameras.gif — binary file added, 3.42 MB

docs/agents.md — 35 additions, 7 deletions

```diff
@@ -39,28 +39,33 @@ In addition to the methods mentioned above, you have to implement the methods be
 Have a look at `navsim.agents.ego_status_mlp_agent.EgoStatusMLPAgent` for an example.
 
 - `get_feature_builders()`
-  Has to return a List of feature builders (of type `navsim.planning.training. abstract_feature_target_builder.AbstractFeatureBuilder`).
+  Has to return a List of feature builders (of type `navsim.planning.training.abstract_feature_target_builder.AbstractFeatureBuilder`).
   FeatureBuilders take the `AgentInput` object and compute the feature tensors used for agent training and inference. One feature builder can compute multiple feature tensors. They have to be returned in a dictionary, which is then provided to the model in the forward pass.
   Currently, we provide the following feature builders:
-  - EgoStateFeatureBuilder (returns a Tensor containing current velocity, acceleration and driving command)
-  - _the list will be increased in future devkit versions_
+  - [EgoStatusFeatureBuilder](https://github.com/autonomousvision/navsim/blob/main/navsim/agents/ego_status_mlp_agent.py#L18) (returns a Tensor containing current velocity, acceleration and driving command)
+  - [TransfuserFeatureBuilder](https://github.com/autonomousvision/navsim/blob/main/navsim/agents/transfuser/transfuser_features.py#L28) (returns a dictionary containing the current front image, LiDAR BEV map, and the ego status)
 
 - `get_target_builders()`
-  Similar to `get_feature_builders()`, returns the target builders of type `navsim.planning.training. abstract_feature_target_builder.AbstractTargetBuilder` used in training. In contrast to feature builders, they have access to the Scene object which contains ground-truth information (instead of just the AgentInput).
+  Similar to `get_feature_builders()`, returns the target builders of type `navsim.planning.training.abstract_feature_target_builder.AbstractTargetBuilder` used in training. In contrast to feature builders, they have access to the Scene object which contains ground-truth information (instead of just the AgentInput).
 
 - `forward()`
   The forward pass through the model. Features are provided as a dictionary which contains all the features generated by the feature builders. All tensors are already batched and on the same device as the model. The forward pass has to output a Dict of which one entry has to be "trajectory" and contain a tensor representing the future trajectory, i.e. of shape [B, T, 3], where B is the batch size, T is the number of future timesteps and 3 refers to x,y,heading.
 
-- `compute_loss`()`
+- `compute_loss()`
   Given the features, the targets and the model predictions, this function computes the loss used for training. The loss has to be returned as a single Tensor.
 
 - `get_optimizers()`
   Use this function to define the optimizers used for training.
   Depending on whether you want to use a learning-rate scheduler or not, this function needs to either return just an Optimizer (of type `torch.optim.Optimizer`) or a dictionary that contains the Optimizer (key: "optimizer") and the learning-rate scheduler of type `torch.optim.lr_scheduler.LRScheduler` (key: "lr_scheduler").
 
+- `get_training_callbacks()`
+  In this function, you can return a List of `pl.Callback` to monitor or visualize the training process of the learned model. We implemented a callback for TransFuser in `navsim.agents.transfuser.transfuser_callback.TransfuserCallback`, which can serve as a starting point.
+
 - `compute_trajectory()`
   In contrast to the non-learning-based Agent, you don't have to implement this function.
   In inference, the trajectory will automatically be computed using the feature builders and the forward method.
+
+
 ## Inputs
 
 `get_sensor_config()` can be overwritten to determine which sensors are accessible to the agent.
```
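As a rough illustration of this interface contract, here is a minimal, framework-free sketch. It uses plain numpy arrays as stand-ins for torch tensors, and the class and method names follow the descriptions above; it is not the actual `navsim` API:

```python
import numpy as np


class TinyTrajectoryModel:
    """Toy stand-in for a learned NAVSIM-style agent (illustrative only).

    Real agents subclass navsim's AbstractAgent and return torch tensors;
    numpy arrays are used here to show the same input/output contract.
    """

    def __init__(self, num_poses: int = 8, hidden: int = 16, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.num_poses = num_poses
        # ego status feature: e.g. velocity (2) + acceleration (2) + command (4)
        self.w1 = rng.normal(size=(8, hidden))
        self.w2 = rng.normal(size=(hidden, num_poses * 3))

    def forward(self, features: dict) -> dict:
        x = features["ego_status"]            # shape [B, 8]
        h = np.tanh(x @ self.w1)
        traj = (h @ self.w2).reshape(-1, self.num_poses, 3)
        return {"trajectory": traj}           # shape [B, T, 3] -> (x, y, heading)

    def compute_loss(self, features: dict, targets: dict, predictions: dict) -> float:
        # simple L2 imitation loss between predicted and ground-truth poses
        err = predictions["trajectory"] - targets["trajectory"]
        return float(np.mean(err ** 2))


model = TinyTrajectoryModel()
out = model.forward({"ego_status": np.zeros((4, 8))})
print(out["trajectory"].shape)  # (4, 8, 3)
```

The key point is only the shapes and dictionary keys: the feature builders fill the `features` dict, `forward()` must expose a `"trajectory"` entry of shape [B, T, 3], and the loss is reduced to a single scalar.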
```diff
@@ -77,5 +82,28 @@ You can configure the set of sensor modalities to use and how much history you n
 
 Given this input, you will need to override the `compute_trajectory()` method and output a `Trajectory`. This is an array of BEV poses (with x, y and heading in local coordinates), as well as a `TrajectorySampling` config object that indicates the duration and frequency of the trajectory. The PDM score is evaluated for a horizon of 4 seconds at a frequency of 10Hz. The `TrajectorySampling` config facilitates interpolation when the output frequency is different from the one used during evaluation.
 
-Check out this simple constant velocity agent for an example agent implementation:
-https://github.com/autonomousvision/navsim/blob/51cecd51aa70b0e6bcfb3541b91ae88f2a78a25e/navsim/agents/constant_velocity_agent.py#L9
+Check out the baselines for example agent implementations!
+
+
+## Baselines
+
+NAVSIM provides several baselines, which serve as comparison or starting points for new end-to-end driving models.
+
+### `ConstantVelocityAgent`
+The `ConstantVelocityAgent` is a naive baseline that follows the simplest possible driving logic. The agent maintains a constant speed and a constant heading angle, resulting in a straight-line output trajectory. You can use the agent to familiarize yourself with the `AbstractAgent` interface or to analyze samples that have a trivial solution for achieving a high PDM score.
+
+Link to the [implementation](https://github.com/autonomousvision/navsim/blob/main/navsim/agents/constant_velocity_agent.py).
+
+### `EgoStatusMLPAgent`
+The `EgoStatusMLPAgent` is a blind baseline, which ignores all sensors that perceive the environment. The agent applies a multilayer perceptron to the state of the ego vehicle (i.e., the velocity, acceleration, and driving command). Thereby, the EgoStatusMLP serves as an upper bound for the performance that can be achieved by merely extrapolating the kinematic state of the ego vehicle. The EgoStatusMLP is a lightweight learned example, showcasing the procedure of creating feature caches and training an agent in NAVSIM.
+
+Link to the [implementation](https://github.com/autonomousvision/navsim/blob/main/navsim/agents/ego_status_mlp_agent.py).
+
+### `TransfuserAgent`
+[Transfuser](https://arxiv.org/abs/2205.15997) is an example of a sensor agent that utilizes both camera and LiDAR inputs. The backbone of Transfuser applies CNNs to a front-view camera image and a discretized LiDAR BEV grid. The features from the camera and LiDAR branches are fused over several convolution stages with Transformers into a combined feature representation. The Transfuser architecture combines several auxiliary tasks with imitation learning and shows strong closed-loop performance in end-to-end driving with the CARLA simulator.
+
+In NAVSIM, we implement the Transfuser backbone from [CARLA Garage](https://github.com/autonomousvision/carla_garage) and use BEV semantic segmentation and DETR-style bounding-box detection as auxiliary tasks. To provide the wide-angle camera view of the Transfuser, we stitch patches of the three front-facing cameras. Transfuser is a good starting point for sensor agents and provides pre-processing for image and LiDAR sensors, training visualizations with callbacks, and more advanced loss functions (e.g., Hungarian matching for detection).
+
+Link to the [implementation](https://github.com/autonomousvision/navsim/blob/main/navsim/agents/transfuser).
```
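The straight-line logic of the `ConstantVelocityAgent` can be sketched in a few lines. This is a conceptual re-implementation, not the navsim code; the [x, y, heading] pose format and the 4 s / 10 Hz sampling follow the description above:

```python
import numpy as np


def constant_velocity_trajectory(speed_mps: float,
                                 horizon_s: float = 4.0,
                                 frequency_hz: float = 10.0) -> np.ndarray:
    """Straight-line trajectory in local ego coordinates.

    Returns an array of shape [T, 3] with (x, y, heading) per future pose,
    where the ego drives straight ahead (+x) at constant speed.
    """
    num_poses = int(horizon_s * frequency_hz)    # 40 poses for 4 s at 10 Hz
    timestamps = np.arange(1, num_poses + 1) / frequency_hz
    poses = np.zeros((num_poses, 3))
    poses[:, 0] = speed_mps * timestamps         # x grows linearly with time
    # y and heading stay 0: constant heading, no lateral motion
    return poses


traj = constant_velocity_trajectory(speed_mps=5.0)
print(traj.shape)   # (40, 3)
print(traj[-1, 0])  # 20.0 -> 5 m/s * 4 s = 20 m straight ahead
```

Because the PDM score rewards safe, comfortable progress, such a trivial policy already scores well on samples where simply continuing straight is the correct behavior, which is exactly what makes it a useful sanity-check baseline.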

docs/cache.md — 0 additions, 2 deletions

````diff
@@ -2,8 +2,6 @@
 
 OpenScene is a compact redistribution of the large-scale [nuPlan dataset](https://motional-nuplan.s3.ap-northeast-1.amazonaws.com/index.html), retaining only relevant annotations and sensor data at 2Hz. This reduces the dataset size by a factor of >10. The data used in NAVSIM is structured into `navsim.common.dataclasses.Scene` objects. A `Scene` is a list of `Frame` objects, each containing the required inputs and annotations for training a planning `Agent`.
 
-**Filtering.** The NAVSIM validation and test sets will be filtered to increase the representation of challenging situations. Furthermore, the test set will only include agent inputs and exclude any privileged annotations.
-
 **Caching.** Evaluating planners involves significant preprocessing of the raw annotation data, including accessing the global map at each `Frame` and converting it into a local coordinate system. You can generate the cache with:
 ```
 cd $NAVSIM_DEVKIT_ROOT/scripts/
````

docs/install.md — 10 additions, 7 deletions

````diff
@@ -19,16 +19,18 @@ Navigate to the download directory and download the maps
 cd download && ./download_maps
 ```
 
-Next download the splits you want to use.
-You can download the mini, trainval, test and submission_test split with the following scripts
+Next download the data splits you want to use.
+Note that the dataset splits do not exactly map to the recommended standardized training / test splits.
+Please refer to [splits](splits.md) for an overview of the standardized training and test splits, including their size, and to check which dataset splits you need to download in order to run them.
+
+You can download the mini, trainval, test and private_test_e2e dataset splits with the following scripts
 ```
 ./download_mini
 ./download_trainval
 ./download_test
-./download_competition_test
+./download_private_test_e2e
 ```
-
-**The mini split and the test split take around ~160GB and ~220GB of memory respectively**
+Also, the script `./download_navtrain` can be used to download the small portion of the `trainval` dataset split that is needed for the `navtrain` training split.
 
 This will download the splits into the download directory. From there, move it to create the following structure.
 ```angular2html
@@ -40,17 +42,18 @@ This will download the splits into the download directory. From there, move it t
    ├── navsim_logs
    │   ├── test
    │   ├── trainval
-   │   ├── competition_test
+   │   ├── private_test_e2e
    │   └── mini
    └── sensor_blobs
        ├── test
        ├── trainval
-       ├── competition_test
+       ├── private_test_e2e
        └── mini
 ```
 Set the required environment variables, by adding the following to your `~/.bashrc` file.
 Based on the structure above, the environment variables need to be defined as:
 ```
+export NUPLAN_MAP_VERSION="nuplan-maps-v1.0"
 export NUPLAN_MAPS_ROOT="$HOME/navsim_workspace/dataset/maps"
 export NAVSIM_EXP_ROOT="$HOME/navsim_workspace/exp"
 export NAVSIM_DEVKIT_ROOT="$HOME/navsim_workspace/navsim"
````
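The layout and environment variables above can be prepared with a short shell sketch. This only creates the directory skeleton under the documented default workspace (the actual data still comes from the `./download_*` scripts; the `NAVSIM_WORKSPACE` override variable is our own convenience, not part of the devkit):

```shell
# Sketch: create the documented NAVSIM workspace skeleton.
WORKSPACE="${NAVSIM_WORKSPACE:-$HOME/navsim_workspace}"

# dataset splits live under navsim_logs (annotations) and sensor_blobs (sensors)
for split in test trainval private_test_e2e mini; do
    mkdir -p "$WORKSPACE/dataset/navsim_logs/$split"
    mkdir -p "$WORKSPACE/dataset/sensor_blobs/$split"
done
mkdir -p "$WORKSPACE/dataset/maps" "$WORKSPACE/exp"

# environment variables matching the structure above (normally set in ~/.bashrc)
export NUPLAN_MAP_VERSION="nuplan-maps-v1.0"
export NUPLAN_MAPS_ROOT="$WORKSPACE/dataset/maps"
export NAVSIM_EXP_ROOT="$WORKSPACE/exp"
export NAVSIM_DEVKIT_ROOT="$WORKSPACE/navsim"

echo "workspace prepared at $WORKSPACE"
```

After downloading, move each split into the matching `navsim_logs/<split>` and `sensor_blobs/<split>` directory so the environment variables resolve correctly.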

docs/splits.md — 113 additions (new file)

# Dataset splits vs. filtered training / test splits

The NAVSIM framework utilizes several dataset splits for standardized training and evaluation of agents.
All of them use the OpenScene dataset, which is divided into the dataset splits `mini`, `trainval`, `test`, and `private_test_e2e`; each can be downloaded separately.

It is possible to run trainings and evaluations directly on these sets (see `Standard` in the table below).
Alternatively, you can run trainings and evaluations on training and test splits that were filtered for challenging scenarios (see `NAVSIM` in the table below), which is the recommended option for producing comparable and competitive results efficiently.
In contrast to the dataset splits, which refer to a downloadable set of logs, the training / test splits are implemented as scene filters, which define how scenes are extracted from these logs.

The NAVSIM training / test splits subsample the OpenScene dataset splits.
Moreover, the NAVSIM splits include overlapping scenes, while the Standard splits are non-overlapping.
Specifically, `navtrain` is based on the `trainval` data and `navtest` on the `test` data.

As the `trainval` sensor data is very large, we provide a separate download link, which loads only the frames needed for `navtrain`.
This eases access for users who only want to run the `navtrain` split and not the `trainval` split. If you already downloaded the full `trainval` sensor data, it is **not necessary** to download the `navtrain` frames as well.
The logs are always the complete dataset split.

## Overview
The table below offers an overview of the training and test splits supported by NAVSIM. It also shows which config parameters have to be used to set the dataset split (`split`) and the training / test split (`scene_filter`).

| | Name | Description | Logs | Sensors | Config parameters |
|---|---|---|---|---|---|
| Standard | trainval | Large split for training and validating agents with regular driving recordings. Corresponds to nuPlan, downsampled to 2Hz. | 14GB | >2000GB | `split=trainval`<br>`scene_filter=all_scenes` |
| Standard | test | Small split for testing agents with regular driving recordings. Corresponds to nuPlan, downsampled to 2Hz. | 1GB | 217GB | `split=test`<br>`scene_filter=all_scenes` |
| Standard | mini | Demo split with regular driving recordings. Corresponds to nuPlan, downsampled to 2Hz. | 1GB | 151GB | `split=mini`<br>`scene_filter=all_scenes` |
| NAVSIM | navtrain | Standard split for training agents in NAVSIM with non-trivial driving scenes. Sensors available separately via <a href="https://github.com/autonomousvision/navsim/blob/main/download/download_navtrain.sh">download_navtrain.sh</a>. | - | 445GB* | `split=trainval`<br>`scene_filter=navtrain` |
| NAVSIM | navtest | Standard split for testing agents in NAVSIM with non-trivial driving scenes. Available as a filter for the test split. | - | - | `split=test`<br>`scene_filter=navtest` |
| Competition | warmup_test_e2e | Warmup test split to validate submissions on Hugging Face. Available as a filter for the mini split. | - | - | `split=mini`<br>`scene_filter=warmup_test_e2e` |
| Competition | private_test_e2e | Private test split for the challenge leaderboard on Hugging Face. | <1GB | 25GB | `split=private_test_e2e`<br>`scene_filter=private_test_e2e` |

(*300GB without history)

## Splits

The standard splits `trainval`, `test`, and `mini` are from the OpenScene dataset. Note that the data corresponds to the nuPlan dataset at a lower frequency of 2Hz. You can download all standard splits via Hugging Face with the bash scripts in [download](../download).

NAVSIM provides a subset and filter of the `trainval` split, called `navtrain`. The `navtrain` split facilitates a standardized training scheme and requires significantly less sensor data storage than `trainval` (445GB vs. 2100GB). If your agents don't need historical sensor inputs, you can download `navtrain` without history, which requires 300GB of storage. Note that `navtrain` can be downloaded separately via [download_navtrain.sh](https://github.com/autonomousvision/navsim/blob/main/download/download_navtrain.sh) but still requires access to the `trainval` logs. Similarly, the `navtest` split enables a standardized set for testing agents with a provided scene filter. Both `navtrain` and `navtest` are filtered to increase the share of interesting samples in the sets.

For the challenge on Hugging Face, we provide the `warmup_test_e2e` and `private_test_e2e` splits for the warm-up and challenge tracks, respectively. Note that `private_test_e2e` requires you to download the data, while `warmup_test_e2e` is a scene filter for the `mini` split.
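Conceptually, a scene filter is just a predicate applied to the scenes extracted from a dataset split's logs: the same downloaded data can back several training or test splits. A toy illustration of this idea (the real navsim scene filters are config-driven; all names here are hypothetical):

```python
from dataclasses import dataclass, field


@dataclass
class ToySceneFilter:
    """Hypothetical stand-in for a NAVSIM-style scene filter."""
    log_names: list = field(default_factory=list)   # restrict to these logs, empty = all
    non_trivial_only: bool = False                  # keep challenging scenes only

    def apply(self, scenes: list) -> list:
        kept = []
        for scene in scenes:
            if self.log_names and scene["log"] not in self.log_names:
                continue  # scene comes from a log outside this split
            if self.non_trivial_only and scene["trivial"]:
                continue  # drop scenes with a trivial driving solution
            kept.append(scene)
        return kept


# one set of downloaded logs, two different "splits" extracted from it
scenes = [
    {"log": "log_a", "trivial": True},
    {"log": "log_a", "trivial": False},
    {"log": "log_b", "trivial": False},
]
all_scenes = ToySceneFilter().apply(scenes)                     # unfiltered, like all_scenes
filtered = ToySceneFilter(non_trivial_only=True).apply(scenes)  # filtered, like navtrain/navtest
print(len(all_scenes), len(filtered))  # 3 2
```

This is why `navtrain` and `navtest` appear in the table with `split=trainval` / `split=test`: the `split` parameter selects the downloaded logs, while `scene_filter` selects which scenes are actually used.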
