Requested badge: Functional
This repository contains the code for the ICSE'25 paper "A Differential Testing Framework to Identify Critical AV Failures Leveraging Arbitrary Inputs". As described below, the repository contains the code for replicating the experimental analysis performed in the paper, including generating all figures and tables. It also contains information on how to repeat the full experiment using user-provided datasets and systems.
This repository is available on GitHub and archived on Software Heritage. A preprint of the paper is available in the repository.
The main experiment consists of providing video input to 5 different AV systems and recording their steering angles in response to this video. These responses are then analyzed using the differential testing approach proposed in the paper and implemented in /3_Process/OutlierDetection.py.
The usage information in the setup and reproduction sections below describes how to use the provided scripts to reproduce the data from the paper.
The steering angle outputs of the 5 AV systems are available in /3_Process/cache/*.
The input videos used in the experiment cannot be directly included in this repository due to licensing limitations. See the datasets readme for more information.
To replicate the full experiment:
- First install the 5 SUTs following the process described in 0_Setup.
- Then, obtain the datasets used in the experiment as explained in 1_Datasets; note: due to licensing limitations these cannot be directly included and must be obtained from their original sources.
- Finally, these datasets must be preprocessed into a common format as explained in 2_TransformVideos.
- Once the videos have been processed, follow the instructions for each of the different SUTs in 0_Setup to run each version of OpenPilot on the different videos.
- Follow the instructions for replicating the figures and data to utilize the scripts in 3_Process to generate the figures.
To replicate the pipeline on user-supplied videos, repeat the process above, but replace the second step (obtaining the datasets in 1_Datasets) with your own videos. These videos still need to be preprocessed as described in the third step (2_TransformVideos).
To replicate the pipeline for other SUTs, the user must extract the steering angle readings from the SUT based on the video. The steering angles can then be processed directly by 3_Process/OutlierDetection.py to identify failures as described in the paper.
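As a minimal sketch of the preparation this implies, the snippet below lines up per-frame steering angles from several SUTs so that each row holds every system's response to the same frame. The function name `align_by_frame`, the dictionary layout, and the SUT names are hypothetical; the input format that 3_Process/OutlierDetection.py actually expects is not described here and should be checked against the script and the cached data.

```python
def align_by_frame(readings):
    """Group steering angles by frame across systems.

    `readings` maps a SUT name to its list of per-frame steering angles
    (hypothetical layout, used only for illustration). Returns the sorted
    SUT names and one row per frame, truncated to the shortest recording
    so that every row has a value from every system.
    """
    names = sorted(readings)
    n_frames = min(len(readings[name]) for name in names)
    rows = [[readings[name][frame] for name in names] for frame in range(n_frames)]
    return names, rows


if __name__ == "__main__":
    example = {
        "sut_a": [0.5, 0.7, 0.6],   # hypothetical steering angles
        "sut_b": [0.4, 0.8, 0.5],
        "sut_c": [0.6, 0.6],        # shorter recording: extra frames are dropped
    }
    names, rows = align_by_frame(example)
    print(names)
    for frame_index, row in enumerate(rows):
        print(frame_index, row)
```

Truncating to the shortest recording is one simple choice; a different alignment (for example by timestamp) may suit SUTs that sample at different rates.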
A Dockerfile is provided for convenience in replication of the figures and results based on the cached data provided. First, build the Docker image as:
docker build -t difftest .
If running locally (outside of Docker), first set up the Python environment.
With conda installed, run the following:
source create_env.sh
This will create the difftest conda environment and install all relevant dependencies.
We first describe the structure of the repository and then describe how to utilize the scripts to reproduce the experimental analysis from the paper.
Folder Structure:
- 0_Setup - Information on setting up and running the SUTs used in the experiment
- 1_Datasets - Placeholder for datasets - omitted for licensing; see the datasets readme.
- 2_TransformVideos - Scripts to normalize data in 1_Datasets
- ⭐ 3_Process - Scripts to execute the experiment
- 📋 cache - Raw performance data from the SUTs evaluated on all videos.
- 🧰 🌟 OutlierDetection.py - Code to perform the statistical analysis of DiffTest4AV. This implementation uses Dixon's Q test for outlier detection (dixon).
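For readers unfamiliar with Dixon's Q test, the following is a minimal sketch of the classic form of the test applied to one frame's steering angles from five systems. It is not the code in OutlierDetection.py: the function name `dixon_q_outlier` and the sample values are hypothetical, and the threshold 0.710 is the commonly tabulated 95% critical value for five observations, which is an assumption about the confidence level rather than something stated here.

```python
def dixon_q_outlier(values, q_crit=0.710):
    """Return the index of the value flagged by Dixon's Q test, or None.

    Only the smallest and largest values are candidates. Q is the gap
    between a candidate and its nearest neighbour divided by the full
    range. `q_crit` defaults to the tabulated 95% value for n = 5 and
    must be changed for other sample sizes or confidence levels.
    """
    if len(values) < 3:
        raise ValueError("Dixon's Q test needs at least three values")
    order = sorted(range(len(values)), key=lambda i: values[i])
    ordered = [values[i] for i in order]
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        return None  # all values identical, nothing to flag
    q_low = (ordered[1] - ordered[0]) / data_range
    q_high = (ordered[-1] - ordered[-2]) / data_range
    if q_low >= q_high:
        q, candidate = q_low, order[0]
    else:
        q, candidate = q_high, order[-1]
    return candidate if q > q_crit else None


if __name__ == "__main__":
    # Hypothetical steering angles from five systems for a single frame.
    print(dixon_q_outlier([1.0, 1.2, 0.9, 1.1, 9.0]))
    print(dixon_q_outlier([1.0, 1.2, 0.9, 1.1, 1.3]))
```

In a differential testing setting, the flagged index would identify the system whose response disagrees with the others for that frame; how the paper aggregates such flags across frames is described in the paper itself.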
The following was tested on a fresh install of Ubuntu 22.04 using Miniconda.
docker build -t difftest . # if not run during setup above
docker run -it --rm -v "$(pwd)/:/difftest" difftest /bin/bash
source generate_figures.sh
With conda installed, run the following:
source create_env.sh # if not run during setup above
source generate_figures.sh
This will launch all of the scripts in succession to compute all of the figures and tables used in the paper. The scripts are heavily parallelized and will run for ~20 minutes on a machine with 32 cores; runtimes will vary based on available hardware.
All figures will be saved in 3_Process/gen_figures/. A version of these figures has been bundled with this repository; running the script will overwrite the included files.
All generated png files should be an exact binary match with the original files bundled with the repository. The pdf versions of the images may differ at the byte level due to system variations, although the rendered images are the same.
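One way to check the binary match is to compare file hashes between a saved copy of the bundled figures and the regenerated ones. The sketch below assumes the bundled figures were copied to a separate directory before running the script, since the run overwrites them; the directory name `reference_figures` and the function names are hypothetical and not part of the repository.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_pngs(reference_dir, generated_dir):
    """List reference PNGs (relative paths) that are missing from or differ in generated_dir."""
    reference_dir, generated_dir = Path(reference_dir), Path(generated_dir)
    mismatched = []
    for reference in sorted(reference_dir.rglob("*.png")):
        relative = reference.relative_to(reference_dir)
        generated = generated_dir / relative
        if not generated.is_file() or sha256_of(reference) != sha256_of(generated):
            mismatched.append(str(relative))
    return mismatched


if __name__ == "__main__":
    # "reference_figures" is a hypothetical copy of the bundled figures made before the run.
    for name in mismatched_pngs("reference_figures", "3_Process/gen_figures"):
        print("differs or missing:", name)
```

An empty result means every reference png was reproduced byte for byte; pdf files are deliberately left out because they are expected to differ.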
The following table describes how to find the figures used in the paper. NOTE: all referenced frames from the paper, e.g. Figures 1, 3, 4, 5, 6, and 9 will appear as a blank image with steering angles only since the videos are not included.
| Paper Figure | Generated file |
|---|---|
| Fig 1. | image link |
| Fig 3. | image link |
| Table 1 | table link |
| Fig 4. | image link |
| Table 2 | table link |
| Fig 5. | image link |
| Fig 6. | image link |
| Fig 7. | image link |
| Fig 8a. | image link |
| Fig 8b. | image link |
| Fig 8c. | image link |
| Table V | table link |
| Fig 9a. | image link |
| Fig 9b. | image link |
| Fig 9c. | image link |
| Fig 9d. | image link |
| Fig 9e. | image link |
| Table VI | table link |

