Merged
59 changes: 39 additions & 20 deletions README.md
@@ -48,7 +48,7 @@ runtime traces produced from instrumented code,
PolyTracker is controlled via a Python script called `polytracker`. You can
install it by running

```
```shell-script
pip3 install polytracker
```

@@ -57,13 +57,13 @@ users are likely to run it in a containerized environment. Luckily,
`polytracker` makes this easy. All you need to do is have `docker` installed,
then run:

```
```shell-script
polytracker docker pull
```

and

```
```shell-script
polytracker docker run
```

@@ -78,7 +78,7 @@ instrumented program's control flow graph, and even extract a context free
grammar matching the inputs accepted by the program. You can explore these
commands by running

```
```shell-script
polytracker --help
```

@@ -100,7 +100,7 @@ instrumented environment. This will produce a `blight_journal.jsonl` file that
records all commands run during the build. If you have a C/C++ target, you can
instrument it by invoking `polytracker build` and passing your build command:

```bash
```shell-script
polytracker build gcc -g -o my_binary my_source.c
```

@@ -110,14 +110,14 @@ directory to build an instrumented version of your build target. The
instrumented build target will be built using the same flags as the original
build target.

```bash
```shell-script
polytracker instrument-targets my_binary
```

`build` also supports more complex programs that use a build system like
autotools or CMake:

```bash
```shell-script
polytracker build cmake .. -DCMAKE_BUILD_TYPE=Release
polytracker build ninja
# or
@@ -127,8 +127,8 @@ polytracker build make

Then run `instrument-targets` on any targets of the build:

```bash
$ polytracker instrument-targets a.bin b.so
```shell-script
polytracker instrument-targets a.bin b.so
```
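
The output naming used by `instrument-targets` can be sketched as a small shell helper. This is an illustration only: `instrumented_name` is a hypothetical function, not a PolyTracker command. It inserts `.instrumented` before the final extension of a bare file name, matching the `a.bin` and `b.so` example in this section. How PolyTracker names a target that has no extension is an assumption here.

```shell-script
# Hypothetical helper, not part of PolyTracker: predicts the name of the
# instrumented artifact for a bare file name such as a.bin or b.so.
instrumented_name() {
  case "$1" in
    # name.ext -> name.instrumented.ext (pattern taken from the README example)
    ?*.?*) printf '%s.instrumented.%s\n' "${1%.*}" "${1##*.}" ;;
    # ASSUMPTION: names without an extension just get the suffix appended
    *)     printf '%s.instrumented\n' "$1" ;;
  esac
}

# Print the expected artifact name for each build target
for target in a.bin b.so; do
  instrumented_name "$target"
done
```

A helper like this may be handy in a build script that needs to locate the instrumented artifacts after the `instrument-targets` step finishes.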

Then `a.instrumented.bin` and `b.instrumented.so` will be the instrumented
@@ -199,10 +199,10 @@ instrumentation parameters without needing to recompile the binary.
### Environment Variables

PolyTracker accepts configuration parameters in the form of environment
variables to avoid recompiling target programs. The current environment
variables PolyTracker supports is:
variables to avoid recompiling target programs. The current set of environment
variables that PolyTracker supports is:

```
```shell-script
POLYDB: A path to which to save the output database (default is polytracker.tdag)

WLLVM_ARTIFACT_STORE: Provides a path to an existing directory to store artifact/manifest for all build targets
@@ -251,20 +251,20 @@ focuses on ignoring system libraries. The original script can be found in
Check out this Git repository. From the root, either build the base PolyTracker
Docker image:

```commandline
```shell-script
pip3 install -e ".[dev]" && polytracker docker rebuild
```

or pull the latest prebuilt version from DockerHub:

```commandline
```shell-script
docker pull trailofbits/polytracker:latest
```

For a demo of PolyTracker running on the [MuPDF](https://mupdf.com/) parser, run
this command:

```commandline
```shell-script
docker build -t trailofbits/polytracker-demo-mupdf -f examples/pdf/Dockerfile-mupdf.demo .
```

@@ -275,16 +275,16 @@ information provided by the taint analysis.
For a demo of PolyTracker running on Poppler utils version 0.84.0, run this
command:

```commandline
```shell-script
docker build -t trailofbits/polytracker-demo-poppler -f examples/pdf/Dockerfile-poppler.demo .
```

All the poppler utils will be located in
`/polytracker/the_klondike/poppler-0.84.0/build/utils`.

```commandline
$ cd /polytracker/the_klondike/poppler-0.84.0/build/utils
$ ./pdfinfo_track some_pdf.pdf
```shell-script
cd /polytracker/the_klondike/poppler-0.84.0/build/utils
./pdfinfo_track some_pdf.pdf
```

## Building PolyTracker from Source
@@ -324,6 +324,20 @@ source file. This is most common when instrumenting compression and
cryptographic algorithms that have large block sizes. There are a number of
mitigations for this behavior currently being researched and developed.

## Publications and Current Use Cases

Here are some of the publicly available things we've done with PolyTracker. If you know of anything else you'd like to see listed here, please let us know!

- The [Format Analysis Workbench](https://github.com/galoisinc/faw) integrates several PolyTracker features from different versions of the codebase, namely grammar extraction and blind spot detection.
- Harmon, Carson, Bradford Larsen, and Evan A. Sultanik. "[Toward automated grammar extraction via semantic labeling of parser implementations.](https://bradfordlarsen.com/files/publications/semantic-labeling-langsec-2020.pdf)" 2020 IEEE Security and Privacy Workshops (SPW). IEEE, 2020.
- Brodin, Henrik, Marek Surovič, and Evan Sultanik. "[Blind spots: Identifying exploitable program inputs.](https://langsec.org/spw23/papers/Brodin_LangSec23.pdf)"
2023 IEEE Security and Privacy Workshops (SPW). IEEE, 2023.
- Henrik used PolyTracker's blind spot trace analysis functionality (more precisely, `mapping` and `cavities`) to pinpoint a CVE and [wrote about it on the Trail of Bits blog](https://blog.trailofbits.com/2023/03/30/acropalypse-polytracker-blind-spots/).
- Kaoudis, Kelly, Henrik Brodin, and Evan Sultanik. "[Automatically Detecting Variability Bugs Through Hybrid Control and Data Flow Analysis.](https://langsec.org/spw23/papers/Kaoudis_LangSec23.pdf)"
2023 IEEE Security and Privacy Workshops (SPW). IEEE, 2023.
- Evan Sultanik, Marek Surovič, Henrik Brodin, Kelly Kaoudis, Facundo Tuesca, Carson Harmon, Lisa Overall, Joseph Sweeney, and Bradford Larsen.
"[PolyTracker: Whole-Input Dynamic Information Flow Tracing.](https://github.com/trailofbits/publications/blob/master/papers/issta24-polytracker.pdf)" In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), 2024.

## License and Acknowledgements

This research was developed by [Trail of Bits](https://www.trailofbits.com/)
@@ -333,8 +347,13 @@ licensed under the [Apache 2.0 license](LICENSE). © 2019, Trail of Bits.

## Maintainers

Please contact us using `firstname.lastname@trailofbits.com`.

[Evan Sultanik](https://github.com/ESultanik)<br />
[Henrik Brodin](https://github.com/hbrodin)<br />
[Kelly Kaoudis](https://github.com/kaoudis)<br />

## Past Maintainers

[Marek Surovič](https://github.com/surovic)<br />
[Facundo Tuesca](https://github.com/facutuesca)<br /> <br />
`firstname.lastname@trailofbits.com`