#145 contains very simple tests that check whether pe-parse identifies the Corkami executables as PE files and parses them without erroring. Unfortunately (but not unexpectedly), a number of the executables are not parsed correctly.
Ideally, we should at least test and enforce that we support parsing of any PE in the Corkami dataset.
Reference to known failing tests (pe-parse/tests/corkami_test.cpp, lines 34 to 51 in 4286f10):
static const std::unordered_set<std::string> kKnownPEFailure{
    "virtsectblXP.exe", "maxsec_lowaligW7.exe",
    "maxsecXP.exe", "nullSOH-XP.exe",
    "tinyXP.exe", "tinydllXP.dll",
    "virtrelocXP.exe", "foldedhdrW7.exe",
    "maxvals.exe", "d_nonnull.dll",
    "reloccrypt.exe", "d_resource.dll",
    "fakerelocs.exe", "lfanew_relocW7.exe",
    "bigSoRD.exe", "tinyW7.exe",
    "reloccryptW8.exe", "standard.exe",
    "exe2pe.exe", "tinygui.exe",
    "dllfwloop.dll", "tinydrivXP.sys",
    "tiny.exe", "tinydll.dll",
    "foldedhdr.exe", "dllmaxvals.dll",
    "reloccryptXP.exe", "dosZMXP.exe",
    "tinyW7_3264.exe", "dllfw.dll",
    "hdrcode.exe", "ibrelocW7.exe",
    "d_tiny.dll", "sc.exe"};
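One possible way to enforce this while the list shrinks is to compare actual parse results against the known-failure set, flagging both new failures and entries that now parse fine and can be removed. The following is only a sketch, not the logic in corkami_test.cpp: `CorkamiReport`, `CheckAgainstKnownFailures`, and the `parses_ok` callback are hypothetical names, and the callback stands in for whatever pe-parse call the real test uses.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical result type: neither list should contain entries in a
// healthy test run.
struct CorkamiReport {
  std::vector<std::string> unexpected_failures;  // failed, but not listed
  std::vector<std::string> stale_entries;        // listed, but parse fine now
};

// Compares parse results with the known-failure set.
// `parses_ok` is an assumed stand-in for the real parser call; it returns
// true when the named file parses without error.
CorkamiReport CheckAgainstKnownFailures(
    const std::vector<std::string> &file_names,
    const std::unordered_set<std::string> &known_failures,
    const std::function<bool(const std::string &)> &parses_ok) {
  assert(parses_ok && "a parser callback is required");

  CorkamiReport report;
  for (const auto &name : file_names) {
    const bool ok = parses_ok(name);
    const bool known = known_failures.count(name) != 0;
    if (!ok && !known) {
      report.unexpected_failures.push_back(name);  // regression or new file
    } else if (ok && known) {
      report.stale_entries.push_back(name);  // entry can be dropped
    }
  }
  return report;
}
```

A test built on this would fail whenever either list is non-empty, so every fix has to be accompanied by removing the file from the known-failure set.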
Secondly, a much bigger task would be to confirm that pe-parse is correctly parsing all and only the information that the Corkami PEs claim to exhibit.
How to Start Investigating
First, run git submodule update --init to pull the Corkami dataset (we will focus on the PEs in https://github.com/corkami/pocs/tree/master/PE/bin).
Then run the standalone dump-pe tool included in this repo on one of the failing files. It is an easy way to iterate on code changes, since it uses basically the same logic as the tests. For example:
$ ./build/dump-pe/dump-pe tests/assets/corkami-poc-dataset/PE/bin/virtsectblXP.exe
Error: 3 (Invalid section)
Location: ParsePEFromBuffer:2394
Use the reported error and location as a starting point for debugging. Moreover, most, if not all, of the PEs have a corresponding asm file that provides the source code for building the PE and shows how the file is constructed. Use it to understand why pe-parse has difficulty parsing the file and what kind of fix is needed. The asm file for our example is https://github.com/corkami/pocs/blob/master/PE/virtsectblXP.asm
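When triaging many failing files at once, it may help to group them by the reported error location, since files that fail at the same place may share a single fix. The sketch below assumes the two-line "Error:" / "Location:" output format shown in the dump-pe example above and requires C++17. `DumpPeError`, `ParseDumpPeError`, `FailuresByLocation`, and `RecordFailure` are hypothetical helper names, not part of pe-parse.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record of one dump-pe failure.
struct DumpPeError {
  int code = 0;          // e.g. 3
  std::string message;   // e.g. "Invalid section"
  std::string location;  // e.g. "ParsePEFromBuffer:2394"
};

// Extracts the error report from captured dump-pe output.
// Returns std::nullopt when no well-formed "Error:" line is present.
std::optional<DumpPeError> ParseDumpPeError(const std::string &output) {
  const std::string kErrorPrefix = "Error: ";
  const std::string kLocationPrefix = "Location: ";
  DumpPeError err;
  bool saw_error = false;

  std::istringstream lines(output);
  std::string line;
  while (std::getline(lines, line)) {
    if (line.rfind(kErrorPrefix, 0) == 0) {  // line starts with "Error: "
      const std::string rest = line.substr(kErrorPrefix.size());
      try {
        err.code = std::stoi(rest);  // reads the leading integer
      } catch (const std::logic_error &) {
        return std::nullopt;  // no numeric error code
      }
      const auto open = rest.find('(');
      const auto close = rest.rfind(')');
      if (open != std::string::npos && close != std::string::npos &&
          close > open) {
        err.message = rest.substr(open + 1, close - open - 1);
      }
      saw_error = true;
    } else if (line.rfind(kLocationPrefix, 0) == 0) {
      err.location = line.substr(kLocationPrefix.size());
    }
  }
  if (!saw_error) return std::nullopt;
  return err;
}

// Maps an error location to the files that failed there.
using FailuresByLocation = std::map<std::string, std::vector<std::string>>;

// Adds `file_name` to its location group if `output` reports an error.
void RecordFailure(FailuresByLocation &groups, const std::string &file_name,
                   const std::string &output) {
  assert(!file_name.empty());
  if (const auto err = ParseDumpPeError(output)) {
    groups[err->location].push_back(file_name);
  }
}
```

Feeding the captured output for each file in the known-failure list through `RecordFailure` yields a map whose largest groups are natural candidates to investigate first.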