This is the repository that corresponds to the Differential File System Analysis Quick Win talk to be presented at the SANS DFIR Summit 2023 on August 4, 2023. A copy of the slides is available here. A YouTube video demonstration is available at https://youtu.be/onC7x-BftSk.
Mature DevOps organizations use continuous integration/continuous delivery (CI/CD) techniques to deliver a hardened virtual machine “gold image” to production that does not need any additional configuration on first boot and is ready to join the cluster of virtual machines in the backend pool of its designated load balancer. This approach offers several significant security advantages, and it can also shorten forensic analysis when Differential File System Analysis is employed.
Differential File System Analysis is a technique wherein the storage volume(s) of a VM launched from a gold image are mounted read-only to a forensic workstation and are used as a basis for comparison against the forensic copies of the storage volume(s) of a VM that is suspected to be compromised. A reference hash set of all files on the gold image can be prepared in advance by the CI/CD pipeline and stored until needed. Any hashes on the compromised system that are not found in the reference hash set are either new or altered.
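To make the idea concrete, here is a minimal sketch that runs on two throwaway directories rather than real volumes. The directory layout, file contents, and the `hash_tree` helper are hypothetical stand-ins, and `awk` takes the place of the `hfind` lookup used later in this walkthrough.

```shell
#!/usr/bin/env bash
set -eu

# Toy stand-ins for the mounted gold-image and suspect volumes (hypothetical).
WORK=$(mktemp -d)
mkdir -p "$WORK/reference/etc" "$WORK/evidence/etc" "$WORK/evidence/tmp"
echo "0 * * * * root run-parts /etc/cron.hourly" > "$WORK/reference/etc/crontab"
echo "prod-host" > "$WORK/reference/etc/hostname"
cp "$WORK/reference/etc/hostname" "$WORK/evidence/etc/hostname"    # unchanged
echo "* * * * * root /tmp/implant" > "$WORK/evidence/etc/crontab"  # altered
echo "payload" > "$WORK/evidence/tmp/implant"                      # new

# Hash every regular file, recording paths relative to the volume root.
hash_tree() { (cd "$1" && find . -type f -print0 | sort -z | xargs -0 md5sum); }
hash_tree "$WORK/reference" > "$WORK/reference_files.md5"
hash_tree "$WORK/evidence"  > "$WORK/evidence_files.md5"

# Load the reference hashes, then keep only the evidence lines whose hash
# does not appear in the reference hash set.
awk 'NR == FNR { known[$1] = 1; next } !($1 in known)' \
    "$WORK/reference_files.md5" "$WORK/evidence_files.md5" \
    > "$WORK/new_or_altered.txt"

cat "$WORK/new_or_altered.txt"
```

Only the altered `crontab` and the new `implant` file should be listed; the untouched `hostname` file drops out because its hash is already in the reference set.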
Although this talk will demonstrate how to use the Differential File System Analysis technique and open-source software to investigate a compromised AWS EC2 instance, this technique is effective on any system launched recently from a gold image. The talk concludes with examples of how the high-level forensic processing steps can be automated to further reduce the time from compromise to analysis.
- Create a new role called "EC2_Responder" that has the `AmazonEC2FullAccess` and `AmazonS3FullAccess` policies attached.
- Use the web console to launch an Amazon Linux t2.micro EC2 instance into the `us-east-1a` availability zone. Name it `PROD_Host`. Create a security group that allows SSH access from just your IP address.
- Similarly, use the web console to launch an Ubuntu t2.micro EC2 instance into the `us-east-1a` availability zone. Attach the EC2_Responder role to the instance and name it `DFIR_Host`. Use the same security group that was created in the previous step.
- Determine the volume that is being used by the `PROD_Host` and label it with the `Name` tag value set to `PROD_volume`.
- Label the remaining volume with the `Name` tag value set to `DFIR_volume`.
- SSH into the PROD_Host and paste the following command:
echo "Here is an important file" > /home/ec2-user/important.txt
Then exit the SSH session.
- SSH into the `DFIR_Host`.
- [In the DFIR_Host SSH Session] Update the `DFIR_Host` and install sleuthkit, binutils, and the AWS CLI.
sudo apt update && sudo apt upgrade -y
sudo apt install -y unzip sleuthkit binutils
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
- [In the DFIR_Host SSH Session] Set `PROD_VOLUME` to the ID of the volume whose `Name` tag is `PROD_volume` using the command:
PROD_VOLUME=$(aws ec2 describe-volumes --filters "Name=status,Values=in-use" "Name=tag:Name,Values=PROD_volume" --query "Volumes[*].VolumeId" --output text)
echo $PROD_VOLUME
TIP: Drop any of these commands into ChatGPT or Bard for a detailed explanation.
- [In the DFIR_Host SSH Session] Make a snapshot of the `PROD_volume` and set the `Name` tag value to `REFERENCE` using the following command:
REFERENCE_SNAPSHOT=$(aws ec2 create-snapshot --volume-id $PROD_VOLUME --description "Snapshot of REFERENCE volume created on 2023-07-25 14:07:14 PST" --tag-specifications "ResourceType=snapshot,Tags=[{Key=Name,Value=REFERENCE}]" --query SnapshotId --output text)
echo "wait for it..."
aws ec2 wait snapshot-completed --snapshot-ids $REFERENCE_SNAPSHOT
echo $REFERENCE_SNAPSHOT
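The snapshot description above carries a fixed timestamp from when the demo was recorded. A small sketch of generating it at run time instead follows; the `snapshot_description` helper and the choice of UTC are assumptions, not part of the original walkthrough.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: build a description stamped with the current UTC time.
snapshot_description() {
    label=$1
    printf 'Snapshot of %s volume created on %s' \
        "$label" "$(date -u '+%Y-%m-%d %H:%M:%S UTC')"
}

DESCRIPTION=$(snapshot_description REFERENCE)
echo "$DESCRIPTION"

# The value would then replace the literal string in the command above, e.g.:
#   aws ec2 create-snapshot --volume-id "$PROD_VOLUME" --description "$DESCRIPTION" ...
```

Recording the time in UTC avoids ambiguity when the snapshot metadata is later lined up against other log sources.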
- SSH into the `PROD_Host` and run the following infection script:
wget https://s3.amazonaws.com/forensicate.cloud-data/dont_peek2.sh
sudo bash dont_peek2.sh
Note: The instance will shut itself down in 5 minutes so just let it run. Wait until the instance has shut down before proceeding.
- [In the DFIR_Host SSH Session] Make a second snapshot of the `PROD_volume` and set the `Name` tag value to `EVIDENCE` using the following command:
EVIDENCE_SNAPSHOT=$(aws ec2 create-snapshot --volume-id $PROD_VOLUME --description "Snapshot of EVIDENCE volume created on 2023-07-25 14:07:14 PST" --tag-specifications "ResourceType=snapshot,Tags=[{Key=Name,Value=EVIDENCE}]" --query SnapshotId --output text)
echo "wait for it..."
aws ec2 wait snapshot-completed --snapshot-ids $EVIDENCE_SNAPSHOT
echo $EVIDENCE_SNAPSHOT
- Make a `REFERENCE` volume from the `REFERENCE` snapshot.
REFERENCE_VOLUME=$(aws ec2 create-volume --snapshot-id $REFERENCE_SNAPSHOT --availability-zone us-east-1a --tag-specifications "ResourceType=volume,Tags=[{Key=Name,Value=REFERENCE}]" --query 'VolumeId' --output text)
echo $REFERENCE_VOLUME
- Similarly, make an `EVIDENCE` volume from the `EVIDENCE` snapshot.
EVIDENCE_VOLUME=$(aws ec2 create-volume --snapshot-id $EVIDENCE_SNAPSHOT --availability-zone us-east-1a --tag-specifications "ResourceType=volume,Tags=[{Key=Name,Value=EVIDENCE}]" --query 'VolumeId' --output text)
echo $EVIDENCE_VOLUME
- Make an empty 100 GB volume named `DATA`
DATA_VOLUME=$(aws ec2 create-volume --size 100 --availability-zone us-east-1a --volume-type gp2 --tag-specifications "ResourceType=volume,Tags=[{Key=Name,Value=DATA}]" --query 'VolumeId' --output text)
echo $DATA_VOLUME
- Query the Virtual Machine's Instance Metadata service (IMDS) to set the INSTANCE_ID
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") && INSTANCE_ID=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id)
echo $INSTANCE_ID
- Next, attach the `REFERENCE` volume to `/dev/xvdb`
aws ec2 attach-volume --volume-id $REFERENCE_VOLUME --instance-id $INSTANCE_ID --device /dev/xvdb
This should return a result that looks something like:
{
"AttachTime": "2023-07-25T22:08:45.711000+00:00",
"Device": "/dev/xvdb",
"InstanceId": "i-01647e9e5121e6552",
"State": "attaching",
"VolumeId": "vol-0b97fa76e15e3babe"
}
- Similarly, attach the `EVIDENCE` volume to `/dev/xvdc`
aws ec2 attach-volume --volume-id $EVIDENCE_VOLUME --instance-id $INSTANCE_ID --device /dev/xvdc
- Next, attach the `DATA` volume to `/dev/xvdd`
aws ec2 attach-volume --volume-id $DATA_VOLUME --instance-id $INSTANCE_ID --device /dev/xvdd
- Switch to root and inspect the filesystem types using `lsblk -f`
ubuntu@ip-172-31-89-235:~$ sudo su
root@ip-172-31-89-235:/home/ubuntu# lsblk -f
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
loop0 0 100% /snap/amazon-ssm-agent/6312
loop1 0 100% /snap/core18/2745
loop2 0 100% /snap/core20/1879
loop3 0 100% /snap/lxd/24322
loop4 0 100% /snap/snapd/19122
xvda
├─xvda1 ext4 1.0 cloudimg-rootfs 4513eb34-58e6-408e-8ed7-3d487fe6b35b 5.3G 30% /
├─xvda14
└─xvda15 vfat FAT32 UEFI 6192-5E23 98.3M 6% /boot/efi
xvdb
├─xvdb1 xfs / 3325c0ba-3d91-4d25-bb13-bdc5c47a979a
├─xvdb127
└─xvdb128 vfat FAT16 CE70-438F
xvdc
├─xvdc1 xfs / 3325c0ba-3d91-4d25-bb13-bdc5c47a979a
├─xvdc127
└─xvdc128 vfat FAT16 CE70-438F
xvdd
root@ip-172-31-89-235:/home/ubuntu#
Note that the UUIDs of the REFERENCE (xvdb) volume's partitions are the same as those of the EVIDENCE (xvdc) volume's partitions.
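Spotting a UUID clash by eye works for three disks but gets tedious with more. The following sketch flags shared UUIDs automatically. It runs against sample name/UUID pairs copied from the `lsblk` output above; feeding it live data from `lsblk -rno NAME,UUID` is an assumption about how it might be wired in.

```shell
#!/usr/bin/env bash
set -eu

# Sample "NAME UUID" pairs taken from the lsblk listing above.
# On a live system the same columns could come from: lsblk -rno NAME,UUID
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
xvda1 4513eb34-58e6-408e-8ed7-3d487fe6b35b
xvda15 6192-5E23
xvdb1 3325c0ba-3d91-4d25-bb13-bdc5c47a979a
xvdb128 CE70-438F
xvdc1 3325c0ba-3d91-4d25-bb13-bdc5c47a979a
xvdc128 CE70-438F
EOF

# Group device names by UUID and print only UUIDs used by more than one device.
# Rows without a UUID (a single field) are ignored.
DUPLICATES=$(awk 'NF == 2 { names[$2] = names[$2] " " $1; count[$2]++ }
    END { for (u in count) if (count[u] > 1) print u ":" names[u] }' "$SAMPLE" | sort)

echo "$DUPLICATES"
```

In this walkthrough only the XFS root partitions are mounted, so the clash between `xvdb1` and `xvdc1` is the one that matters.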
- Format the new `DATA` volume
mkfs -t xfs /dev/xvdd
- Make some mount points for the three volumes
mkdir /mnt/reference
mkdir /mnt/evidence
mkdir /mnt/data
- Change the UUID of the `EVIDENCE` volume so that it does not conflict with the UUID of the `REFERENCE` volume. Without this step, only one of the two volumes can be mounted.
xfs_admin -U $(uuidgen) /dev/xvdc1
- Mount the `REFERENCE` and `EVIDENCE` volumes as read-only and mount the `DATA` volume as read-write.
mount -o ro -t xfs /dev/xvdb1 /mnt/reference
mount -o ro -t xfs /dev/xvdc1 /mnt/evidence
mount /dev/xvdd /mnt/data
lsblk
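- Verify that the `REFERENCE` and `EVIDENCE` volumes are read-only and the `DATA` volume is read-write. The first two `touch` commands should fail with a "Read-only file system" error, and the third should succeed.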
touch /mnt/reference/tmp/test
touch /mnt/evidence/tmp/test
touch /mnt/data/test
ls /mnt/data/test
rm /mnt/data/test
NOTE: Shout out to Brian Carrier of Basis Technology for sleuthkit and the following references:
- https://www.sleuthkit.org/informer/sleuthkit-informer-6.html#hashes
- https://www.sleuthkit.org/informer/sleuthkit-informer-7.html
- Generate the list of MD5 hashes for files on the `REFERENCE` and `EVIDENCE` volumes
mkdir /mnt/data/hashdata && cd /mnt/data/hashdata
# Create REFERENCE Hash Set
find /mnt/reference -type f -print0 | xargs -0 md5sum | tee reference_files.md5
# Create EVIDENCE Hash Set
find /mnt/evidence -type f -print0 | xargs -0 md5sum | tee evidence_files.md5
wc -l *
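NOTE: `md5sum` prefixes an output line with a backslash when the file name contains a backslash or a newline. If that happens, strip the stray character as follows: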
echo; echo "Highlight the stray character in reference_files.md5:"
grep -C3 22369d5c587517e7ff963c164b878f55 reference_files.md5
sed -i 's|^\\||g' reference_files.md5
echo; echo "Show the stray character is fixed in reference_files.md5:"
grep -C3 22369d5c587517e7ff963c164b878f55 reference_files.md5
echo; echo "Highlight the stray character in evidence_files.md5:"
grep -C3 22369d5c587517e7ff963c164b878f55 evidence_files.md5
sed -i 's|^\\||g' evidence_files.md5
echo; echo "Show the stray character is fixed in evidence_files.md5:"
grep -C3 22369d5c587517e7ff963c164b878f55 evidence_files.md5
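The stray character can be reproduced in isolation. This sketch creates a hypothetical file whose name contains a backslash, shows that GNU `md5sum` marks that line with a leading backslash, and then applies the same kind of `sed` fix. The directory and file names are made up for the demonstration.

```shell
#!/usr/bin/env bash
set -eu

WORK=$(mktemp -d)
cd "$WORK"
mkdir tree

# Hypothetical files: one with a literal backslash in its name, one ordinary.
printf 'demo\n' > 'tree/odd\name.txt'
printf 'demo\n' > 'tree/plain.txt'

find tree -type f -print0 | sort -z | xargs -0 md5sum > files.md5

# GNU md5sum starts the line with "\" when it had to escape the file name.
ESCAPED_BEFORE=$(grep -c '^\\' files.md5 || true)
echo "lines with a leading backslash before the fix: $ESCAPED_BEFORE"

# Strip the marker so every line starts with the 32-character hash again.
sed -i 's|^\\||' files.md5

ESCAPED_AFTER=$(grep -c '^\\' files.md5 || true)
echo "lines with a leading backslash after the fix: $ESCAPED_AFTER"
cat files.md5
```

Note that the file name itself stays escaped (the backslash is doubled) after the fix; only the marker in front of the hash is removed.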
- Create the hash databases. For additional information, read the man page for the `hfind` command and its `-i` option.
# Create the Hash Database
hfind -i md5sum reference_files.md5
hfind -i md5sum evidence_files.md5
ls -l
file *
- Create a list of just the MD5 Hashes that were not found in reference_files:
awk '{print $1}' evidence_files.md5 | hfind reference_files.md5 | grep \
"Hash Not Found" | awk '{print $1}' > new+changed_hashes_in_evidence.md5
- Create a list of just the MD5 hashes that were in reference_files.md5 but are no longer in evidence_files.md5:
awk '{print $1}' reference_files.md5 | hfind evidence_files.md5 | grep \
"Hash Not Found" | awk '{print $1}' > missing+changed_hashes_from_evidence.md5
- Display the number of entries in each file using `wc -l *.md5`
- Determine the file names of the missing+changed files plus the new+changed files
grep -f missing+changed_hashes_from_evidence.md5 reference_files.md5 > missing+changed_files+hashes_from_evidence.md5
grep -f new+changed_hashes_in_evidence.md5 evidence_files.md5 > new+changed_files+hashes_in_evidence.md5
# md5sum prints a 32-character hash plus two separator characters, so the name starts at column 35 (keeps names with spaces intact)
cut -c35- missing+changed_files+hashes_from_evidence.md5 | sed -e 's|^/mnt/reference||' | sort > missing+changed_files_from_evidence.txt
cut -c35- new+changed_files+hashes_in_evidence.md5 | sed -e 's|^/mnt/evidence||' | sort > new+changed_files_in_evidence.txt
- Identify which files are missing, changed, or new. If a file name exists in both `missing+changed_files_from_evidence.txt` and `new+changed_files_in_evidence.txt`, its MD5 hash has changed. If it exists only in `missing+changed_files_from_evidence.txt`, it was deleted. Conversely, if it exists only in `new+changed_files_in_evidence.txt`, it is a new file.
comm -12 missing+changed_files_from_evidence.txt new+changed_files_in_evidence.txt > CHANGED_FILES.txt
comm -13 missing+changed_files_from_evidence.txt new+changed_files_in_evidence.txt > NEW_FILES.txt
comm -23 missing+changed_files_from_evidence.txt new+changed_files_in_evidence.txt > DELETED_FILES.txt
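The three `comm` invocations above are easier to follow on a tiny input. In this sketch the file names inside the two lists are hypothetical examples, not the actual output of the demo.

```shell
#!/usr/bin/env bash
set -eu
export LC_ALL=C   # comm expects both inputs sorted in the same collation order

WORK=$(mktemp -d)
cd "$WORK"

# Hypothetical inputs shaped like the two sorted name lists built earlier.
printf '%s\n' /etc/crontab /home/ec2-user/important.txt /usr/bin/netstat \
    | sort > missing+changed_files_from_evidence.txt
printf '%s\n' /bin/hello /etc/crontab /usr/bin/netstat \
    | sort > new+changed_files_in_evidence.txt

# comm prints three columns: only in file 1, only in file 2, in both.
# Each digit after the dash suppresses that column.
comm -12 missing+changed_files_from_evidence.txt new+changed_files_in_evidence.txt > CHANGED_FILES.txt
comm -13 missing+changed_files_from_evidence.txt new+changed_files_in_evidence.txt > NEW_FILES.txt
comm -23 missing+changed_files_from_evidence.txt new+changed_files_in_evidence.txt > DELETED_FILES.txt

head CHANGED_FILES.txt NEW_FILES.txt DELETED_FILES.txt
```

With these inputs, `/etc/crontab` and `/usr/bin/netstat` land in `CHANGED_FILES.txt`, `/bin/hello` in `NEW_FILES.txt`, and `/home/ec2-user/important.txt` in `DELETED_FILES.txt`.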
- Count the entries in each of the three lists:
wc -l CHANGED_FILES.txt NEW_FILES.txt DELETED_FILES.txt
- Display the reduction ratio:
NUMERATOR=$(cat CHANGED_FILES.txt NEW_FILES.txt DELETED_FILES.txt | wc -l)
DENOMINATOR=$(wc -l < evidence_files.md5)
RATIO=$(bc <<< "scale=2; $NUMERATOR / $DENOMINATOR")
echo $NUMERATOR "/" $DENOMINATOR "=" $RATIO
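On a large image the ratio is often well below 0.01, and `bc` with `scale=2` truncates such a value to zero. A possible alternative, sketched here with made-up counts (3 flagged files out of 200 hashed files), is to report a percentage with `awk`:

```shell
#!/usr/bin/env bash
set -eu

WORK=$(mktemp -d)
cd "$WORK"

# Hypothetical stand-in data: 3 flagged files, 200 hashed files.
printf '/etc/crontab\n' > CHANGED_FILES.txt
printf '/bin/hello\n' > NEW_FILES.txt
printf '/home/ec2-user/important.txt\n' > DELETED_FILES.txt
seq 1 200 > evidence_files.md5

NUMERATOR=$(cat CHANGED_FILES.txt NEW_FILES.txt DELETED_FILES.txt | wc -l)
DENOMINATOR=$(wc -l < evidence_files.md5)

# Express the share of files that still need review as a percentage.
PERCENT=$(awk -v n="$NUMERATOR" -v d="$DENOMINATOR" \
    'BEGIN { printf "%.2f", 100 * n / d }')

echo "$NUMERATOR of $DENOMINATOR files need review (${PERCENT}%)"
```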
- Which file was deleted?
cat DELETED_FILES.txt
- Which files were changed?
cat CHANGED_FILES.txt
- Notice that `/etc/crontab` has changed. This may be where malware tries to maintain persistence. Examine it:
diff /mnt/reference/etc/crontab /mnt/evidence/etc/crontab
- What is `/bin/wipefs`?!?
strings /mnt/evidence/bin/wipefs
Yeah, definitely malicious
- Notice that `netstat` has a changed MD5 hash. Let's look at the file sizes:
ls -l /mnt/evidence/usr/bin/netstat /mnt/reference/usr/bin/netstat
Well that is certainly suspicious. Turns out that netstat was trojanized!
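The same size check can be applied to every entry in `CHANGED_FILES.txt` at once. This sketch uses toy directory trees in place of `/mnt/reference` and `/mnt/evidence`; the file contents and the report layout are assumptions for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Toy stand-ins for the two mounted volumes (hypothetical contents).
WORK=$(mktemp -d)
mkdir -p "$WORK/reference/usr/bin" "$WORK/evidence/usr/bin" \
         "$WORK/reference/etc" "$WORK/evidence/etc"
printf 'original' > "$WORK/reference/usr/bin/netstat"
printf 'original plus trojan payload' > "$WORK/evidence/usr/bin/netstat"
printf 'a\n' > "$WORK/reference/etc/crontab"
printf 'a\nb\n' > "$WORK/evidence/etc/crontab"
printf '%s\n' /etc/crontab /usr/bin/netstat > "$WORK/CHANGED_FILES.txt"

# For each changed file print: reference size, evidence size, byte delta, path.
while IFS= read -r path; do
    ref_size=$(wc -c < "$WORK/reference$path")
    ev_size=$(wc -c < "$WORK/evidence$path")
    printf '%8d %8d %+9d %s\n' \
        "$ref_size" "$ev_size" "$((ev_size - ref_size))" "$path"
done < "$WORK/CHANGED_FILES.txt" > "$WORK/size_report.txt"

cat "$WORK/size_report.txt"
```

A large delta on a system binary is a useful triage signal, although a trojanized file can also be padded to match the original size, so the hash remains the authoritative indicator.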
- Look at the first 10 files listed in `NEW_FILES.txt`
head NEW_FILES.txt
- What is the new `call2mins.sh` file doing in the `grub2` directory?
cat /mnt/evidence/boot/grub2/call2mins.sh
cat /mnt/evidence/bin/hello
And that is definitely suspicious!
- Looking again at `NEW_FILES.txt`, we can see that there is an AWS credentials file. Let's look at it:
cat /mnt/evidence/home/ec2-user/.aws/credentials
Eww, not good. Make sure these are revoked.
I hope that you have enjoyed this demonstration of Differential File System Analysis. Although this demo used AWS EC2, remember that the technique can be used with any cloud, or even on-prem, when incremental backups are made.