Fuzztruction - Prototype Of A Fuzzer That Does Not Directly Mutate Inputs (As Most Fuzzers Do) But Instead Uses A So-Called Generator Application To Produce An Input For Our Fuzzing Target
Fuzztruction is an academic prototype of a fuzzer that does not directly mutate inputs (as most fuzzers do) but instead uses a so-called generator application to produce an input for our fuzzing target. As programs generating data usually produce the correct representation, our fuzzer mutates the generator program (by injecting faults), such that the data produced is almost valid. Optimally, the produced data passes the parsing stages in our fuzzing target, called consumer, but triggers unexpected behavior in deeper program logic. This allows to even fuzz targets that utilize cryptography primitives such as encryption or message integrity codes. The main advantage of our approach is that it generates complex data without requiring heavyweight program analysis techniques, grammar approximations, or human intervention.
For instructions on how to reproduce the experiments from the paper, please read the fuzztruction-experiments
submodule documentation after reading this document.
Compatibility: While we try to make sure that our prototype is as platform independent as possible, we are not able to test it on all platforms. Thus, if you run into issues, please use Ubuntu 22.04.1, which was used during development as the host system.
Quickstart
# Clone the repository
git clone --recurse-submodules https://github.com/fuzztruction/fuzztruction.git
# Option 1: Get a pre-built version of our runtime environment.
# To ease reproduction of experiments in our paper, we recommend using our
# pre-built environment to avoid incompatibilities (~30 GB of data will be
# donwloaded)
# Do NOT use this if you don't want to reproduce our results but instead fuzz
# own targets (use the next command instead).
./env/pull-prebuilt.sh
# Option 2: Build the runtime environment for Fuzztruction from scratch.
# Do NOT run this if you executed pull-prebuilt.sh
./env/build.sh
# Spawn a container based on the image built/pulled before.
# To spawn a container using the prebuilt image (if pulled above),
# you need to set USE_PREBUILT to 1, e.g., `USE_PREBUILT=1 ./env/start.sh`
./env /start.sh
# Calling this script again will spawn a shell inside the container.
# (can be called multiple times to spawn multiple shells within the same
# container).
./env/start.sh
# Runninge start.sh the second time will automatically build the fuzzer.
# See `Fuzzing a Target using Fuzztruction` below for further instructions.
Components
Fuzztruction contains the following core components:
Scheduler
The scheduler orchestrates the interaction of the generator and the consumer. It governs the fuzzing campaign, and its main task is to organize the fuzzing loop. In addition, it also maintains a queue containing queue entries. Each entry consists of the seed input passed to the generator (if any) and all mutations applied to the generator. Each such queue entry represents a single test case. In traditional fuzzing, such a test case would be represented as a single file. The implementation of the scheduler is located in the scheduler
directory.
Generator
The generator can be considered a seed generator for producing inputs tailored to the fuzzing target, the consumer. While common fuzzing approaches mutate inputs on the fly through bit-level mutations, we mutate inputs indirectly by injecting faults into the generator program. More precisely, we identify and mutate data operations the generator uses to produce its output. To facilitate our approach, we require a program that generates outputs that match the input format the fuzzing target expects.
The implementation of the generator can be found in the generator
directory. It consists of two components that are explained in the following.
Compiler Pass
The compiler pass (generator/pass
) instruments the target using so-called patch points. Since the current (tested on LLVM12 and below) implementation of this feature is unstable, we patch LLVM to enable them for our approach. The patches can be found in the llvm
repository (included here as submodule). Please note that the patches are experimental and not intended for use in production.
The locations of the patch points are recorded in a separate section inside the compiled binary. The code related to parsing this section can be found at lib/llvm-stackmap-rs
, which we also published on crates.io.
During fuzzing, the scheduler chooses a target from the set of patch points and passes its decision down to the agent (described below) responsible for applying the desired mutation for the given patch point.
Agent
The agent, implemented in generator/agent
is running in the context of the generator application that was compiled with the custom compiler pass. Its main tasks are the implementation of a forkserver and communicating with the scheduler. Based on the instruction passed from the scheduler via shared memory and a message queue, the agent uses a JIT engine to mutate the generator.
Consumer
The generator's counterpart is the consumer: It is the target we are fuzzing that consumes the inputs generated by the generator. For Fuzztruction, it is sufficient to compile the consumer application with AFL++'s compiler pass, which we use to record the coverage feedback. This feedback guides our mutations of the generator.
Preparing the Runtime Environment (Docker Image)
Before using Fuzztruction, the runtime environment that comes as a Docker image is required. This image can be obtained by building it yourself locally or pulling a pre-built version. Both ways are described in the following. Before preparing the runtime environment, this repository, and all sub repositories, must be cloned:
git clone --recurse-submodules https://github.com/fuzztruction/fuzztruction.git
Local Build
The Fuzztruction runtime environment can be built by executing env/build.sh
. This builds a Docker image containing a complete runtime environment for Fuzztruction locally. By default, a pre-built version of our patched LLVM version is used and pulled from Docker Hub. If you want to use a locally built LLVM version, check the llvm
directory.
Pre-built
In most cases, there is no particular reason for using the pre-built environment -- except if you want to reproduce the exact experiments conducted in the paper. The pre-built image provides everything, including the pre-built evaluation targets and all dependencies. The image can be retrieved by executing env/pull-prebuilt.sh
.
The following section documents how to spawn a runtime environment based on either a locally built image or the prebuilt one. Details regarding the reproduction of the paper's experiments can be found in the fuzztruction-experiments
submodule.
Managing the Runtime Environment Lifecycle
After building or pulling a pre-built version of the runtime environment, the fuzzer is ready to use. The fuzzers environment lifecycle is managed by a set of scripts located in the env
folder.
Script | Description |
---|---|
./env/start.sh | Spawn a new container or spawn a shell into an already running container. Prebuilt: Exporting USE_PREBUILT=1 spawns a container based on a pre-built environment. For switching from pre-build to local build or the other way around, stop.sh must be executed first. |
./env/stop.sh | This stops the container. Remember to call this after rebuilding the image. |
Using start.sh
, an arbitrary number of shells can be spawned in the container. Using Visual Studio Codes' Containers extension allows you to work conveniently inside the Docker container.
Several files/folders are mounted from the host into the container to facilitate data exchange. Details regarding the runtime environment are provided in the next section.
Runtime Environment Details
This section details the runtime environment (Docker container) provided alongside Fuzztruction. The user in the container is named user
and has passwordless sudo
access per default.
Permissions: The Docker images' user is named
user
and has the same User ID (UID) as the user who initially built the image. Thus, mounts from the host can be accessed inside the container. However, in the case of using the pre-built image, this might not be the case since the image was built on another machine. This must be considered when exchanging data with the host.
Inside the container, the following paths are (bind) mounted from the host:
Container Path | Host Path | Note |
---|---|---|
/home/user/fuzztruction | ./ | Pre-built: This folder is part of the image in case the pre-built image is used. Thus, changes are not reflected to the host. |
/home/user/shared | ./ | Used to exchange data with the host. |
/home/user/.zshrc | ./data/zshrc | - |
/home/user/.zsh_history | ./data/zsh_history | - |
/home/user/.bash_history | ./data/bash_history | - |
/home/user/.config/nvim/init.vim | ./data/init.vim | - |
/home/user/.config/Code | ./data/vscode-data | Used to persist Visual Studio Code config between container restarts. |
/ssh-agent | $SSH_AUTH_SOCK | Allows using the SSH-Agent inside the container if it runs on the host. |
/home/user/.gitconfig | /home/$USER/.gitconfig | Use gitconfig from the host, if there is any config. |
/ccache | ./data/ccache | Used to persist ccache cache between container restarts. |
Usage
After building the Docker runtime environment and spawning a container, the Fuzztruction binary itself must be built. After spawning a shell inside the container using ./env/start.sh
, the build process is triggered automatically. Thus, the steps in the next section are primarily for those who want to rebuild Fuzztruction after applying modifications to the code.
Building Fuzztruction
For building Fuzztruction, it is sufficient to call cargo build
in /home/user/fuzztruction
. This will build all components described in the Components section. The most interesting build artifacts are the following:
Artifacts | Description |
---|---|
./generator/pass/fuzztruction-source-llvm-pass.so | The LLVM pass is used to insert the patch points into the generator application. Note: The location of the pass is recorded in /etc/ld.so.conf.d/fuzztruction.conf ; thus, compilers are able to find the pass during compilation. If you run into trouble because the pass is not found, please run sudo ldconfig and retry using a freshly spawned shell. |
./generator/pass/fuzztruction-source-clang-fast | A compiler wrapper for compiling the generator application. This wrapper uses our custom compiler pass, links the targets against the agent, and injects a call to the agents' init method into the generator's main. |
./target/debug/libgenerator_agent.so | The agent the is injected into the generator application. |
./target/debug/fuzztruction | The fuzztruction binary representing the actual fuzzer. |
Fuzzing a Target using Fuzztruction
We will use libpng
as an example to showcase Fuzztruction's capabilities. Since libpng
is relatively small and has no external dependencies, it is not required to use the pre-built image for the following steps. However, especially on mobile CPUs, the building process may take up to several hours for building the AFL++ binary because of the collision free coverage map encoding feature and compare splitting.
Building the Target
Pre-built: If the pre-built version is used, building is unnecessary and this step can be skipped.
Switch into the fuzztruction-experiments/comparison-with-state-of-the-art/binaries/
directory and execute ./build.sh libpng
. This will pull the source and start the build according to the steps defined in libpng/config.sh
.
Benchmarking the Target
Using the following command
sudo ./target/debug/fuzztruction fuzztruction-experiments/comparison-with-state-of-the-art/configurations/pngtopng_pngtopng/pngtopng-pngtopng.yml --purge --show-output benchmark -i 100
allows testing whether the target works. Each target is defined using a YAML
configuration file. The files are located in the configurations
directory and are a good starting point for building your own config. The pngtopng-pngtopng.yml
file is extensively documented.
Troubleshooting
If the fuzzer terminates with an error, there are multiple ways to assist your debugging efforts.
- Passing
--show-output
tofuzztruction
allows you to observe stdout/stderr of the generator and the consumer if they are not used for passing or reading data from each other. - Setting AFL_DEBUG in the
env
section of thesink
in theYAML
config can give you a more detailed output regarding the consumer. - Executing the generator and consumer using the same flags as in the config file might reveal any typo in the command line used to execute the application. In the case of using
LD_PRELOAD
, double check the provided paths.
Running the Fuzzer
To start the fuzzing process, executing the following command is sufficient:
sudo ./target/debug/fuzztruction ./fuzztruction-experiments/comparison-with-state-of-the-art/configurations/pngtopng_pngtopng/pngtopng-pngtopng.yml fuzz -j 10 -t 10m
This will start a fuzzing run on 10 cores, with a timeout of 10 minutes. Output produced by the fuzzer is stored in the directory defined by the work-directory
attribute in the target's config file. In case of pngtopng
, the default location is /tmp/pngtopng-pngtopng
.
If the working directory already exists, --purge
must be passed as an argument to fuzztruction
to allow it to rerun. The flag must be passed before the subcommand, i.e., before fuzz
or benchmark
.
Combining Fuzztruction and AFL++
For running AFL++ alongside Fuzztruction, the aflpp
subcommand can be used to spawn AFL++ workers that are reseeded during runtime with inputs found by Fuzztruction. Assuming that Fuzztruction was executed using the command above, it is sufficient to execute
sudo ./target/debug/fuzztruction ./fuzztruction-experiments/comparison-with-state-of-the-art/configurations/pngtopng_pngtopng/pngtopng-pngtopng.yml aflpp -j 10 -t 10m
for spawning 10 AFL++ processes that are terminated after 10 minutes. Inputs found by Fuzztruction and AFL++ are periodically synced into the interesting
folder in the working directory. In case AFL++ should be executed independently but based on the same .yml
configuration file, the --suffix
argument can be used to append a suffix to the working directory of the spawned fuzzer.
Computing Coverage
After the fuzzing run is terminated, the tracer
subcommand allows to retrieve a list of covered basic blocks for all interesting inputs found during fuzzing. These traces are stored in the traces
subdirectory located in the working directory. Each trace contains a zlib compressed JSON object of the addresses of all basic blocks (in execution order) exercised during execution. Furthermore, metadata to map the addresses to the actual ELF file they are located in is provided.
The coverage
tool located at ./target/debug/coverage
can be used to process the collected data further. You need to pass it the top-level directory containing working directories created by Fuzztruction (e.g., /tmp
in case of the previous example). Executing ./target/debug/coverage /tmp
will generate a .csv
file that maps time to the number of covered basic blocks and a .json
file that maps timestamps to sets of found basic block addresses. Both files are located in the working directory of the specific fuzzing run.