The LibAFL Fuzzing Library
by Andrea Fioraldi and Dominik Maier
Welcome to LibAFL, the Advanced Fuzzing Library. This book shall be a gentle introduction to the library.
This version of the LibAFL book is coupled with the release 1.0 beta of the library.
This document is still a work in progress and incomplete. The structure and the concepts explained here are subject to change in future revisions, as the structure of LibAFL itself will evolve.
The HTML version of this book is available online at https://aflplus.plus/libafl-book/ and offline from the LibAFL repository in the docs/ folder.
Build it using mdbook build in this folder, or run mdbook serve to view the book.
Introduction
Fuzzers are important tools for security researchers and developers alike. A wide range of state-of-the-art tools like AFL++, libFuzzer or honggfuzz are available to users. They do their job in a very effective way, finding thousands of bugs.
From the perspective of a power user, however, these tools are limited. Their designs do not treat extensibility as a first-class citizen. Usually, a fuzzer developer can choose to either fork one of these existing tools, or to create a new fuzzer from scratch. In any case, researchers end up with tons of fuzzers, all of which are incompatible with each other. Their outstanding features cannot just be combined for new projects. By reinventing the wheel over and over, we may completely miss out on features that are complex to reimplement.
To tackle this issue, we created LibAFL, a library that is not just another fuzzer, but a collection of reusable pieces for individual fuzzers. LibAFL, written in Rust, helps you develop a fuzzer tailored for your specific needs. Be it a specific target, a particular instrumentation backend, or a custom mutator, you can leverage existing bits and pieces to craft the fastest and most efficient fuzzer you can envision.
Why LibAFL?
LibAFL gives you many of the benefits of an off-the-shelf fuzzer, while being completely customizable. Some highlight features currently include:
- multi platform: LibAFL works pretty much anywhere you can find a Rust compiler for. We already used it on Windows, Android, macOS, and Linux, on x86_64, aarch64, ...
- portable: LibAFL can be built in no_std mode. This means it does not require a specific OS-dependent runtime to function. Define an allocator and a way to map pages, and you are good to inject LibAFL in obscure targets like embedded devices, hypervisors, or maybe even WebAssembly?
- adaptable: Given years of experience fine-tuning AFLplusplus and our academic fuzzing background, we could incorporate recent fuzzing trends into LibAFL's design and make it future-proof. To give an example, as opposed to old-school fuzzers, a BytesInput is just one of the potential forms of inputs: feel free to use and mutate an Abstract Syntax Tree instead, for structured fuzzing.
- scalable: As part of LibAFL, we developed Low Level Message Passing, LLMP for short, which allows LibAFL to scale almost linearly over cores. That is, if you choose to use this feature - it is your fuzzer, after all. Scaling to multiple machines over TCP is also possible, using LLMP's broker2broker feature.
- fast: We do everything we can at compile time so that the runtime overhead is as minimal as it can get.
- bring your own target: We support binary-only modes, like (full-system) QEMU-Mode and Frida-Mode with ASan and CmpLog, as well as multiple compilation passes for source-based instrumentation. Of course, we also support custom instrumentation, as you can see in the Python example based on Google's Atheris.
- usable: This one is on you to decide. Dig right in!
Getting Started
To get started with LibAFL, there are some initial steps to take.
In this chapter, we discuss how to download and build LibAFL, using Rust's cargo command.
We also describe the structure of LibAFL's components, so-called crates, and the purpose of each individual crate.
Setup
The first step is to download LibAFL and all dependencies that are not automatically installed with cargo.
Command Line Notation
In this chapter and throughout the book, we show some commands used in the terminal. Lines that you should enter in a terminal all start with $. You don’t need to type in the $ character; it indicates the start of each command. Lines that don’t start with $ typically show the output of the previous command. Additionally, PowerShell-specific examples will use > rather than $.
Technically, you do not need to install LibAFL at all: you can use the version from crates.io directly. We still recommend downloading or cloning the GitHub version.
This gets you the example fuzzers, additional utilities, and latest patches.
The easiest way to do this is to use git.
$ git clone https://github.com/AFLplusplus/LibAFL.git
Alternatively, on a UNIX-like machine, you can download a compressed archive and extract it with:
$ wget https://github.com/AFLplusplus/LibAFL/archive/main.tar.gz
$ tar xvf main.tar.gz
$ rm main.tar.gz
$ ls LibAFL-main # this is the extracted folder
Clang installation
One of the external dependencies of LibAFL is the Clang C/C++ compiler. While most of the code is written in pure Rust, we still need a C compiler because stable Rust still does not support features that some parts of LibAFL may need, such as weak linking, and LLVM builtins linking. For these parts, we use C to expose the missing functionalities to our Rust codebase.
In addition, if you want to perform source-level fuzz testing of C/C++ applications, you will likely need Clang with its instrumentation options to compile the programs under test.
On Linux you could use your distribution's package manager to get Clang, but these packages are not always up-to-date. Instead, we suggest using the Debian/Ubuntu prebuilt packages from LLVM that are available using their official repository.
For Microsoft Windows, you can download the installer package that LLVM generates periodically.
Despite Clang being the default C compiler on macOS, we discourage the use of the build shipped by Apple and encourage the installation from Homebrew, using brew install llvm.
Alternatively, you can download and build the LLVM source tree - Clang included - following the steps explained here.
Rust installation
If you do not have Rust installed, you can easily follow the steps described here to install it on any supported system.
Be aware that Rust versions shipped with Linux distributions may be outdated. LibAFL always targets the latest stable version, available via rustup upgrade.
We suggest installing Clang and LLVM first.
Building LibAFL
LibAFL, like most Rust projects, can be built using cargo from the root directory of the project with:
$ cargo build --release
Note that the --release flag is optional for development, but you need to add it to do fuzzing at a decent speed.
Slowdowns of 10x or more are not uncommon for Debug builds.
The LibAFL repository is composed of multiple crates.
The top-level Cargo.toml is the workspace file grouping these crates.
Calling cargo build from the root directory will compile all crates in the workspace.
Build Example Fuzzers
The best starting point for experienced rustaceans is to read through, and adapt, the example fuzzers.
We group these fuzzers in the ./fuzzers directory of the LibAFL repository.
The directory contains a set of crates that are not part of the workspace.
Each of these example fuzzers uses particular features of LibAFL, sometimes combined with different instrumentation backends (e.g. SanitizerCoverage, Frida, ...).
You can use these crates as examples and as skeletons for custom fuzzers with similar feature sets.
Each fuzzer will have a README.md file in its directory, describing the fuzzer and its features.
To build an example fuzzer, you have to invoke cargo build --release from its respective folder (fuzzers/[FUZZER_NAME]).
Crates
LibAFL is composed of different crates.
A crate is an individual library in Rust's Cargo build system, that you can use by adding it to your project's Cargo.toml, like:
[dependencies]
libafl = { version = "*" }
Crate List
For LibAFL, each crate has its self-contained purpose, and the user may not need to use all of them in their project. Following the naming convention of the folders in the project's root, they are:
libafl
This is the main crate that contains all the components needed to build a fuzzer.
This crate has a number of feature flags that enable and disable certain aspects of LibAFL.
The features can be found in LibAFL's Cargo.toml under "[features]", and are usually explained with comments there.
Some features worthy of remark are:
- std enables the parts of the code that use the Rust standard library. Without this flag, LibAFL is no_std compatible. This disables a range of features, but allows us to use LibAFL in embedded environments; read the no_std section for further details.
- derive enables the usage of the derive(...) macros defined in libafl_derive from libafl.
- rand_trait allows you to use LibAFL's very fast (but insecure!) random number generator wherever compatibility with Rust's rand crate is needed.
- llmp_bind_public makes LibAFL's LLMP bind to a public TCP port, over which other fuzzer nodes can communicate with this instance.
- introspection adds performance statistics to LibAFL.
You can choose the features by using features = ["feature1", "feature2", ...] for LibAFL in your Cargo.toml.
Out of this list, by default, std, derive, and rand_trait are already set.
You can choose to disable them by setting default-features = false in your Cargo.toml.
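As a sketch of how these options combine, the fragment below opts out of the defaults and then re-enables std together with introspection. The version wildcard mirrors the earlier example, and the chosen feature set is purely illustrative: which features a given fuzzer needs, and whether this exact combination suits your LibAFL version, is an assumption to verify against LibAFL's Cargo.toml.

```toml
[dependencies]
# Disable the default feature set (std, derive, rand_trait),
# then opt back in to only what this hypothetical fuzzer uses.
libafl = { version = "*", default-features = false, features = ["std", "introspection"] }
```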
libafl_bolts
The libafl_bolts crate is a minimal tool shed filled with useful low-level Rust features, not necessarily related to fuzzers.
In it, you'll find highlights like:
- core_affinity to bind the current process to cores
- SerdeAnyMap, a map that can store typed values in a serializable fashion
- minibsod to dump the current process state
- LLMP, "low level message passing", a lock-free IPC mechanism
- Rand, different fast (non-cryptographically secure) RNG implementations like RomuRand
- ShMem, a platform independent shared memory implementation
- Tuples, a compile-time tuple implementation
... and much more.
libafl_sugar
The sugar crate abstracts away most of the complexity of LibAFL's API.
Instead of high flexibility, it aims to be high-level and easy-to-use.
It is not as flexible as stitching your fuzzer together from each individual component, but allows you to build a fuzzer with minimal lines of code.
To see it in action, take a look at the libfuzzer_stb_image_sugar example fuzzer.
libafl_derive
This is a proc-macro crate paired with the libafl crate.
At the moment, it just exposes the derive(SerdeAny) macro that can be used to define Metadata structs; see the section about Metadata for details.
libafl_targets
This crate exposes code to interact with, and to instrument, targets. Individual parts are enabled and disabled at compile time using feature flags.
Currently, the supported flags are:
- pcguard_edges defines the SanitizerCoverage trace-pc-guard hooks to track the executed edges in a map.
- pcguard_hitcounts defines the SanitizerCoverage trace-pc-guard hooks to track the executed edges with the hitcounts (like AFL) in a map.
- libfuzzer exposes a compatibility layer with libFuzzer style harnesses.
- value_profile defines the SanitizerCoverage trace-cmp hooks to track the matching bits of each comparison in a map.
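The difference between tracking edges and tracking hitcounts can be illustrated without any instrumentation. The sketch below is a toy model, not the code in libafl_targets: the map size and both helper functions (record_edge, record_hitcount) are hypothetical names chosen for this illustration.

```rust
/// Size of the toy coverage map (an arbitrary choice for this sketch).
const MAP_SIZE: usize = 8;

/// Edge mode: only remember *whether* an edge was executed.
fn record_edge(map: &mut [u8; MAP_SIZE], edge_id: usize) {
    map[edge_id % MAP_SIZE] = 1;
}

/// Hitcount mode: remember *how often* an edge was executed.
/// The counter wraps around on overflow instead of panicking.
fn record_hitcount(map: &mut [u8; MAP_SIZE], edge_id: usize) {
    let slot = &mut map[edge_id % MAP_SIZE];
    *slot = slot.wrapping_add(1);
}

fn main() {
    let mut edges = [0u8; MAP_SIZE];
    let mut hitcounts = [0u8; MAP_SIZE];

    // Simulate a run that takes edge 3 five times and edge 1 once.
    for edge_id in [3, 3, 3, 1, 3, 3] {
        record_edge(&mut edges, edge_id);
        record_hitcount(&mut hitcounts, edge_id);
    }

    println!("edges:     {:?}", edges);
    println!("hitcounts: {:?}", hitcounts);
}
```

In edge mode, a loop that runs once and a loop that runs five times look identical, while the hitcount map tells them apart. That extra signal is what makes hitcounts useful for inputs that change how often a path is taken.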
libafl_cc
This is a library that provides utils to wrap compilers and create source-level fuzzers.
At the moment, only the Clang compiler is supported. To understand it deeper, look through the tutorials and examples.
libafl_frida
This library bridges LibAFL with Frida as instrumentation backend. With this crate, you can instrument targets on Linux/macOS/Windows/Android for coverage collection. Additionally, it supports CmpLog, and AddressSanitizer instrumentation and runtimes for aarch64. See further information, as well as usage instructions, later in the book.
libafl_qemu
This library bridges LibAFL with QEMU user-mode to fuzz ELF cross-platform binaries.
It works on Linux and can collect edge coverage without collisions! It also supports a wide range of hooks and instrumentation options.
libafl_nyx
Nyx is a KVM-based snapshot fuzzer. libafl_nyx adds these capabilities to LibAFL. There is a specific section explaining usage of libafl_nyx later in the book.
libafl_concolic
Concolic fuzzing is the combination of fuzzing and a symbolic execution engine. This can reach greater depth than normal fuzzing, and is exposed in this crate. There is a specific section explaining usage of libafl_concolic later in the book.
A Simple LibAFL Fuzzer
This chapter discusses a naive fuzzer using the LibAFL API.
You will learn about basic entities such as State, Observer, and Executor.
While the following chapters discuss the components of LibAFL in detail, here we introduce the fundamentals.
We are going to fuzz a simple Rust function that panics under a condition. The fuzzer will be single-threaded and will stop after the crash, just like libFuzzer normally does.
You can find a complete version of this tutorial as an example fuzzer in fuzzers/baby_fuzzer.
Warning
This example fuzzer is too naive for any real-world usage. Its purpose is solely to show the main components of the library. For a more in-depth walkthrough on building a custom fuzzer, go to the Tutorial chapter directly.
Creating a project
We use cargo to create a new Rust project with LibAFL as a dependency.
$ cargo new baby_fuzzer
$ cd baby_fuzzer
The generated Cargo.toml looks like the following:
[package]
name = "baby_fuzzer"
version = "0.1.0"
authors = ["Your Name <you@example.com>"]
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
In order to use LibAFL, we must add it as a dependency by adding libafl = { path = "path/to/libafl/" } under [dependencies].
That path needs to point to the libafl directory within the cloned repo, not the root of the repo itself.
The listings in this chapter also import libafl_bolts, so add it the same way, pointing at the libafl_bolts directory.
If you prefer the LibAFL version from crates.io, use libafl = "*" to get the latest version (or set it to the current version).
As we are going to fuzz Rust code, we do not want a panic to simply make the program exit; it should raise an abort that can then be caught by the fuzzer.
To do that, we specify panic = "abort" in the profiles.
Alongside this setting, we add some optimization flags for the compilation, when building in release mode.
The final Cargo.toml should look similar to the following:
[package]
name = "baby_fuzzer"
version = "0.1.0"
authors = ["Your Name <you@example.com>"]
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
libafl = { path = "path/to/libafl/" }
libafl_bolts = { path = "path/to/libafl_bolts/" }
[profile.dev]
panic = "abort"
[profile.release]
panic = "abort"
lto = true
codegen-units = 1
opt-level = 3
debug = true
The function under test
Opening src/main.rs, we have an empty main function.
To start, we create the closure that we want to fuzz. It takes a buffer as input and panics if it starts with "abc".
ExitKind is used to inform the fuzzer about the harness' exit status.
extern crate libafl;
extern crate libafl_bolts;

use libafl::{
    executors::ExitKind,
    inputs::{BytesInput, HasTargetBytes},
};
use libafl_bolts::AsSlice;

fn main() {
    let mut harness = |input: &BytesInput| {
        let target = input.target_bytes();
        let buf = target.as_slice();
        if buf.len() > 0 && buf[0] == 'a' as u8 {
            if buf.len() > 1 && buf[1] == 'b' as u8 {
                if buf.len() > 2 && buf[2] == 'c' as u8 {
                    panic!("=)");
                }
            }
        }
        ExitKind::Ok
    };
    // To test the panic:
    let input = BytesInput::new(Vec::from("abc"));
    #[cfg(feature = "panic")]
    harness(&input);
}
To test the crash manually, you can add a feature in Cargo.toml that enables the call that triggers the panic:
[features]
panic = []
And then run the program with that feature activated:
$ cargo run -F panic
And you should see the program crash as expected.
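The harness logic can also be exercised on plain byte slices, without LibAFL at all. The following is a minimal sketch under stated assumptions: the ExitKind enum is a local stand-in with a single variant, not LibAFL's type, and the program relies on Rust's default unwinding panic strategy, so it should be run outside a project whose profile sets panic = "abort".

```rust
use std::panic;

/// Hypothetical stand-in for LibAFL's `ExitKind`; only the variant used here.
#[derive(Debug, PartialEq)]
enum ExitKind {
    Ok,
}

/// The same nested checks as the closure above, on a plain byte slice.
fn harness(buf: &[u8]) -> ExitKind {
    if buf.len() > 0 && buf[0] == b'a' {
        if buf.len() > 1 && buf[1] == b'b' {
            if buf.len() > 2 && buf[2] == b'c' {
                panic!("=)");
            }
        }
    }
    ExitKind::Ok
}

fn main() {
    // Inputs that fail any of the three checks return normally.
    assert_eq!(harness(b""), ExitKind::Ok);
    assert_eq!(harness(b"ab"), ExitKind::Ok);
    assert_eq!(harness(b"abd"), ExitKind::Ok);

    // Only inputs starting with "abc" panic. `catch_unwind` can observe the
    // panic here because this sketch uses the default unwinding strategy.
    let result = panic::catch_unwind(|| harness(b"abcdef"));
    assert!(result.is_err());
    println!("harness behaves as described");
}
```

The three nested conditions matter later in the chapter: each one is a step that a feedback-driven fuzzer can discover separately, instead of having to guess all three bytes at once.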
Generating and running some tests
One of the main components that a LibAFL-based fuzzer uses is the State, a container of the data that will evolve during the fuzzing process.
It includes all state, such as the Corpus of inputs, the current RNG state, and potential Metadata for the testcases and run.
In our main we create a basic State instance like the following:
    // create a State from scratch
    let mut state = StdState::new(
        // RNG
        StdRand::with_seed(current_nanos()),
        // Corpus that will be evolved, we keep it in memory for performance
        InMemoryCorpus::new(),
        // Corpus in which we store solutions (crashes in this example),
        // on disk so the user can get them after stopping the fuzzer
        OnDiskCorpus::new(PathBuf::from("./crashes")).unwrap(),
        &mut (),
        &mut (),
    )
    .unwrap();
- The first parameter is a random number generator that is part of the fuzzer state. In this case, we use the default one, StdRand, but you can choose a different one. We seed it with the current nanoseconds.
- The second parameter is an instance of something implementing the Corpus trait, InMemoryCorpus in this case. The corpus is the container of the testcases evolved by the fuzzer; in this case, we keep it all in memory. To avoid a type annotation error, you can use InMemoryCorpus::<BytesInput>::new() instead of InMemoryCorpus::new(). Otherwise, the type will be inferred automatically once the executor is added.
- The third parameter is another Corpus that stores the "solution" testcases for the fuzzer. For our purpose, the solution is the input that triggers the panic. In this case, we want to store it to disk under the crashes directory, so we can inspect it.
- The last two parameters are feedback and objective; we will discuss them later.
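To illustrate why the RNG lives in the state and what seeding it with the current nanoseconds means, here is a small stand-alone sketch. It is not LibAFL's StdRand: ToyRand is a hypothetical xorshift64 generator and current_nanos_sketch is an assumed helper name, both introduced only for this example.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Toy xorshift64 generator; NOT LibAFL's `StdRand`, just an illustration.
struct ToyRand {
    state: u64,
}

impl ToyRand {
    /// Xorshift must not start from an all-zero state, so map 0 to a fixed value.
    fn with_seed(seed: u64) -> Self {
        Self {
            state: if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed },
        }
    }

    /// Advance the state and return the next pseudo-random value.
    fn next(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

/// Nanoseconds since the Unix epoch, truncated to 64 bits (seed material only).
fn current_nanos_sketch() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos() as u64)
        .unwrap_or(1)
}

fn main() {
    // A time-based seed gives a different sequence on each run ...
    let mut rng = ToyRand::with_seed(current_nanos_sketch());
    println!("time-seeded: {}", rng.next());

    // ... while a fixed seed reproduces the same sequence, which helps
    // when a fuzzing run has to be replayed.
    let mut a = ToyRand::with_seed(42);
    let mut b = ToyRand::with_seed(42);
    assert_eq!(a.next(), b.next());
}
```

Because the generator's state changes with every draw, keeping it inside the fuzzer state means that everything that evolves during a run sits in one place.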
Another required component is the EventManager. It handles some events such as the addition of a testcase to the corpus during the fuzzing process. For our purpose, we use the simplest one that just displays the information about these events to the user using a Monitor instance.
    // The Monitor trait defines how the fuzzer stats are displayed to the user
    let mon = SimpleMonitor::new(|s| println!("{s}"));

    // The event manager handles the various events generated during the fuzzing loop
    // such as the notification of the addition of a new item to the corpus
    let mut mgr = SimpleEventManager::new(mon);
In addition, we have the Fuzzer, an entity that contains some actions that alter the State. One of these actions is the scheduling of the testcases to the fuzzer using a Scheduler.
We create it as QueueScheduler, a scheduler that serves testcases to the fuzzer in a FIFO fashion.
    // A queue policy to get testcases from the corpus
    let scheduler = QueueScheduler::new();

    // A fuzzer with feedbacks and a corpus scheduler
    let mut fuzzer = StdFuzzer::new(scheduler, (), ());
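The FIFO policy itself is easy to picture in isolation. The sketch below is a toy model rather than LibAFL's QueueScheduler: ToyQueue is a hypothetical type, and the wrap-around to the first testcase after the last one is an assumption of this sketch.

```rust
/// Toy queue over corpus indices; a sketch, not LibAFL's `QueueScheduler`.
struct ToyQueue {
    next_idx: usize,
}

impl ToyQueue {
    fn new() -> Self {
        Self { next_idx: 0 }
    }

    /// Return the index of the next testcase, or `None` for an empty corpus.
    /// After the last entry the queue wraps around to the first one.
    fn next(&mut self, corpus_len: usize) -> Option<usize> {
        if corpus_len == 0 {
            return None;
        }
        let idx = self.next_idx % corpus_len;
        self.next_idx = idx + 1;
        Some(idx)
    }
}

fn main() {
    let corpus = vec!["seed-a", "seed-b", "seed-c"];
    let mut queue = ToyQueue::new();

    // Testcases come out in insertion order, then the queue starts over.
    for _ in 0..4 {
        let idx = queue.next(corpus.len()).expect("corpus is not empty");
        println!("fuzzing {}", corpus[idx]);
    }
}
```

A queue like this treats every testcase equally. More elaborate schedulers would instead weight entries, for example by how promising they appear.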
Last but not least, we need an Executor, the entity responsible for running our program under test. In this example, we want to run the harness function in-process (without forking off a child, for example), and so we use the InProcessExecutor.
    // Create the executor for an in-process function
    let mut executor = InProcessExecutor::new(&mut harness, (), &mut fuzzer, &mut state, &mut mgr)
        .expect("Failed to create the Executor");
It takes a reference to the harness, the fuzzer, the state, and the event manager. We will discuss the second parameter later.
As the executor expects the harness to return an ExitKind object, we added ExitKind::Ok to our harness function earlier.
Now we have the 4 major entities ready for running our tests, but we still cannot generate testcases.
For this purpose, we use a Generator, RandPrintablesGenerator, that generates a string of printable bytes.
    // Generator of printable bytearrays of max size 32
    let mut generator = RandPrintablesGenerator::new(32);

    // Generate 8 initial inputs
    state
        .generate_initial_inputs(&mut fuzzer, &mut executor, &mut generator, &mut mgr, 8)
        .expect("Failed to generate the initial corpus");
Now you can prepend the necessary use directives to your main.rs and compile the fuzzer.
extern crate libafl;
extern crate libafl_bolts;

use libafl::{
    corpus::{InMemoryCorpus, OnDiskCorpus},
    events::SimpleEventManager,
    executors::{inprocess::InProcessExecutor, ExitKind},
    fuzzer::StdFuzzer,
    generators::RandPrintablesGenerator,
    inputs::{BytesInput, HasTargetBytes},
    monitors::SimpleMonitor,
    schedulers::QueueScheduler,
    state::StdState,
};
use libafl_bolts::{current_nanos, rands::StdRand, AsSlice};
use std::path::PathBuf;

fn main() {
    let mut harness = |input: &BytesInput| {
        let target = input.target_bytes();
        let buf = target.as_slice();
        if buf.len() > 0 && buf[0] == 'a' as u8 {
            if buf.len() > 1 && buf[1] == 'b' as u8 {
                if buf.len() > 2 && buf[2] == 'c' as u8 {
                    panic!("=)");
                }
            }
        }
        ExitKind::Ok
    };
    // To test the panic:
    let input = BytesInput::new(Vec::from("abc"));
    #[cfg(feature = "panic")]
    harness(&input);

    // create a State from scratch
    let mut state = StdState::new(
        // RNG
        StdRand::with_seed(current_nanos()),
        // Corpus that will be evolved, we keep it in memory for performance
        InMemoryCorpus::new(),
        // Corpus in which we store solutions (crashes in this example),
        // on disk so the user can get them after stopping the fuzzer
        OnDiskCorpus::new(PathBuf::from("./crashes")).unwrap(),
        &mut (),
        &mut (),
    )
    .unwrap();

    // The Monitor trait defines how the fuzzer stats are displayed to the user
    let mon = SimpleMonitor::new(|s| println!("{s}"));

    // The event manager handles the various events generated during the fuzzing loop
    // such as the notification of the addition of a new item to the corpus
    let mut mgr = SimpleEventManager::new(mon);

    // A queue policy to get testcases from the corpus
    let scheduler = QueueScheduler::new();

    // A fuzzer with feedbacks and a corpus scheduler
    let mut fuzzer = StdFuzzer::new(scheduler, (), ());

    // Create the executor for an in-process function
    let mut executor = InProcessExecutor::new(&mut harness, (), &mut fuzzer, &mut state, &mut mgr)
        .expect("Failed to create the Executor");

    // Generator of printable bytearrays of max size 32
    let mut generator = RandPrintablesGenerator::new(32);

    // Generate 8 initial inputs
    state
        .generate_initial_inputs(&mut fuzzer, &mut executor, &mut generator, &mut mgr, 8)
        .expect("Failed to generate the initial corpus");
}
When running, you should see something similar to:
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.04s
Running `target/debug/baby_fuzzer`
[LOG Debug]: Loaded 0 over 8 initial testcases
Evolving the corpus with feedbacks
You have now run 8 randomly generated testcases, but none of them has been stored in the corpus. If you are very lucky, you may have triggered the panic by chance, but you will not see any saved file in crashes.
Now we want to turn our simple fuzzer into a feedback-based one and increase the chance of generating the right input to trigger the panic. We are going to implement a simple feedback based on the 3 conditions that are needed to reach the panic. To do that, we need a way to keep track of whether a condition is satisfied.
An Observer records information about properties of a fuzzing run and feeds it to the fuzzer. We use the StdMapObserver, the default observer that uses a map to keep track of covered elements. In our fuzzer, each condition is mapped to an entry of this map.
We represent this map as a static mut variable.
As we don't rely on any instrumentation engine, we have to track the satisfied conditions manually by calling signals_set in our harness:
// Coverage map with explicit assignments due to the lack of instrumentation
static mut SIGNALS: [u8; 16] = [0; 16];
fn signals_set(idx: usize) {
    unsafe { SIGNALS[idx] = 1 };
}
// The closure that we want to fuzz
let mut harness = |input: &BytesInput| {
    let target = input.target_bytes();
    let buf = target.as_slice();
    signals_set(0); // set SIGNALS[0]
    if buf.len() > 0 && buf[0] == 'a' as u8 {
        signals_set(1); // set SIGNALS[1]
        if buf.len() > 1 && buf[1] == 'b' as u8 {
            signals_set(2); // set SIGNALS[2]
            if buf.len() > 2 && buf[2] == 'c' as u8 {
                panic!("=)");
            }
        }
    }
    ExitKind::Ok
};
The observer can be created directly from the SIGNALS map, in the following way:
// Create an observation channel using the signals map
let observer = unsafe { StdMapObserver::new("signals", &mut SIGNALS) };
Observers are usually kept in the corresponding executor, as they keep track of information that is valid for just one run. We then have to modify our InProcessExecutor creation to include the observer as follows:
// Create the executor for an in-process function with just one observer
let mut executor = InProcessExecutor::new(
    &mut harness,
    tuple_list!(observer),
    &mut fuzzer,
    &mut state,
    &mut mgr,
)
.expect("Failed to create the Executor");
Now that the fuzzer can observe which condition is satisfied, we need a way to rate an input as interesting (i.e. worth adding to the corpus) based on this observation. Here comes the notion of Feedback.
Feedback is part of the State and provides a way to rate an input and its corresponding execution as interesting by looking at the information in the observers. Feedbacks can maintain a cumulative state of the information seen so far as metadata in the State; in our case, it maintains the set of conditions satisfied in previous runs.
We use MaxMapFeedback, a feedback that implements a novelty search over the map of the MapObserver. Basically, if there is a value in the observer's map that is greater than the maximum value registered so far for the same entry, it rates the input as interesting and updates its state.
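The novelty search described above can be sketched without LibAFL at all. The following is a minimal illustration, assuming a plain byte slice as the observer's map; the name MaxMapSketch is hypothetical and this is not the library's MaxMapFeedback, which is generic and stores its history as metadata in the State. The sketch also checks and updates in a single call, which is a simplification.

```rust
/// Toy stand-in for a max-map feedback (hypothetical, not LibAFL's type).
struct MaxMapSketch {
    /// Highest value seen so far for each map entry, across all runs.
    history: Vec<u8>,
}

impl MaxMapSketch {
    fn new(map_len: usize) -> Self {
        Self { history: vec![0; map_len] }
    }

    /// Returns true if any entry of `map` exceeds the maximum recorded so far
    /// for that entry, and raises the recorded maxima accordingly.
    fn is_interesting(&mut self, map: &[u8]) -> bool {
        let mut interesting = false;
        for (seen, &now) in self.history.iter_mut().zip(map) {
            if now > *seen {
                *seen = now;
                interesting = true;
            }
        }
        interesting
    }
}
```

With the SIGNALS map of our harness, a run that only sets entry 0 is interesting the first time and never again, while the first run that also sets entry 1 is interesting once more.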
An Objective Feedback is another kind of Feedback, one that decides whether an input is a "solution". When it rates an input as interesting, the input is saved to the solutions (./crashes in our case) rather than to the corpus. We use CrashFeedback to tell the fuzzer that an input that causes the program to crash is a solution for us.
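To make the split between corpus and solutions concrete, here is a small sketch of the routing decision. All names (Stores, Verdict, route, ExitKindSketch) are hypothetical, and the boolean new_coverage stands in for whatever the regular feedback reported; the sketch assumes the objective is consulted before the feedback, so a crashing input goes to the solutions and not to the corpus.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExitKindSketch {
    Ok,
    Crash,
}

/// The two places an input can be stored in.
#[derive(Default)]
struct Stores {
    corpus: Vec<Vec<u8>>,
    solutions: Vec<Vec<u8>>,
}

/// Where an evaluated input ended up.
#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Solution,
    Corpus,
    Discarded,
}

/// Decide what to do with `input` after one execution.
fn route(stores: &mut Stores, input: &[u8], exit: ExitKindSketch, new_coverage: bool) -> Verdict {
    if exit == ExitKindSketch::Crash {
        // Objective (crash-style): the input is a solution.
        stores.solutions.push(input.to_vec());
        Verdict::Solution
    } else if new_coverage {
        // Regular feedback: the input is worth keeping for further mutation.
        stores.corpus.push(input.to_vec());
        Verdict::Corpus
    } else {
        Verdict::Discarded
    }
}
```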
We need to update the creation of our State to include the feedback state, and the creation of the Fuzzer to include the feedback and the objective:
// Feedback to rate the interestingness of an input
let mut feedback = MaxMapFeedback::new(&observer);
// A feedback to choose if an input is a solution or not
let mut objective = CrashFeedback::new();
// create a State from scratch
let mut state = StdState::new(
    // RNG
    StdRand::with_seed(current_nanos()),
    // Corpus that will be evolved, we keep it in memory for performance
    InMemoryCorpus::new(),
    // Corpus in which we store solutions (crashes in this example),
    // on disk so the user can get them after stopping the fuzzer
    OnDiskCorpus::new(PathBuf::from("./crashes")).unwrap(),
    &mut feedback,
    &mut objective,
)
.unwrap();
// A fuzzer with feedbacks and a corpus scheduler
let mut fuzzer = StdFuzzer::new(scheduler, feedback, objective);
Once again, you need to add the necessary use directives for this to work properly:
extern crate libafl;
extern crate libafl_bolts;
use libafl::{
    corpus::{InMemoryCorpus, OnDiskCorpus},
    events::SimpleEventManager,
    executors::{inprocess::InProcessExecutor, ExitKind},
    feedbacks::{CrashFeedback, MaxMapFeedback},
    fuzzer::StdFuzzer,
    generators::RandPrintablesGenerator,
    inputs::{BytesInput, HasTargetBytes},
    monitors::SimpleMonitor,
    observers::StdMapObserver,
    schedulers::QueueScheduler,
    state::StdState,
};
use libafl_bolts::{current_nanos, rands::StdRand, tuples::tuple_list, AsSlice};
use std::path::PathBuf;
The actual fuzzing
Now we can run the program, but the outcome is not so different from the previous one, as the random generator does not take into account what we saved as interesting in the corpus. To do that, we need to plug in a Mutator.
Stages perform actions on individual inputs taken from the corpus. For instance, the MutationalStage executes the harness several times in a row, each time with a mutated input.
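As a rough picture of what such a stage does, the sketch below mutates copies of one corpus entry a fixed number of times and runs a harness on each mutant. Everything here is an assumption made for illustration: XorShift is a tiny stand-in random number generator, mutate picks one of three simple byte-level mutations, and mutational_stage is a hypothetical function, not LibAFL's StdMutationalStage or its havoc mutator.

```rust
/// Tiny xorshift PRNG so that the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }

    /// Pseudo-random value in `0..n` (`n` must be greater than zero).
    fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Apply one randomly chosen byte-level mutation in place.
fn mutate(input: &mut Vec<u8>, rng: &mut XorShift) {
    if input.is_empty() {
        input.push(rng.below(256) as u8);
        return;
    }
    let pos = rng.below(input.len());
    match rng.below(3) {
        0 => input[pos] ^= 1u8 << rng.below(8),       // flip one bit
        1 => input[pos] = rng.below(256) as u8,       // overwrite one byte
        _ => input.insert(pos, rng.below(256) as u8), // insert one byte
    }
}

/// One pass of a mutational stage: every iteration starts again from the
/// corpus entry `seed`, mutates a copy, and runs the harness on it.
/// Returns the mutants for which the harness returned true.
fn mutational_stage<F>(seed: &[u8], iters: usize, rng: &mut XorShift, mut harness: F) -> Vec<Vec<u8>>
where
    F: FnMut(&[u8]) -> bool,
{
    let mut flagged = Vec::new();
    for _ in 0..iters {
        let mut candidate = seed.to_vec();
        mutate(&mut candidate, rng);
        if harness(&candidate) {
            flagged.push(candidate);
        }
    }
    flagged
}
```

Note that the corpus entry itself is never modified; only copies are mutated and executed.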
As the last step, we create a MutationalStage that uses a mutator inspired by the havoc mutator of AFL.
extern crate libafl;
extern crate libafl_bolts;
use libafl::{
    corpus::{InMemoryCorpus, OnDiskCorpus},
    events::SimpleEventManager,
    executors::{inprocess::InProcessExecutor, ExitKind},
    feedbacks::{CrashFeedback, MaxMapFeedback},
    fuzzer::{Fuzzer, StdFuzzer},
    generators::RandPrintablesGenerator,
    inputs::{BytesInput, HasTargetBytes},
    monitors::SimpleMonitor,
    mutators::scheduled::{havoc_mutations, StdScheduledMutator},
    observers::StdMapObserver,
    schedulers::QueueScheduler,
    stages::mutational::StdMutationalStage,
    state::StdState,
};
use libafl_bolts::{current_nanos, rands::StdRand, tuples::tuple_list, AsSlice};
use std::path::PathBuf;
// Coverage map with explicit assignments due to the lack of instrumentation
static mut SIGNALS: [u8; 16] = [0; 16];
fn signals_set(idx: usize) {
    unsafe { SIGNALS[idx] = 1 };
}
fn main() {
    // The closure that we want to fuzz
    let mut harness = |input: &BytesInput| {
        let target = input.target_bytes();
        let buf = target.as_slice();
        signals_set(0); // set SIGNALS[0]
        if buf.len() > 0 && buf[0] == 'a' as u8 {
            signals_set(1); // set SIGNALS[1]
            if buf.len() > 1 && buf[1] == 'b' as u8 {
                signals_set(2); // set SIGNALS[2]
                if buf.len() > 2 && buf[2] == 'c' as u8 {
                    panic!("=)");
                }
            }
        }
        ExitKind::Ok
    };
    // To test the panic:
    let input = BytesInput::new(Vec::from("abc"));
    #[cfg(feature = "panic")]
    harness(&input);
    // Create an observation channel using the signals map
    let observer = unsafe { StdMapObserver::new("signals", &mut SIGNALS) };
    // Feedback to rate the interestingness of an input
    let mut feedback = MaxMapFeedback::new(&observer);
    // A feedback to choose if an input is a solution or not
    let mut objective = CrashFeedback::new();
    // create a State from scratch
    let mut state = StdState::new(
        // RNG
        StdRand::with_seed(current_nanos()),
        // Corpus that will be evolved, we keep it in memory for performance
        InMemoryCorpus::new(),
        // Corpus in which we store solutions (crashes in this example),
        // on disk so the user can get them after stopping the fuzzer
        OnDiskCorpus::new(PathBuf::from("./crashes")).unwrap(),
        &mut feedback,
        &mut objective,
    )
    .unwrap();
    // The Monitor trait defines how the fuzzer stats are displayed to the user
    let mon = SimpleMonitor::new(|s| println!("{s}"));
    // The event manager handles the various events generated during the fuzzing loop
    // such as the notification of the addition of a new item to the corpus
    let mut mgr = SimpleEventManager::new(mon);
    // A queue policy to get testcases from the corpus
    let scheduler = QueueScheduler::new();
    // A fuzzer with feedbacks and a corpus scheduler
    let mut fuzzer = StdFuzzer::new(scheduler, feedback, objective);
    // Create the executor for an in-process function with just one observer
    let mut executor = InProcessExecutor::new(
        &mut harness,
        tuple_list!(observer),
        &mut fuzzer,
        &mut state,
        &mut mgr,
    )
    .expect("Failed to create the Executor");
    // Generator of printable bytearrays of max size 32
    let mut generator = RandPrintablesGenerator::new(32);
    // Generate 8 initial inputs
    state
        .generate_initial_inputs(&mut fuzzer, &mut executor, &mut generator, &mut mgr, 8)
        .expect("Failed to generate the initial corpus");
    // Setup a mutational stage with a basic bytes mutator
    let mutator = StdScheduledMutator::new(havoc_mutations());
    let mut stages = tuple_list!(StdMutationalStage::new(mutator));
    fuzzer
        .fuzz_loop(&mut stages, &mut executor, &mut state, &mut mgr)
        .expect("Error in the fuzzing loop");
}
In each iteration, fuzz_loop requests a testcase from the fuzzer using the scheduler and then invokes the stage on it.
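The interplay of scheduler, stage, and feedback in such a loop can be sketched in a few lines of plain Rust. This is a toy model under several assumptions, not LibAFL's fuzz_loop: the harness reports a crash through a return value instead of panicking, the "stage" deterministically tries every lowercase letter at every position instead of mutating randomly, and the names Exit, harness, and fuzz_loop_sketch are hypothetical.

```rust
#[derive(PartialEq)]
enum Exit {
    Ok,
    Crash,
}

/// Stand-in for the tutorial harness: fills `signals` the way `signals_set`
/// does and reports a crash instead of panicking.
fn harness(buf: &[u8], signals: &mut [u8; 3]) -> Exit {
    signals[0] = 1;
    if buf.first() == Some(&b'a') {
        signals[1] = 1;
        if buf.get(1) == Some(&b'b') {
            signals[2] = 1;
            if buf.get(2) == Some(&b'c') {
                return Exit::Crash;
            }
        }
    }
    Exit::Ok
}

/// Toy fuzz loop. Returns the crashing input and the number of executions,
/// or None if no crash was found within `max_rounds` scheduler picks.
fn fuzz_loop_sketch(seed: &[u8], max_rounds: usize) -> Option<(Vec<u8>, usize)> {
    let mut corpus = vec![seed.to_vec()];
    let mut history = [0u8; 3]; // cumulative state of the feedback
    let mut execs = 0;
    let mut next = 0;
    for _ in 0..max_rounds {
        // Scheduler: a queue policy that walks the corpus in order.
        let current = corpus[next % corpus.len()].clone();
        next += 1;
        // Stage: run the harness on many variants of the scheduled testcase.
        for pos in 0..current.len() {
            for byte in b'a'..=b'z' {
                let mut candidate = current.clone();
                candidate[pos] = byte;
                let mut signals = [0u8; 3]; // the observation is reset for every run
                execs += 1;
                if harness(&candidate, &mut signals) == Exit::Crash {
                    return Some((candidate, execs)); // objective reached
                }
                // Feedback: keep the candidate if it raised any map entry.
                let mut novel = false;
                for (seen, now) in history.iter_mut().zip(signals) {
                    if now > *seen {
                        *seen = now;
                        novel = true;
                    }
                }
                if novel {
                    corpus.push(candidate);
                }
            }
        }
    }
    None
}
```

Starting from the seed xxx, the first round adds axx to the corpus, the second adds abx, and the third reaches abc. Without the feedback step, the corpus would never grow and the loop would keep retrying variants of the seed.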
Again, we need to add the new use directives:
extern crate libafl;
extern crate libafl_bolts;
use libafl::{
    corpus::{InMemoryCorpus, OnDiskCorpus},
    events::SimpleEventManager,
    executors::{inprocess::InProcessExecutor, ExitKind},
    feedbacks::{CrashFeedback, MaxMapFeedback},
    fuzzer::{Fuzzer, StdFuzzer},
    generators::RandPrintablesGenerator,
    inputs::{BytesInput, HasTargetBytes},
    monitors::SimpleMonitor,
    mutators::scheduled::{havoc_mutations, StdScheduledMutator},
    observers::StdMapObserver,
    schedulers::QueueScheduler,
    stages::mutational::StdMutationalStage,
    state::StdState,
};
use libafl_bolts::{current_nanos, rands::StdRand, tuples::tuple_list, AsSlice};
use std::path::PathBuf;
After adding this code, we have a proper fuzzer that can run and find the input that panics the function in less than a second.
$ cargo run
Compiling baby_fuzzer v0.1.0 (/home/andrea/Desktop/baby_fuzzer)
Finished dev [unoptimized + debuginfo] target(s) in 1.56s
Running `target/debug/baby_fuzzer`
[New Testcase] clients: 1, corpus: 2, objectives: 0, executions: 1, exec/sec: 0
[LOG Debug]: Loaded 1 over 8 initial testcases
[New Testcase] clients: 1, corpus: 3, objectives: 0, executions: 804, exec/sec: 0
[New Testcase] clients: 1, corpus: 4, objectives: 0, executions: 1408, exec/sec: 0
thread 'main' panicked at '=)', src/main.rs:35:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Crashed with SIGABRT
Child crashed!
[Objective] clients: 1, corpus: 4, objectives: 1, executions: 1408, exec/sec: 0
Waiting for broker...
Bye!
As you can see, after the panic message, the objectives count in the log increased by one, and you will find the crashing input in crashes/.
The complete code can be found in ./fuzzers/baby_fuzzer alongside other baby_ fuzzers.
More Examples
Examples can be found under ./fuzzers.
| fuzzer name | usage |
|---|---|
| baby_fuzzer_gramatron | Gramatron is a fuzzer that uses grammar automatons in conjunction with aggressive mutation operators to synthesize complex bug triggers |
| baby_fuzzer_grimoire | Grimoire is a fully automated coverage-guided fuzzer which works without any form of human interaction or pre-configuration |
| baby_fuzzer_nautilus | Nautilus is a coverage-guided, grammar-based fuzzer |
| baby_fuzzer_tokens | a basic token-level fuzzer with token-level mutations |
| baby_fuzzer_with_forkexecutor | an example for InProcessForkExecutor |
| baby_no_std | a minimalistic example of how to create a LibAFL-based fuzzer that works in no_std environments like TEEs, kernels, or bare metal |
Core Concepts
LibAFL is designed around some core concepts that we think can effectively abstract most of the other fuzzers' designs.
Here, we discuss these concepts and provide some examples related to other fuzzers.
Observer
An Observer is an entity that provides information observed during the execution of the program under test to the fuzzer.
The information contained in the Observer is not preserved across executions, but it may be serialized and passed on to other nodes if an Input is considered interesting and added to the Corpus.
As an example, the coverage map, filled during the execution to report the executed edges and used by fuzzers such as AFL and honggfuzz, can be considered an observation. Another Observer can collect the time spent executing a run, the program output, or a more advanced observation, like the maximum stack depth at runtime.
This information is an observation of a dynamic property of the program.
In terms of code, in the library this entity is described by the Observer trait.
In addition to holding the volatile data connected with the last execution of the target, the structures implementing this trait can define some execution hooks that are executed before and after each fuzz case. In these hooks, the observer can modify the fuzzer's state.
The fuzzer acts on these observers through a Feedback, which reduces the observation to the choice of whether a testcase is interesting for the fuzzer or not.
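The idea of volatile per-run data plus hooks before and after each fuzz case can be illustrated with a deliberately simplified interface. ObserverSketch, TimeObserverSketch, MapObserverSketch, and run_observed are hypothetical names; LibAFL's actual Observer trait takes more parameters (such as the state and the input), so this is only a sketch of the concept.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified observer interface with the two execution hooks.
trait ObserverSketch {
    fn pre_exec(&mut self);
    fn post_exec(&mut self);
}

/// Observes how long the last run took.
#[derive(Default)]
struct TimeObserverSketch {
    started: Option<Instant>,
    last_runtime: Option<Duration>,
}

impl ObserverSketch for TimeObserverSketch {
    fn pre_exec(&mut self) {
        // Data from the previous run is volatile and gets discarded here.
        self.last_runtime = None;
        self.started = Some(Instant::now());
    }

    fn post_exec(&mut self) {
        self.last_runtime = self.started.take().map(|t| t.elapsed());
    }
}

/// Observes which entries of a small coverage map were hit in the last run.
struct MapObserverSketch {
    map: [u8; 16],
}

impl ObserverSketch for MapObserverSketch {
    fn pre_exec(&mut self) {
        self.map = [0; 16]; // reset the map before each fuzz case
    }

    fn post_exec(&mut self) {}
}

/// Run one fuzz case with the hooks around it. The target gets access to the
/// observer so that it can record what happened during the run.
fn run_observed<O: ObserverSketch>(observer: &mut O, target: impl FnOnce(&mut O)) {
    observer.pre_exec();
    target(observer);
    observer.post_exec();
}
```

After two consecutive runs, the map observer only reflects the second one, which is what "not preserved across executions" means in practice.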
Executor
In different fuzzers, the concept of executing the program under test is not always the same. For instance, for in-memory fuzzers like libFuzzer an execution is a call to a harness function, while for hypervisor-based fuzzers like kAFL an entire operating system is started from a snapshot for each run.
In our model, an Executor is the entity that defines not only how to execute the target, but all the volatile operations that are related to just a single run of the target.
So the Executor is, for instance, responsible for informing the program about the input that the fuzzer wants to use in the run, by writing it to a memory location or passing it as a parameter to the harness function.
In our model, it can also hold a set of Observers connected with each execution.
In Rust, we bind this concept to the Executor trait. A structure implementing this trait must implement HasObservers too if it wants to hold a set of Observers.
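The relationship between these two traits can be pictured with the following sketch. The traits ExecutorSketch and HasObserversSketch are hypothetical and much smaller than LibAFL's Executor and HasObservers (which are generic over the fuzzer, state, and event manager); LenObserver is a made-up observation used only to have something to hold.

```rust
#[derive(Debug, PartialEq, Eq)]
enum ExitKindSketch {
    Ok,
    Crash,
}

/// Hypothetical, simplified executor interface: how to run the target once.
trait ExecutorSketch {
    fn run_target(&mut self, input: &[u8]) -> ExitKindSketch;
}

/// Hypothetical counterpart of HasObservers: exposes the held observers.
trait HasObserversSketch {
    type Observers;
    fn observers(&self) -> &Self::Observers;
}

/// Records the length of the last input; stands in for any per-run observation.
#[derive(Default)]
struct LenObserver {
    last_len: usize,
}

/// Runs a harness function in the current process and owns one observer.
struct FnExecutor<H: FnMut(&[u8]) -> ExitKindSketch> {
    harness: H,
    observer: LenObserver,
}

impl<H: FnMut(&[u8]) -> ExitKindSketch> FnExecutor<H> {
    fn new(harness: H) -> Self {
        Self { harness, observer: LenObserver::default() }
    }
}

impl<H: FnMut(&[u8]) -> ExitKindSketch> ExecutorSketch for FnExecutor<H> {
    fn run_target(&mut self, input: &[u8]) -> ExitKindSketch {
        // Volatile, per-run bookkeeping lives in the executor's observer.
        self.observer.last_len = input.len();
        // The input is handed to the harness as a parameter.
        (self.harness)(input)
    }
}

impl<H: FnMut(&[u8]) -> ExitKindSketch> HasObserversSketch for FnExecutor<H> {
    type Observers = LenObserver;

    fn observers(&self) -> &LenObserver {
        &self.observer
    }
}
```

A feedback would then read the observers through the second trait after each call to run_target.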
By default, we implement some commonly used Executors such as InProcessExecutor, in which the target is a harness function, providing in-process crash detection. Another Executor is the ForkserverExecutor, which implements an AFL-like mechanism to spawn child processes to fuzz.
A common pattern when creating an Executor is wrapping an existing one. For instance, TimeoutExecutor wraps an executor and installs a timeout callback before calling the original run function of the wrapped executor.
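The wrapping pattern itself is easy to show in isolation. In the sketch below, TimeBudgetExecutor is a hypothetical wrapper around any type implementing a made-up ExecutorSketch trait. Unlike a real timeout mechanism, it only measures the elapsed time after the wrapped executor returns and cannot interrupt a hung target, so it illustrates the delegation structure rather than how TimeoutExecutor works.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
enum ExitKindSketch {
    Ok,
    Timeout,
}

/// Hypothetical, simplified executor interface.
trait ExecutorSketch {
    fn run_target(&mut self, input: &[u8]) -> ExitKindSketch;
}

/// Innermost executor: simply calls a harness closure.
struct FnExecutor<H: FnMut(&[u8])>(H);

impl<H: FnMut(&[u8])> ExecutorSketch for FnExecutor<H> {
    fn run_target(&mut self, input: &[u8]) -> ExitKindSketch {
        (self.0)(input);
        ExitKindSketch::Ok
    }
}

/// Wrapper that adds a time budget around any other executor.
struct TimeBudgetExecutor<E: ExecutorSketch> {
    inner: E,
    budget: Duration,
}

impl<E: ExecutorSketch> ExecutorSketch for TimeBudgetExecutor<E> {
    fn run_target(&mut self, input: &[u8]) -> ExitKindSketch {
        // Set-up work happens before delegating to the wrapped executor.
        let start = Instant::now();
        let exit = self.inner.run_target(input);
        // Simplification: checked after the fact, not enforced during the run.
        if start.elapsed() > self.budget {
            ExitKindSketch::Timeout
        } else {
            exit
        }
    }
}
```

Because the wrapper implements the same trait as the executor it wraps, wrappers can be stacked, and the rest of the fuzzer does not need to know about them.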
InProcessExecutor
Let's begin with the base case: InProcessExecutor.
This executor executes the harness program (function) inside the fuzzer process.
When you want to execute the harness as fast as possible, you will most probably want to use this InProcessExecutor.
One thing to note here is that, when your harness is likely to have heap corruption bugs, you want to use another allocator so that the corrupted heap does not affect the fuzzer itself. (For example, we adopt MiMalloc in some of our fuzzers.) Alternatively, you can compile your harness with AddressSanitizer to make sure you catch these heap bugs.
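To give a feeling for what running the harness inside the fuzzer process implies, the sketch below calls a harness function directly and converts a Rust panic into a crash result using std::panic::catch_unwind. This is only an illustration and not how InProcessExecutor is implemented; the names run_in_process, target, and ExitKindSketch are hypothetical, and the approach assumes the program is built with unwinding panics (it does not work with panic = "abort", and it cannot catch memory corruption).

```rust
use std::panic::{self, AssertUnwindSafe};

#[derive(Debug, PartialEq, Eq)]
enum ExitKindSketch {
    Ok,
    Crash,
}

/// Run `harness` in the current process and turn a panic into `Crash`.
fn run_in_process<H: FnMut(&[u8])>(harness: &mut H, input: &[u8]) -> ExitKindSketch {
    // AssertUnwindSafe: we accept that the harness may leave its own state
    // inconsistent after a panic; this sketch does not reuse that state.
    match panic::catch_unwind(AssertUnwindSafe(|| harness(input))) {
        Ok(()) => ExitKindSketch::Ok,
        Err(_) => ExitKindSketch::Crash,
    }
}

/// A target that panics on inputs starting with "abc".
fn target(buf: &[u8]) {
    if buf.starts_with(b"abc") {
        panic!("=)");
    }
}
```

Since harness and fuzzer share one address space, anything the harness corrupts outside of a clean panic (the heap, global state) is also corrupted for the fuzzer, which is why the allocator advice above matters.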
ForkserverExecutor
Next, we'll take a look at the ForkserverExecutor. In this case, it is afl-cc (from AFL/AFLplusplus) that compiles the harness code, and therefore we can't use EDGES_MAP anymore. Fortunately, we have a way to tell the forkserver which map to record the coverage in.
As you can see from the forkserver example,
// Coverage map shared between observer and executor
let mut shmem = StdShMemProvider::new().unwrap().new_shmem(MAP_SIZE).unwrap();
// Let the forkserver know the shmid
shmem.write_to_env("__AFL_SHM_ID").unwrap();
let shmem_buf = shmem.as_mut_slice();
Here we create a shared memory region, shmem, and write its ID to the environment variable __AFL_SHM_ID. The instrumented binary, or the forkserver, then finds this shared memory region (through the aforementioned environment variable) and records its coverage there. On the fuzzer side, you can pass this shmem map to your Observer to obtain coverage feedback in combination with any Feedback.
Another feature of the ForkserverExecutor worth mentioning is shared memory testcases. Normally, the mutated input is passed between the forkserver and the instrumented binary via a .cur_input file. You can improve your forkserver fuzzer's performance by passing the input through shared memory instead.
If the target is configured to use shared memory testcases, the ForkserverExecutor will notice this during the handshake and will automatically set things up accordingly.
See AFL++'s documentation or the fuzzer example in forkserver_simple/src/program.c for reference.
InProcessForkExecutor
Finally, we'll talk about the InProcessForkExecutor
.
InProcessForkExecutor
has only one difference from InProcessExecutor
: it forks before running the harness, and that's it.
But why would we want to do so? Under some circumstances, your harness may be unstable or may wreak havoc on global state. In that case, you want to fork before executing the harness, so that it runs in the child process and cannot break the fuzzer.
However, we have to take care of the shared memory: it is the child process that runs the harness code and writes the coverage to the map.
We have to make the map shared between the parent process and the child process, so we'll use shared memory again. You should compile your harness with the pointer_maps
feature (of libafl_targets
) enabled. This way, we get a pointer, EDGES_MAP_PTR
, that can point to any coverage map.
On your fuzzer side, you can allocate a shared memory region and make the EDGES_MAP_PTR
point to your shared memory.
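// Coverage map shared between the parent and the forked child
let mut shmem = StdShMemProvider::new().unwrap().new_shmem(MAX_EDGES_NUM).unwrap();
let shmem_buf = shmem.as_mut_slice();
unsafe {
    // Point the instrumentation's map pointer at the shared region
    EDGES_MAP_PTR = shmem_buf.as_mut_ptr();
}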
Again, you can pass this shmem map to your Observer
and Feedback
to obtain coverage feedbacks.
Feedback
The Feedback is an entity that classifies the outcome of an execution of the program under test as interesting or not. Typically, if an execution is interesting, the corresponding input used to feed the target program is added to a corpus.
Most of the time, the notion of Feedback is deeply linked to the Observer, but they are different concepts.
The Feedback, in most of the cases, processes the information reported by one or more observers to decide if the execution is interesting. The concept of "interestingness" is abstract, but typically it is related to a novelty search (i.e. interesting inputs are those that reach a previously unseen edge in the control flow graph).
As an example, given an Observer that reports the sizes of all memory allocations, a maximization Feedback can be used to maximize these sizes in order to spot inputs that are pathological in terms of memory consumption.
In terms of code, the library offers the Feedback
trait.
It is used to implement functors that, given the state of the observers from the last execution, tell if the execution was interesting.
So to speak, it reduces the observations to a boolean result of is_interesting
- or not.
For this, a Feedback
can store anything it wants to persist in the fuzzer's state.
This might be, for instance, the cumulative map of all edges seen so far, in the case of a feedback based on edge coverage.
This can be achieved by adding Metadata
in init_state
and accessing it later in is_interesting
.
Feedback
can also add custom metadata to a newly created Testcase
using append_metadata
.
Multiple Feedbacks can be combined into a boolean expression, considering for instance an execution as interesting if it triggers new code paths or executes in less time than the average execution time, using feedback_or
.
On top, logic operators like feedback_or
and feedback_and
have a _fast
variant (e.g. feedback_or_fast
) where the second feedback is not evaluated if the value of the first operand already answers the interestingness
question, which saves precious execution time.
Using feedback_and_fast
in combination with ConstFeedback
, certain feedbacks can be disabled dynamically.
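To make the difference between the eager and the `_fast` operators concrete, here is a minimal sketch with simplified stand-in types. `SimpleFeedback`, `MaxMapSketch`, `ConstSketch`, `EagerOr`, and `FastAnd` are hypothetical names for this illustration and do not match LibAFL's actual `Feedback` trait or macros; the byte slice stands in for the observers' data.

```rust
/// Hypothetical, simplified feedback interface for this sketch.
pub trait SimpleFeedback {
    /// `observed` stands in for the observers' data from the last run.
    fn is_interesting(&mut self, observed: &[u8]) -> bool;
}

/// Novelty search over a coverage map: an execution is interesting if
/// any entry exceeds the maximum seen so far for that entry.
pub struct MaxMapSketch {
    history: Vec<u8>,
}

impl MaxMapSketch {
    pub fn new(size: usize) -> Self {
        Self { history: vec![0; size] }
    }
}

impl SimpleFeedback for MaxMapSketch {
    fn is_interesting(&mut self, observed: &[u8]) -> bool {
        let mut novel = false;
        for (seen, &now) in self.history.iter_mut().zip(observed) {
            if now > *seen {
                *seen = now;
                novel = true;
            }
        }
        novel
    }
}

/// Constant feedback, handy for switching another feedback on or off.
pub struct ConstSketch(pub bool);

impl SimpleFeedback for ConstSketch {
    fn is_interesting(&mut self, _observed: &[u8]) -> bool {
        self.0
    }
}

/// Eager OR: both operands always run, so both can update their state.
pub struct EagerOr<A, B>(pub A, pub B);

impl<A: SimpleFeedback, B: SimpleFeedback> SimpleFeedback for EagerOr<A, B> {
    fn is_interesting(&mut self, observed: &[u8]) -> bool {
        let a = self.0.is_interesting(observed);
        let b = self.1.is_interesting(observed);
        a || b
    }
}

/// Fast AND: the second operand is skipped when the first is already false.
pub struct FastAnd<A, B>(pub A, pub B);

impl<A: SimpleFeedback, B: SimpleFeedback> SimpleFeedback for FastAnd<A, B> {
    fn is_interesting(&mut self, observed: &[u8]) -> bool {
        self.0.is_interesting(observed) && self.1.is_interesting(observed)
    }
}
```

In this sketch, `FastAnd(ConstSketch(false), MaxMapSketch::new(n))` never touches the map history, which is the "disabled" behavior the text describes; the eager variant trades some speed for keeping every operand's state up to date.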
Objectives
While feedbacks are commonly used to decide if an Input
should be kept for future mutations, they also serve a second purpose, as so-called Objective Feedbacks
.
In this case, the interestingness
of a feedback indicates if an Objective
has been hit.
Commonly, these objectives would be a crash or a timeout, but they can also be used to detect if specific parts of the program have been reached, for sanitization, or a differential fuzzing success.
Objectives use the same trait as a normal Feedback
and the implementations can be used interchangeably.
The only difference is that interesting
Objectives won't be mutated further, and are counted as Solutions
, a successful fuzzing campaign.
Input
Formally, the input of a program is the data taken from external sources that affect the program behavior.
In our model of an abstract fuzzer, we define the Input as the internal representation of the program input (or a part of it).
In the straightforward case, the input of the program is a byte array and in fuzzers such as AFL we store and manipulate exactly these byte arrays.
But it is not always the case. A program can expect inputs that are not linear byte arrays (e.g. a sequence of syscalls forming a use case or protocol) and the fuzzer does not represent the Input in the same way that the program consumes it.
In case of a grammar fuzzer for instance, the Input is generally an Abstract Syntax Tree because it is a data structure that can be easily manipulated while maintaining the validity, but the program expects a byte array as input, so just before the execution, the tree is serialized to a sequence of bytes.
In the Rust code, an Input
is a trait that can be implemented only by structures that are serializable and have only owned data as fields.
While most fuzzers use a normal BytesInput
, more advanced ones use inputs that include special inputs for grammar fuzzing (GramatronInput or NautilusInput
on Rust nightly), as well as the token-level EncodedInput.
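The grammar-fuzzing case can be illustrated with a toy example. The sketch below assumes a tiny arithmetic grammar; `Expr`, `to_target_bytes`, and `swap_root_operands` are hypothetical names for this illustration and are unrelated to GramatronInput or NautilusInput. The fuzzer manipulates the tree, and only serializes it to bytes right before execution.

```rust
/// Hypothetical AST input for a tiny arithmetic grammar.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Num(u32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Serialize the tree into the bytes the target actually consumes.
    pub fn to_target_bytes(&self) -> Vec<u8> {
        let mut out = String::new();
        self.write(&mut out);
        out.into_bytes()
    }

    /// Recursively print the tree with full parenthesization.
    fn write(&self, out: &mut String) {
        match self {
            Expr::Num(n) => out.push_str(&n.to_string()),
            Expr::Add(l, r) => {
                out.push('(');
                l.write(out);
                out.push('+');
                r.write(out);
                out.push(')');
            }
            Expr::Mul(l, r) => {
                out.push('(');
                l.write(out);
                out.push('*');
                r.write(out);
                out.push(')');
            }
        }
    }
}

/// A structure-preserving mutation: swap the operands of the root node.
/// The result is still a valid expression of the grammar.
pub fn swap_root_operands(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => Expr::Add(r, l),
        Expr::Mul(l, r) => Expr::Mul(r, l),
        other => other,
    }
}
```

Mutating the tree instead of the serialized bytes is what keeps every generated input syntactically valid.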
Corpus
The Corpus is where testcases are stored. We define a Testcase as an Input and a set of related metadata like execution time for instance.
A Corpus can store testcases in different ways, for example on disk or in memory, or it can implement a cache to speed up on-disk storage.
Usually, a testcase is added to the Corpus when it is considered interesting, but a Corpus is also used to store testcases that fulfill an objective (like crashing the program under test, for instance).
Closely related to the Corpus is the way the next testcase requested by the fuzzer is retrieved from it. LibAFL calls this the Scheduler: the entity representing the policy used to pick testcases from the Corpus, in a FIFO fashion for instance.
Speaking about the code, Corpus
and Scheduler
are traits.
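As a rough sketch of how storage and policy are separated, the following uses simplified stand-ins. `TestcaseSketch`, `InMemoryCorpusSketch`, and `QueueSchedulerSketch` are hypothetical types for this illustration, not LibAFL's traits or implementations. The corpus only stores testcases, while the scheduler decides which one comes next.

```rust
use std::time::Duration;

/// A testcase: the input plus metadata (here, only an optional execution time).
#[derive(Debug, Clone)]
pub struct TestcaseSketch {
    pub input: Vec<u8>,
    pub exec_time: Option<Duration>,
}

/// Hypothetical in-memory corpus; entries are addressed by index.
#[derive(Default)]
pub struct InMemoryCorpusSketch {
    entries: Vec<TestcaseSketch>,
}

impl InMemoryCorpusSketch {
    /// Store a new input and return its id.
    pub fn add(&mut self, input: Vec<u8>) -> usize {
        self.entries.push(TestcaseSketch { input, exec_time: None });
        self.entries.len() - 1
    }

    pub fn get(&self, id: usize) -> Option<&TestcaseSketch> {
        self.entries.get(id)
    }

    pub fn count(&self) -> usize {
        self.entries.len()
    }
}

/// Queue policy: hand out corpus ids in insertion order,
/// wrapping around once the end is reached.
#[derive(Default)]
pub struct QueueSchedulerSketch {
    next_id: usize,
}

impl QueueSchedulerSketch {
    pub fn next(&mut self, corpus: &InMemoryCorpusSketch) -> Option<usize> {
        if corpus.count() == 0 {
            return None;
        }
        let id = self.next_id % corpus.count();
        self.next_id = id + 1;
        Some(id)
    }
}
```

Swapping the queue policy for, say, a weighted one would not require touching the corpus type at all.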
Mutator
The Mutator is an entity that takes one or more Inputs and generates a new instance of Input derived from them.
Mutators can be composed, and they are generally linked to a specific Input type.
There can be, for instance, a Mutator that applies more than a single type of mutation to the input. Consider a generic Mutator for a byte stream: a bit flip is just one of the possible mutations. Others are, for instance, the random replacement of a byte or the copy of a chunk.
In LibAFL, Mutator
is a trait.
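A minimal sketch of composable byte mutators follows. All names (`XorShift64`, `MutatorSketch`, `BitFlip`, `ByteReplace`, `Scheduled`) are hypothetical and chosen for this illustration; LibAFL's own `Mutator` trait has a different signature. The small xorshift generator is only there to keep the sketch free of external crates and needs a non-zero seed.

```rust
/// Tiny xorshift PRNG so the sketch needs no external crates.
/// The seed must be non-zero.
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `0..bound`; `bound` must be non-zero.
    pub fn below(&mut self, bound: usize) -> usize {
        (self.next() % bound as u64) as usize
    }
}

/// Hypothetical mutator interface bound to byte-vector inputs.
pub trait MutatorSketch {
    fn mutate(&mut self, input: &mut Vec<u8>, rng: &mut XorShift64);
}

/// Flip one random bit.
pub struct BitFlip;

impl MutatorSketch for BitFlip {
    fn mutate(&mut self, input: &mut Vec<u8>, rng: &mut XorShift64) {
        if input.is_empty() {
            return;
        }
        let idx = rng.below(input.len());
        let bit = rng.below(8);
        input[idx] ^= 1u8 << bit;
    }
}

/// Replace one random byte with a random value.
pub struct ByteReplace;

impl MutatorSketch for ByteReplace {
    fn mutate(&mut self, input: &mut Vec<u8>, rng: &mut XorShift64) {
        if input.is_empty() {
            return;
        }
        let idx = rng.below(input.len());
        input[idx] = rng.next() as u8;
    }
}

/// Composition: each call applies one randomly chosen inner mutator.
pub struct Scheduled {
    pub mutators: Vec<Box<dyn MutatorSketch>>,
}

impl MutatorSketch for Scheduled {
    fn mutate(&mut self, input: &mut Vec<u8>, rng: &mut XorShift64) {
        if self.mutators.is_empty() {
            return;
        }
        let pick = rng.below(self.mutators.len());
        self.mutators[pick].mutate(input, rng);
    }
}
```

Since `Scheduled` is itself a `MutatorSketch`, composed mutators can be nested inside other composed mutators.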
Generator
A Generator is a component designed to generate an Input from scratch.
Typically, a random generator is used to generate random inputs.
Generators are traditionally less used in Feedback-driven Fuzzing, but there are exceptions, like Nautilus, that uses a Grammar generator to create the initial corpus and a sub-tree Generator as a mutation of its grammar Mutator.
In the code, Generator
is a trait.
Stage
A Stage is an entity that operates on a single Input received from the Corpus.
For instance, a Mutational Stage, given an input of the corpus, applies a Mutator and executes the generated input one or more times. How many times this is done can be scheduled: AFL, for instance, uses a performance score of the input to choose how many times the havoc mutator should be invoked. This can also depend on other parameters, for instance the length of the input if we just want to apply a sequential bitflip, or on a fixed value.
A stage can also be an analysis stage, for instance, the Colorization stage of Redqueen that aims to introduce more entropy in a testcase or the Trimming stage of AFL that aims to reduce the size of a testcase.
There are several stages in the LibAFL codebase implementing the Stage
trait.
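The core loop of a mutational stage can be sketched as a single function. `mutational_stage` and `bitflip_iterations` are hypothetical helpers written for this illustration, and the `mutate` and `harness` closures stand in for a Mutator and for the executor plus feedback; none of this mirrors LibAFL's `Stage` trait.

```rust
/// Run `iterations` mutated copies of `seed` through `harness` and
/// return the candidates that the harness flagged as interesting.
///
/// `mutate` receives the candidate and the iteration index;
/// `harness` returns `true` if the execution was interesting.
pub fn mutational_stage<M, H>(
    seed: &[u8],
    iterations: usize,
    mut mutate: M,
    mut harness: H,
) -> Vec<Vec<u8>>
where
    M: FnMut(&mut Vec<u8>, usize),
    H: FnMut(&[u8]) -> bool,
{
    let mut kept = Vec::new();
    for i in 0..iterations {
        // Every iteration starts again from the unmodified seed.
        let mut candidate = seed.to_vec();
        mutate(&mut candidate, i);
        if harness(&candidate) {
            kept.push(candidate);
        }
    }
    kept
}

/// Iteration count for a sequential bitflip: one iteration per input bit.
pub fn bitflip_iterations(input_len: usize) -> usize {
    input_len * 8
}
```

A sequential bitflip would pass `bitflip_iterations(seed.len())` as the count and flip bit `i` in each iteration, while a havoc-style stage would instead derive the count from a score of the testcase.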
Design
In this chapter, we discuss how we designed the library taking into account the core concepts while allowing code reuse and extensibility.
Architecture
The LibAFL architecture is built around some entities to allow code reuse and low-cost abstractions.
Initially, we started thinking about implementing LibAFL in a traditional Object-Oriented language, like C++. When we switched to Rust, we quickly changed our minds, as we realized that we could build the library using a more Rust-y approach, namely the one described in this blogpost about game design in Rust.
The LibAFL code reuse mechanism is based on components, rather than sub-classes, but there are still some OOP patterns in the library.
Thinking about similar fuzzers, you can observe that most of the time the data structures that are modified are the ones related to testcases and the fuzzer global state.
Beside the entities previously described, we introduce the Testcase
and State
entities. The Testcase is a container for an Input stored in the Corpus and its metadata (so, in the implementation, the Corpus stores Testcases) and the State contains all the metadata that are evolved while running the fuzzer, Corpus included.
The State, in the implementation, contains only owned objects that are serializable, and it is serializable itself. Some fuzzers may want to serialize their state when pausing or just, when doing in-process fuzzing, serialize on crash and deserialize in the new process to continue to fuzz with all the metadata preserved.
Additionally, we group the entities that are "actions", like the CorpusScheduler
and the Feedbacks
, in a common place, the Fuzzer.
Metadata
A metadata in LibAFL is a self-contained structure that holds data associated with the State or with a Testcase.
In terms of code, a metadata can be defined as a Rust struct registered in the SerdeAny register.
use libafl_bolts::SerdeAny;
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize, SerdeAny)]
pub struct MyMetadata {
    //...
}
The struct must be static, so it cannot hold references to borrowed objects.
As an alternative to derive(SerdeAny)
, which is a proc-macro in libafl_derive
, the user can use libafl_bolts::impl_serdeany!(MyMetadata);
.
Usage
Metadata objects are primarily intended to be used inside SerdeAnyMap
and NamedSerdeAnyMap
.
With these maps, the user can retrieve instances by type (and name). Internally, the instances are stored as SerdeAny trait objects.
Structs that want to have a set of metadata must implement the HasMetadata
trait.
By default, Testcase and State implement it, and each holds a SerdeAnyMap.
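Retrieval by type can be sketched with the standard library alone. The following is a simplified illustration of the idea, assuming `std::any` trait objects keyed by `TypeId`; `TypeMapSketch` and `ExecCount` are hypothetical names, and the real SerdeAnyMap additionally supports serialization, which this sketch leaves out.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Map storing at most one value per concrete type.
#[derive(Default)]
pub struct TypeMapSketch {
    entries: HashMap<TypeId, Box<dyn Any>>,
}

impl TypeMapSketch {
    /// Insert a value, replacing any previous value of the same type.
    pub fn insert<T: Any>(&mut self, value: T) {
        self.entries.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Look up the value of type `T`, if present.
    pub fn get<T: Any>(&self) -> Option<&T> {
        self.entries
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }

    /// Mutable lookup of the value of type `T`, if present.
    pub fn get_mut<T: Any>(&mut self) -> Option<&mut T> {
        self.entries
            .get_mut(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_mut::<T>())
    }
}

/// Example metadata type for the sketch.
#[derive(Debug, PartialEq)]
pub struct ExecCount(pub u64);
```

The `Any` bound implies `'static`, which matches the requirement that metadata structs cannot hold references to borrowed objects.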
(De)Serialization
We want to store the State's Metadata so that it is not lost when a fuzzer crashes or stops. To do that, it must be serialized and deserialized using Serde.
As Metadata is stored in a SerdeAnyMap as trait objects, they cannot be deserialized using Serde by default.
To cope with this problem, in LibAFL each SerdeAny struct must be registered in a global registry that keeps track of types and allows the (de)serialization of the registered types.
Normally, the impl_serdeany
macro does that for the user by creating a constructor function that fills the registry. However, when using LibAFL in no_std mode, this operation must be carried out manually before any other operation in the main
function.
To do that, the developer needs to know each metadata type that is used inside the fuzzer and call RegistryBuilder::register::<MyMetadata>()
for each of them at the beginning of main
.
Migrating from LibAFL <0.9 to 0.9
Internal APIs of LibAFL have changed in version 0.9 to prefer associated types in cases where components were "fixed" to particular versions of other components. As a result, many existing custom components will not be compatible between versions prior to 0.9 and version 0.9.
Reasons for this change
When implementing a trait with a generic, it is possible to have more than one instantiation of that generic trait. As a
result, everywhere where consistency across generic types was required to implement a trait, it needed to be properly
and explicitly constrained at every point. This led to impl
s which were at best difficult to debug and, at worst,
incorrect and caused confusing bugs for users.
For example, consider the MapCorpusMinimizer
implementation (from <0.9) below:
impl<E, I, O, S, TS> CorpusMinimizer<I, S> for MapCorpusMinimizer<E, I, O, S, TS>
where
E: Copy + Hash + Eq,
I: Input,
for<'a> O: MapObserver<Entry = E> + AsIter<'a, Item = E>,
S: HasMetadata + HasCorpus<I>,
TS: TestcaseScore<I, S>,
{
fn minimize<CS, EX, EM, OT, Z>(
&self,
fuzzer: &mut Z,
executor: &mut EX,
manager: &mut EM,
state: &mut S,
) -> Result<(), Error>
where
CS: Scheduler<I, S>,
EX: Executor<EM, I, S, Z> + HasObservers<I, OT, S>,
EM: EventManager<EX, I, S, Z>,
OT: ObserversTuple<S>,
Z: Evaluator<EX, EM, I, S> + HasScheduler<CS, I, S>,
{
// --- SNIP ---
}
}
It was previously necessary to constrain every generic using a slew of other generics; above, it is necessary to
constrain the input type (I
) for every generic, despite the fact that this was already made clear by the state (S
)
and that the input will necessarily be the same over every implementation for that type.
Below is the same code, but with the associated types changes (note that some generic names have changed):
impl<E, O, T, TS> CorpusMinimizer<E> for MapCorpusMinimizer<E, O, T, TS>
where
E: UsesState,
for<'a> O: MapObserver<Entry = T> + AsIter<'a, Item = T>,
E::State: HasMetadata + HasCorpus,
T: Copy + Hash + Eq,
TS: TestcaseScore<E::State>,
{
fn minimize<CS, EM, Z>(
&self,
fuzzer: &mut Z,
executor: &mut E,
manager: &mut EM,
state: &mut E::State,
) -> Result<(), Error>
where
E: Executor<EM, Z> + HasObservers,
CS: Scheduler<State=E::State>,
EM: UsesState<State=E::State>,
Z: HasScheduler<CS, State=E::State>,
{
// --- SNIP ---
}
}
The executor is constrained to EM
and Z
, with each of their respective states being constrained to E
's state. It
is no longer necessary to explicitly define a generic for the input type, the state type, or the generic type, as these
are all present as associated types for E
. Additionally, we don't even need to specify any details about the observers
(OT
in the previous version) as the type does not need to be constrained and is not shared by other types.
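The underlying language difference can be shown in a few lines. In this sketch, `HasInputGeneric`, `UsesInputSketch`, `DemoState`, and `empty_input` are hypothetical names used only to illustrate generic parameters versus associated types; they are not LibAFL items.

```rust
/// Generic-parameter style: one type may implement `HasInputGeneric<I>`
/// for several different `I`.
pub trait HasInputGeneric<I> {
    fn describe(&self) -> String;
}

/// Associated-type style: each implementor picks exactly one `Input`.
pub trait UsesInputSketch {
    type Input;
}

pub struct DemoState;

// Two instantiations of the generic trait coexist on the same type,
// so every user has to say which one is meant.
impl HasInputGeneric<Vec<u8>> for DemoState {
    fn describe(&self) -> String {
        "bytes".to_string()
    }
}

impl HasInputGeneric<String> for DemoState {
    fn describe(&self) -> String {
        "text".to_string()
    }
}

// With an associated type there is a single answer. A second
// `impl UsesInputSketch for DemoState` would be rejected as a
// conflicting implementation.
impl UsesInputSketch for DemoState {
    type Input = Vec<u8>;
}

/// Only `S` needs to be named; the input type follows from it.
pub fn empty_input<S>(_state: &S) -> S::Input
where
    S: UsesInputSketch,
    S::Input: Default,
{
    Default::default()
}
```

With the generic form, callers must disambiguate, for example `<DemoState as HasInputGeneric<String>>::describe(&state)`, whereas `empty_input(&state)` needs no extra type arguments.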
Scope
You are affected by this change if:
- You specified explicit generics for a type (e.g., MaxMapFeedback::<_, (), _>::new(...))
- You implemented a custom component (e.g., Mutator, Executor, State, Fuzzer, Feedback, Observer, etc.)
If you did neither of these, congrats! You are likely unaffected by these changes.
Migrating explicit generics
Migrating explicit generics should be quite simple; review the API documentation for details on the order of generics and replace them accordingly. Generally speaking, it should no longer be necessary to specify these generics.
See fuzzers/
for examples of these changes.
Migrating component types
If you implemented a Mutator, Executor, State, or another kind of component, you must update your implementation. The main changes to the API are in the use of "Uses*" for associated types.
In many scenarios, Input, Observer, and State generics have been moved into traits with associated types (namely, "UsesInput", "UsesObservers", and "UsesState"). These traits are now required by many existing traits and are very straightforward to implement. In a majority of cases, you will have generics on your custom implementation or a fixed type to implement this with. Thankfully, Rust will let you know when you need to implement this type.
As an example, InMemoryCorpus
before 0.9 looked like this:
#[derive(Default, Serialize, Deserialize, Clone, Debug)]
#[serde(bound = "I: serde::de::DeserializeOwned")]
pub struct InMemoryCorpus<I>
where
I: Input,
{
entries: Vec<RefCell<Testcase<I>>>,
current: Option<usize>,
}
impl<I> Corpus<I> for InMemoryCorpus<I>
where
I: Input,
{
// --- SNIP ---
}
After 0.9, all Corpus
implementations are required to implement UsesInput
. Also Corpus
no longer has a generic for
the input type (as it is now provided by the UsesInput impl). The migrated implementation is shown below:
#[derive(Default, Serialize, Deserialize, Clone, Debug)]
#[serde(bound = "I: serde::de::DeserializeOwned")]
pub struct InMemoryCorpus<I>
where
I: Input,
{
entries: Vec<RefCell<Testcase<I>>>,
current: Option<usize>,
}
impl<I> UsesInput for InMemoryCorpus<I>
where
I: Input,
{
type Input = I;
}
impl<I> Corpus for InMemoryCorpus<I>
where
I: Input,
{
// --- SNIP ---
}
Now, Corpus
cannot be accidentally implemented for another type other than that specified by InMemoryCorpus
, as it
is fixed to the associated type for UsesInput
.
A more complex example of migration can be found in the "Reasons for this change" section of this document.
Observer Changes
Additionally, we changed the Observer API, as the API in 0.8 led to undefined behavior.
At the same time, we used the change to simplify the common case: creating an StdMapObserver
from libafl_targets' EDGES_MAP
.
In the future, instead of using:
let edges = unsafe { &mut EDGES_MAP[0..MAX_EDGES_NUM] };
let edges_observer = StdMapObserver::new("edges", edges);
creating the edges observer is as simple as using the new std_edges_map_observer
function.
let edges_observer = unsafe { std_edges_map_observer("edges") };
Alternatively, StdMapObserver::new
will still work, but now the whole method is marked as unsafe
.
The reason is that the caller has to make sure EDGES_MAP
(or other maps) are not moved or freed in memory,
for the lifetime of the MapObserver
.
This means that the buffer should either be static
or Pin
.
Migrating from LibAFL <0.11 to 0.11
We moved the old libafl::bolts
module to its own crate called libafl_bolts
.
Because of this, imports for types in LibAFL bolts have changed in version 0.11; everything else should remain the same.
Reasons for This Change
With this change, we can now use many low-level features of LibAFL in projects that are unrelated to fuzzing, or just completely different from LibAFL. Some cross-platform things in bolts include:
- SerdeAnyMap: a map that stores and retrieves elements by type and is serializable and deserializable
- ShMem: A cross-platform (Windows, Linux, Android, MacOS) shared memory implementation
- LLMP: A fast, lock-free IPC mechanism via SharedMap
- Core_affinity: A maintained version of core_affinity that can be used to get core information and bind processes to cores
- Rands: Fast random number generators for fuzzing (like RomuRand)
- MiniBSOD: Get and print information about the current process state, including important registers
- Tuples: Haskell-like compile-time tuple lists
- Os: OS-specific functionality like signal handling, Windows exception handling, pipes, and helpers for fork
What changed
You will need to move all libafl::bolts::
imports to libafl_bolts::
and add the crate dependency in your Cargo.toml (and specify feature flags there).
As the only exception, the libafl::bolts::launcher::Launcher
has moved to libafl::events::launcher::Launcher
since it has fuzzer and EventManager
specific code.
If you are using prelude
, you may also need to add libafl_bolts::prelude
.
That's it.
Enjoy using libafl_bolts
in other projects.
Message Passing
LibAFL offers a standard mechanism for message passing between processes and machines with a low overhead.
We use message passing to inform the other connected clients/fuzzers/nodes about new testcases, metadata, and statistics about the current run.
Depending on individual needs, LibAFL can also write testcase contents to disk, while still using events to notify other fuzzers, using the CachedOnDiskCorpus
or similar.
In our tests, message passing scales very well to share new testcases and metadata between multiple running fuzzer instances for multi-core fuzzing.
Specifically, it scales a lot better than using memory locks on a shared corpus, and a lot better than sharing the testcases via the filesystem, as AFL traditionally does.
Think "all cores are green" in htop
, i.e., no kernel interaction.
The EventManager
interface is used to send Events over the wire using Low Level Message Passing
, a custom message passing mechanism over shared memory or TCP.
Low Level Message Passing (LLMP)
LibAFL comes with a reasonably lock-free message passing mechanism that scales well across cores and, using its broker2broker mechanism, even to connected machines via TCP.
Most example fuzzers use this mechanism, and it is the best EventManager
if you want to fuzz on more than a single core.
In the following, we will describe the inner workings of LLMP
.
LLMP
has one broker
process that can forward messages sent by any client process to all other clients.
The broker can also intercept and filter the messages it receives instead of forwarding them.
A common use-case for messages filtered by the broker are the status messages sent from each client to the broker directly.
The broker uses this information to paint a simple UI with up-to-date information about all clients; the other clients, however, don't need to receive this information.
Speedy Local Messages via Shared Memory
Throughout LibAFL, we use a wrapper around different operating system's shared maps, called ShMem
.
Shared maps, called shared memory for the sake of not colliding with Rust's map()
functions, are the backbone of LLMP
.
Each client, usually a fuzzer trying to share stats and new testcases, maps an outgoing ShMem
map.
With very few exceptions, only this client writes to this map. Therefore, we do not run into race conditions and can live without locks.
The broker reads from all clients' ShMem
maps.
It periodically checks all incoming client maps and then forwards new messages to its outgoing broadcast-ShMem
, mapped by all connected clients.
To send new messages, a client places a new message at the end of their shared memory and then updates a static field to notify the broker.
Once the outgoing map is full, the sender allocates a new ShMem
using the respective ShMemProvider
.
It then sends the information needed to map the newly-allocated page in connected processes to the old page, using an end of page (EOP
) message.
Once the receiver maps the new page, it flags it as safe for unmapping by the sending process (to avoid race conditions if we have more than a single EOP in a short time), and then continues to read from the new ShMem
.
The schema for the clients' maps to the broker is as follows:
[client0]        [client1]     ...    [clientN]
  |                  |                 /
[client0_out] [client1_out] ... [clientN_out]
  |                 /                /
  |________________/                /
  |________________________________/
 \|/
[broker]
The broker loops over all incoming maps, and checks for new messages.
On std
builds, the broker will sleep a few milliseconds after a loop, since we do not need the messages to arrive instantly.
After the broker receives a new message from clientN (clientN_out->current_id != last_message->message_id
), the broker copies the message content to its own broadcast shared memory.
The clients periodically, for example after finishing n
mutations, check for new incoming messages by checking if (current_broadcast_map->current_id != last_message->message_id
).
While the broker uses the same EOP mechanism to map new ShMem
s for its outgoing map, it never unmaps old pages.
These additional memory resources serve a good purpose: by keeping all broadcast pages around, we make sure that new clients can join in on a fuzzing campaign at a later point in time.
They just need to re-read all broadcasted messages from start to finish.
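The message flow described above can be modeled in a single process. This sketch is a heavily simplified stand-in: `Msg`, `PageSketch`, and `BrokerSketch` are hypothetical names, plain `Vec`s replace the shared memory pages, and there is no EOP handling or memory synchronization, all of which the real LLMP needs. It only illustrates the `current_id` comparison, the single-writer pages, and why a late joiner can catch up by reading the broadcast page from the start.

```rust
/// One message on a page; ids start at 1, and 0 means "nothing sent yet".
#[derive(Debug, Clone, PartialEq)]
pub struct Msg {
    pub id: u64,
    pub payload: Vec<u8>,
}

/// Stand-in for a shared-memory page with a single writer.
#[derive(Default)]
pub struct PageSketch {
    pub current_id: u64,
    pub msgs: Vec<Msg>,
}

impl PageSketch {
    /// Writer side: append the message, then publish it by bumping `current_id`.
    pub fn send(&mut self, payload: &[u8]) {
        let id = self.current_id + 1;
        self.msgs.push(Msg { id, payload: payload.to_vec() });
        self.current_id = id;
    }

    /// Reader side: return everything newer than `*last_seen`
    /// and advance the reader's cursor.
    pub fn recv_new(&self, last_seen: &mut u64) -> Vec<Vec<u8>> {
        if self.current_id == *last_seen {
            return Vec::new();
        }
        let new: Vec<Vec<u8>> = self
            .msgs
            .iter()
            .filter(|m| m.id > *last_seen)
            .map(|m| m.payload.clone())
            .collect();
        *last_seen = self.current_id;
        new
    }
}

/// Broker: one read cursor per client page, one outgoing broadcast page.
pub struct BrokerSketch {
    pub cursors: Vec<u64>,
    pub broadcast: PageSketch,
}

impl BrokerSketch {
    pub fn new(num_clients: usize) -> Self {
        Self { cursors: vec![0; num_clients], broadcast: PageSketch::default() }
    }

    /// One loop iteration: poll every client page and forward new messages.
    pub fn once(&mut self, client_pages: &[PageSketch]) {
        for (page, cursor) in client_pages.iter().zip(self.cursors.iter_mut()) {
            for payload in page.recv_new(cursor) {
                self.broadcast.send(&payload);
            }
        }
    }
}
```

A client that joins late simply starts with a cursor of 0 on the broadcast page and receives every message forwarded so far.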
So the outgoing message flow over the outgoing broadcast ShMem is like this
:
[broker]
  |
[current_broadcast_shmem]
  |
  |___________________________________
  |_________________                  \
  |                 \                  \
  |                  |                  |
 \|/                \|/                \|/
[client0]        [client1]     ...    [clientN]
To use LLMP
in LibAFL, you usually want to use an LlmpEventManager
or its restarting variant.
They are the default if using LibAFL's Launcher
.
If you should want to use LLMP
in its raw form, without any LibAFL
abstractions, take a look at the llmp_test
example in ./libafl/examples.
You can run the example using cargo run --example llmp_test
with the appropriate modes, as indicated by its help output.
First, you will have to create a broker using LlmpBroker::new()
.
Then, create some LlmpClient
s in other threads and register them with the main thread using LlmpBroker::register_client
.
Finally, call LlmpBroker::loop_forever()
.
B2B: Connecting Fuzzers via TCP
For broker2broker
communication, all broadcast messages are additionally forwarded via network sockets.
To facilitate this, we spawn an additional client thread in the broker, that reads the broadcast shared memory, just like any other client would.
For broker2broker communication, this b2b client listens for TCP connections from other, remote brokers.
It keeps a pool of open sockets to other, remote, b2b brokers around at any time.
When receiving a new message on the local broker shared memory, the b2b client will forward it to all connected remote brokers via TCP.
Additionally, the broker can receive messages from all connected (remote) brokers, and forward them to the local broker over a client ShMem
.
As a sidenote, the TCP listener used for b2b communication is also used for an initial handshake when a new client tries to connect to a broker locally, simply exchanging the initial ShMem
descriptions.
Spawning Instances
Multiple fuzzer instances can be spawned in different ways.
Manually, via a TCP port
The straightforward way to do Multi-Threading is to use the LlmpRestartingEventManager
, specifically to use setup_restarting_mgr_std
.
It abstracts away all the pesky details about restarts on crash handling (for in-memory fuzzers) and multi-threading.
With it, every instance you launch manually tries to connect to a TCP port on the local machine.
If the port is not yet bound, this instance becomes the broker, binding itself to the port to await new clients.
If the port is already bound, the EventManager will try to connect to it. The instance becomes a client and can now communicate with all other nodes.
Launching nodes manually has the benefit that you can have multiple nodes with different configurations, such as clients fuzzing with and without ASan
.
While it's called "restarting" manager, it uses fork
on Unix-like operating systems as optimization and only actually restarts from scratch on Windows.
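The "first instance becomes the broker" rule can be sketched with the standard library's TCP types. This is an illustration of the idea only, not what setup_restarting_mgr_std does internally; `Role` and `join_or_become_broker` are hypothetical names, and the sketch assumes the port is only ever used by fuzzer instances on the local machine.

```rust
use std::io;
use std::net::{TcpListener, TcpStream};

/// Which role this instance ended up with.
pub enum Role {
    /// The port was free: we bound it and now wait for clients.
    Broker(TcpListener),
    /// The port was taken: we connected to the existing broker.
    Client(TcpStream),
}

/// Try to bind `port` on localhost; if it is already bound,
/// connect to whoever holds it instead.
pub fn join_or_become_broker(port: u16) -> io::Result<Role> {
    match TcpListener::bind(("127.0.0.1", port)) {
        Ok(listener) => Ok(Role::Broker(listener)),
        Err(err) if err.kind() == io::ErrorKind::AddrInUse => {
            TcpStream::connect(("127.0.0.1", port)).map(Role::Client)
        }
        Err(err) => Err(err),
    }
}
```

Every instance runs the same code, so the launch order alone decides which process ends up as the broker.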
Automated, with Launcher
The Launcher is the lazy way to do multiprocessing. You can use the Launcher builder to create a fuzzer that spawns multiple nodes with one click, all using restarting event managers and the same configuration.
To use the Launcher, first write a closure, let mut run_client = |state: Option<_>, mut mgr, _core_id|{}
, which takes three parameters and creates an individual fuzzer. Then you can specify the shmem_provider, broker_port, monitor, cores, and other settings through Launcher::builder()
:
Launcher::builder()
.configuration(EventConfig::from_name(&configuration))
.shmem_provider(shmem_provider)
.monitor(mon)
.run_client(&mut run_client)
.cores(cores)
.broker_port(broker_port)
.stdout_file(stdout_file)
.remote_broker_addr(broker_addr)
.build()
.launch()
This first starts a broker, then spawns n
clients, according to the value passed to cores
.
The value is a string indicating the cores to bind to, for example, 0,2,5
or 0-3
.
For each client, run_client
will be called.
If the launcher uses fork
, it will hide child output, unless the settings indicate otherwise, or the LIBAFL_DEBUG_OUTPUT
env variable is set.
On Windows, the Launcher will restart each client, while on Unix-alikes, it will use fork
.
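To illustrate the format of the cores string, here is a small stand-alone parser for specifications such as 0,2,5 or 0-3. It is a sketch written for this explanation, independent of the parser LibAFL itself uses; `parse_cores` is a hypothetical name, and the choice to sort and deduplicate the result is an assumption of this sketch.

```rust
/// Parse a core specification like "0,2,5" or "0-3" (or a mix such as
/// "0-3,8") into a sorted, deduplicated list of core ids.
pub fn parse_cores(spec: &str) -> Result<Vec<usize>, String> {
    let mut cores = Vec::new();
    for part in spec.split(',') {
        let part = part.trim();
        if let Some((start, end)) = part.split_once('-') {
            // Inclusive range, e.g. "0-3" means cores 0, 1, 2, and 3.
            let start = start
                .trim()
                .parse::<usize>()
                .map_err(|_| format!("invalid core id in '{}'", part))?;
            let end = end
                .trim()
                .parse::<usize>()
                .map_err(|_| format!("invalid core id in '{}'", part))?;
            if start > end {
                return Err(format!("descending range '{}'", part));
            }
            cores.extend(start..=end);
        } else {
            // Single core id.
            let id = part
                .parse::<usize>()
                .map_err(|_| format!("invalid core id '{}'", part))?;
            cores.push(id);
        }
    }
    cores.sort_unstable();
    cores.dedup();
    Ok(cores)
}
```

With this sketch, `parse_cores("0-3")` yields four core ids, so a launcher following the description above would spawn four clients.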
Advanced use-cases:
- To connect multiple nodes together via TCP, you can use the remote_broker_addr. This requires the llmp_bind_public compile-time feature for LibAFL.
- To use multiple launchers for individual configurations, you can set spawn_broker to false on all instances but one.
- Launcher will not select the cores automatically, so you need to specify the cores that you want.
- On Unix, you can choose between a forking and a non-forking version of Launcher by setting the fork feature in LibAFL. Some targets may not like forking, but it is faster than restarting processes from scratch. Windows will never fork.
- For simple debugging, first set the LIBAFL_DEBUG_OUTPUT env variable to see if a child process printed anything.
- For further debugging of fuzzer failures, it may make sense to temporarily replace Launcher with a SimpleEventManager and call your harness fn (run_client(None, mgr, 0);) directly, so that fuzzing runs in the same thread and is easier to debug, before moving back to Launcher after the bugfix.
For more examples, you can check out qemu_launcher
and libfuzzer_libpng_launcher
in ./fuzzers/
.
Other ways
The LlmpEventManager
family is the easiest way to spawn instances, but for obscure targets, you may need to come up with other solutions.
LLMP is even, in theory, no_std
compatible, and even completely different EventManagers can be used for message passing.
If you are in this situation, please either read through the current implementations and/or reach out to us.
Configurations
Configurations for individual fuzzer nodes are relevant for multi-node fuzzing. This chapter describes how to run nodes with different configurations in one fuzzing cluster. This allows, for example, a node compiled with ASan to know that it needs to rerun new testcases received from a node without ASan, while nodes with the same binary/configuration do not.
Fuzzers with the same configuration can exchange Observers for new testcases and reuse them without rerunning the input. A different configuration indicates that only the raw input can be exchanged; it must be rerun on the other node to capture the relevant observations.
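The decision a receiving node makes can be sketched as follows. `NodeConfig`, `NewTestcaseEvent`, `Action`, and `on_new_testcase` are hypothetical names invented for this illustration and do not correspond to LibAFL's event types; the sketch also assumes that observers travel along as an optional serialized blob.

```rust
/// Identifies how a node was built and configured (for example "asan").
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NodeConfig(pub String);

/// A "new testcase" message as received from another node.
pub struct NewTestcaseEvent {
    pub input: Vec<u8>,
    pub sender_config: NodeConfig,
    /// Serialized observers, if the sender attached them.
    pub observers: Option<Vec<u8>>,
}

/// What the receiving node does with the event.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Same configuration: trust the sender's observations.
    ReuseObservers,
    /// Different configuration: execute the raw input locally.
    Rerun,
}

/// Decide how to handle an incoming testcase on a node configured as `own`.
pub fn on_new_testcase(own: &NodeConfig, event: &NewTestcaseEvent) -> Action {
    if event.sender_config == *own && event.observers.is_some() {
        Action::ReuseObservers
    } else {
        Action::Rerun
    }
}
```

In this model, an ASan node receiving a testcase from a non-ASan node reruns the input, while two identically configured nodes skip the extra execution.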
Tutorial
In this chapter, we will build a custom fuzzer using the Lain mutator in Rust.
This tutorial will introduce you to writing extensions to LibAFL like Feedbacks and Testcase's metadata.
Introduction
Under Construction!
This section is under construction. Please check back later (or open a PR).
In the meantime, find the final Lain-based fuzzer in the fuzzers folder.
Advanced Features
In addition to core building blocks for fuzzers, LibAFL also has features for more advanced/niche fuzzing techniques. The following sections are dedicated to some of these features.
Binary-only Fuzzing with Frida
LibAFL supports different instrumentation engines for binary-only fuzzing. A potent cross-platform (Windows, MacOS, Android, Linux, iOS) option for binary-only fuzzing is Frida, a dynamic instrumentation tool.
In this section, we will talk about the components in fuzzing with libafl_frida
.
You can take a look at a working example in our fuzzers/frida_libpng
folder for Linux, and fuzzers/frida_gdiplus
for Windows.
Dependencies
If you are on Linux or OSX, you'll need libc++ for libafl_frida
in addition to libafl's dependencies.
If you are on Windows, you'll need to install the LLVM tools.
Harness & Instrumentation
LibAFL uses Frida's Stalker to trace the execution of your program and instrument your harness. Thus, you have to compile your harness to a dynamic library. Frida instruments your PUT after dynamically loading it.
In our frida_libpng example, we load the dynamic library and find the symbol to harness as follows:
let lib = libloading::Library::new(module_name).unwrap();
let target_func: libloading::Symbol<
    unsafe extern "C" fn(data: *const u8, size: usize) -> i32,
> = lib.get(symbol_name.as_bytes()).unwrap();
FridaInstrumentationHelper and Runtimes
To use the functionality that Frida offers, we'll first need to obtain a Gum object by calling Gum::obtain().
In LibAFL, we use the FridaInstrumentationHelper struct to manage Frida-related state. FridaInstrumentationHelper is a key component that sets up the Transformer that is used to generate the instrumented code. It also initializes the Runtimes that offer various instrumentations.
We have CoverageRuntime, which tracks edge coverage, AsanRuntime for address sanitizer, DrCovRuntime, which uses DrCov for coverage collection (to be imported into coverage tools like Lighthouse, bncov, dragondance, ...), and CmpLogRuntime for cmplog instrumentation.
All of these runtimes can be slotted into FridaInstrumentationHelper at build time.
Combined with any Runtime you'd like to use, you can initialize the FridaInstrumentationHelper like this:
let gum = Gum::obtain();
let frida_options = FridaOptions::parse_env_options();
let coverage = CoverageRuntime::new();
let mut frida_helper = FridaInstrumentationHelper::new(
    &gum,
    &frida_options,
    module_name,
    modules_to_instrument,
    tuple_list!(coverage),
);
Running the Fuzzer
After setting up the FridaInstrumentationHelper, you can obtain the pointer to the coverage map by calling map_mut_ptr().
let edges_observer = HitcountsMapObserver::new(StdMapObserver::from_mut_ptr(
    "edges",
    frida_helper.map_mut_ptr().unwrap(),
    MAP_SIZE,
));
You can then link this observer to FridaInProcessExecutor as follows:
let mut executor = FridaInProcessExecutor::new(
    &gum,
    InProcessExecutor::new(
        &mut frida_harness,
        tuple_list!(
            edges_observer,
            time_observer,
            AsanErrorsObserver::new(&ASAN_ERRORS)
        ),
        &mut fuzzer,
        &mut state,
        &mut mgr,
    )?,
    &mut frida_helper,
);
And finally, you can run the fuzzer.
See the frida_ examples in ./fuzzers for more information. For Linux or full-system targets, you can also play around with libafl_qemu, another binary-only tracer.
Concolic Tracing and Hybrid Fuzzing
LibAFL has support for concolic tracing based on the SymCC instrumenting compiler.
For those uninitiated, the following text attempts to describe concolic tracing from the ground up using an example. Then, we'll go through the relationship of SymCC and LibAFL concolic tracing. Finally, we'll walk through building a basic hybrid fuzzer using LibAFL.
Concolic Tracing by Example
Suppose you want to fuzz the following program:
fn target(input: &[u8]) -> i32 {
    match &input {
        // fictitious crashing input
        &[1, 3, 3, 7] => 1337,
        // standard error handling code
        &[] => -1,
        // representative of normal execution
        _ => 0
    }
}
A simple coverage-maximizing fuzzer that generates new inputs somewhat randomly will have a hard time finding the fictitious crashing input. Many techniques have been proposed to make fuzzing less random and to mutate the input more directly so that it flips specific branches, such as the ones involved in crashing the above program.
Concolic tracing allows us to construct an input that exercises a new path in the program (such as the crashing one in the example) analytically instead of stochastically (i.e., by guessing). In principle, concolic tracing works by observing all executed instructions in an execution of the program that depend on the input. To understand what this entails, we'll run an example with the above program.
First, we'll rewrite the program as simple if-then-else statements:
fn target(input: &[u8]) -> i32 {
    if input.len() == 4 {
        if input[0] == 1 {
            if input[1] == 3 {
                if input[2] == 3 {
                    if input[3] == 7 {
                        return 1337;
                    } else {
                        return 0;
                    }
                } else {
                    return 0;
                }
            } else {
                return 0;
            }
        } else {
            return 0;
        }
    } else {
        if input.len() == 0 {
            return -1;
        } else {
            return 0;
        }
    }
}
Next, we'll trace the program on the input [].
The trace would look like this:
Branch { // if input.len() == 4
    condition: Equals {
        left: Variable { name: "input_len" },
        right: Integer { value: 4 }
    },
    taken: false // This condition turned out to be false...
}
Branch { // if input.len() == 0
    condition: Equals {
        left: Variable { name: "input_len" },
        right: Integer { value: 0 }
    },
    taken: true // This condition turned out to be true!
}
Using this trace, we can easily deduce that we can force the program to take a different path by having an input of length 4 or having an input with non-zero length. We do this by negating each branch condition and analytically solving the resulting 'expression'. In fact, we can create these expressions for any computation and give them to an SMT-Solver that will generate an input that satisfies the expression (as long as such an input exists).
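To make the negate-and-solve step concrete, here is a minimal sketch in plain Rust. It models only the input_len == constant conditions from the trace above, and it replaces the SMT solver with a brute-force search over small lengths. Condition, Branch, holds and flip_each_branch are hypothetical names for this sketch; none of them is part of LibAFL or SymCC.

```rust
/// The only kind of condition that appears in the example trace.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Condition {
    /// `input_len == value`
    LenEquals(u64),
}

/// One recorded branch: the condition and whether it was true at runtime.
pub struct Branch {
    pub condition: Condition,
    pub taken: bool,
}

/// Evaluate a condition for a concrete input length.
fn holds(condition: Condition, len: u64) -> bool {
    match condition {
        Condition::LenEquals(value) => len == value,
    }
}

/// For every branch in the trace, look for an input length that follows the
/// recorded path up to that branch and then takes the opposite direction.
/// A real hybrid fuzzer would hand this query to an SMT solver; the linear
/// search up to `max_len` is only a stand-in for that step.
pub fn flip_each_branch(trace: &[Branch], max_len: u64) -> Vec<Option<u64>> {
    (0..trace.len())
        .map(|i| {
            (0..=max_len).find(|&len| {
                let prefix_matches = trace[..i]
                    .iter()
                    .all(|b| holds(b.condition, len) == b.taken);
                let flipped = holds(trace[i].condition, len) != trace[i].taken;
                prefix_matches && flipped
            })
        })
        .collect()
}
```

Applied to the two-branch trace above with max_len set to 8, the sketch yields length 4 for the first branch and length 1 (the smallest non-zero length) for the second, matching the two deductions made in the text.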
In hybrid fuzzing, we combine this tracing + solving approach with more traditional fuzzing techniques.
Concolic Tracing in LibAFL, SymCC and SymQEMU
The concolic tracing support in LibAFL is implemented using SymCC. SymCC is a compiler plugin for clang that can be used as a drop-in replacement for a normal C or C++ compiler. SymCC will instrument the compiled code with callbacks into a runtime that can be supplied by the user. These callbacks allow the runtime to construct a trace that is similar to the previous example.
SymCC and its Runtimes
SymCC ships with two runtimes:
- A 'simple' runtime that attempts to negate and analytically solve any branch conditions it comes across using Z3, and
- A QSym-based runtime, which does a bit more filtering on the expressions and also solves them using Z3.
The integration with LibAFL, however, requires you to BYORT (bring your own runtime) using the symcc_runtime crate.
This crate allows you to easily build a custom runtime out of the built-in building blocks or create entirely new runtimes with full flexibility.
Check out the symcc_runtime docs for more information on how to build your own runtime.
SymQEMU
SymQEMU is a sibling project to SymCC.
Instead of instrumenting the target at compile-time, it inserts instrumentation via dynamic binary translation, building on top of the QEMU emulation stack.
This means that using SymQEMU, any (x86) binary can be traced without the need to build in instrumentation ahead of time.
The symcc_runtime crate supports this use case, and runtimes built with symcc_runtime also work with SymQEMU.
Hybrid Fuzzing in LibAFL
The LibAFL repository contains an example hybrid fuzzer.
There are three main steps involved with building a hybrid fuzzer using LibAFL:
- Building a runtime,
- choosing an instrumentation method and
- building the fuzzer.
Note that the order of these steps is important. For example, we need to have a runtime ready before we can do instrumentation with SymCC.
Building a Runtime
Building a custom runtime can be done easily using the symcc_runtime crate.
Note that a custom runtime is a separate shared object file, which means that we need a separate crate for our runtime.
Check out the example hybrid fuzzer's runtime and the symcc_runtime docs for inspiration.
Instrumentation
There are two main instrumentation methods to make use of concolic tracing in LibAFL:
- Using a compile-time instrumented target with SymCC. This only works when the source is available for the target and the target is reasonably easy to build using the SymCC compiler wrapper.
- Using SymQEMU to dynamically instrument the target at runtime. This avoids building a separate instrumented target with concolic tracing instrumentation and so does not require source code.
It should be noted, however, that the 'quality' of the generated expressions can be significantly worse: SymQEMU generally produces many more, and significantly more convoluted, expressions than SymCC. Therefore, it is recommended to use SymCC over SymQEMU when possible.
Using SymCC
The target needs to be instrumented ahead of fuzzing using SymCC.
How exactly this is done does not matter.
However, the SymCC compiler needs to be made aware of the location of the runtime that it should instrument against.
This is done by setting the SYMCC_RUNTIME_DIR environment variable to the directory which contains the runtime (typically the target/(debug|release) folder of your runtime crate).
The example hybrid fuzzer instruments the target in its build.rs build script.
It does this by cloning and building a copy of SymCC and then using this version to instrument the target.
The symcc_libafl crate contains helper functions for cloning and building SymCC.
Make sure you satisfy the build requirements of SymCC before attempting to build it.
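As a rough sketch of the SYMCC_RUNTIME_DIR step, the following standard-library-only code prepares (but does not run) a compiler invocation with the variable set. The functions runtime_dir and symcc_compile_command are hypothetical helpers for this sketch, not part of symcc_libafl. The location of the SymCC compiler wrapper is passed in by the caller, and the assumption that it accepts a source file followed by -o and an output path, like a regular C compiler, should be checked against your SymCC build.

```rust
use std::path::PathBuf;
use std::process::Command;

/// Directory that holds the built runtime: the `target/debug` or
/// `target/release` folder of the runtime crate.
pub fn runtime_dir(runtime_crate: &str, release: bool) -> PathBuf {
    let profile = if release { "release" } else { "debug" };
    PathBuf::from(runtime_crate).join("target").join(profile)
}

/// Prepare a compiler invocation that instruments `source` against the
/// custom runtime. `symcc` is the path to the SymCC compiler wrapper
/// (an assumption of this sketch; adapt it to your setup).
pub fn symcc_compile_command(
    symcc: &str,
    runtime_crate: &str,
    release: bool,
    source: &str,
    output: &str,
) -> Command {
    let mut cmd = Command::new(symcc);
    // Tell SymCC where to find the runtime it should instrument against.
    cmd.env("SYMCC_RUNTIME_DIR", runtime_dir(runtime_crate, release));
    cmd.arg(source).arg("-o").arg(output);
    cmd
}
```

A build script could call symcc_compile_command(...).status() and fail the build if the compiler returns a non-zero exit code.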
Using SymQEMU
Build SymQEMU according to its build instructions.
By default, SymQEMU looks for the runtime in a sibling directory.
Since we don't have a runtime there, we need to explicitly set the --symcc-build argument of the configure script to the path of your runtime.
Building the Fuzzer
No matter the instrumentation method, the interface between the fuzzer and the instrumented target should now be consistent.
The only difference between using SymCC and SymQEMU should be the binary that represents the target:
In the case of SymCC, it will be the binary that was built with instrumentation; with SymQEMU, it will be the emulator binary (e.g. x86_64-linux-user/symqemu-x86_64), followed by your uninstrumented target binary and its arguments.
You can use the CommandExecutor to execute your target (example).
When configuring the command, make sure you pass the SYMCC_INPUT_FILE environment variable (set to the input file path) if your target reads input from a file (instead of standard input).
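The difference between the two command lines can be sketched with std::process::Command alone. The helper target_command is a hypothetical function written for this illustration; it is not LibAFL's CommandExecutor API, and it only builds the command without running it.

```rust
use std::process::Command;

/// Build the command that runs the traced target.
///
/// * `emulator`: `Some(path)` for SymQEMU (the emulator binary comes first,
///   followed by the uninstrumented target), `None` for a SymCC-built binary.
/// * `input_file`: path the target reads its input from; exported as
///   `SYMCC_INPUT_FILE` as described in the text.
pub fn target_command(
    emulator: Option<&str>,
    target: &str,
    target_args: &[&str],
    input_file: &str,
) -> Command {
    let mut cmd = match emulator {
        Some(emu) => {
            let mut c = Command::new(emu);
            c.arg(target);
            c
        }
        None => Command::new(target),
    };
    cmd.args(target_args);
    cmd.env("SYMCC_INPUT_FILE", input_file);
    cmd
}
```

For SymQEMU, the call would look like target_command(Some("x86_64-linux-user/symqemu-x86_64"), "./target", &["/tmp/input"], "/tmp/input"), where ./target and /tmp/input are placeholder paths. For SymCC, pass None and the instrumented binary as the target.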
Serialization and Solving
While it is perfectly possible to build a custom runtime that also performs the solving step of hybrid fuzzing in the context of the target process, the intended use of the LibAFL concolic tracing support is to serialize the (filtered and pre-processed) branch conditions using the TracingRuntime.
This serialized representation can be deserialized in the fuzzer process for solving using a ConcolicObserver wrapped in a ConcolicTracingStage, which will attach a ConcolicMetadata to every Testcase.
The ConcolicMetadata can be used to replay the concolic trace and to solve the conditions using an SMT-Solver.
Most use-cases involving concolic tracing, however, will need to define some policy around which branches they want to solve.
The SimpleConcolicMutationalStage can be used for testing purposes.
It will attempt to solve all branches, like the original simple backend from SymCC, using Z3.
Example
The example fuzzer shows how to use the ConcolicTracingStage together with the SimpleConcolicMutationalStage to build a basic hybrid fuzzer.
Using LibAFL in no_std environments
It is possible to use LibAFL in no_std environments, e.g. on custom platforms like microcontrollers, kernels, hypervisors, and more.
You can simply add LibAFL to your Cargo.toml file:
libafl = { path = "path/to/libafl/", default-features = false }
Then build your project, e.g. for aarch64-unknown-none, using:
cargo build --no-default-features --target aarch64-unknown-none
Use custom timing
The minimum amount of support LibAFL needs for a no_std environment is a monotonically increasing timestamp.
For this, anywhere in your project, you need to implement the external_current_millis function, which returns the current time in milliseconds.
// Assume this is a clock source from a custom stdlib that you want to use; it returns the current time in seconds.
int my_real_seconds(void)
{
    return *CLOCK;
}
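Here, we use it in Rust. external_current_millis is then called from LibAFL.
Note that it needs to be no_mangle in order to get picked up by LibAFL at link time:
extern "C" {
    fn my_real_seconds() -> i32; // the C clock source shown above
}
#[no_mangle]
pub extern "C" fn external_current_millis() -> u64 {
    // Widen to u64 before multiplying so the result matches the return type.
    unsafe { (my_real_seconds() as u64) * 1000 }
}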
See ./fuzzers/baby_no_std for an example.
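If your clock source is not guaranteed to be monotonic, one possible way to meet the requirement is to clamp the value before returning it. The sketch below uses only core, so it also compiles in no_std crates on targets that provide 64-bit atomics. The function seconds_to_monotonic_millis and the static LAST_MILLIS are hypothetical names for this sketch and are not provided by LibAFL.

```rust
use core::sync::atomic::{AtomicU64, Ordering};

/// Largest timestamp handed out so far, in milliseconds.
static LAST_MILLIS: AtomicU64 = AtomicU64::new(0);

/// Convert a raw clock reading in seconds to milliseconds and make sure the
/// returned value never goes backwards, even if the raw clock does.
pub fn seconds_to_monotonic_millis(seconds: u64) -> u64 {
    // Saturate instead of wrapping around on very large readings.
    let millis = seconds.saturating_mul(1000);
    // `fetch_max` stores the larger value and returns the previous one.
    let previous = LAST_MILLIS.fetch_max(millis, Ordering::Relaxed);
    previous.max(millis)
}
```

An external_current_millis implementation could then return seconds_to_monotonic_millis applied to the raw clock reading instead of the plain multiplication.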
Snapshot Fuzzing in Nyx
Nyx supports both source-based and binary-only fuzzing.
Currently, libafl_nyx only supports AFL++'s instruction type. To install AFL++, you can use sudo apt install aflplusplus, or compile it from source:
git clone https://github.com/AFLplusplus/AFLplusplus
cd AFLplusplus
make all # this will not compile afl's additional extensions
Then you should compile the target with the afl++ compiler wrapper:
export CC=afl-clang-fast
export CXX=afl-clang-fast++
# the following line depends on your target
./configure --enable-shared=no
make
For binary-only fuzzing, Nyx uses Intel PT (Intel® Processor Trace). You can find the list of supported CPUs at https://www.intel.com/content/www/us/en/support/articles/000056730/processors.html.
Preparing the Nyx working directory
This step packs the target into Nyx's kernel. Don't worry, we have a template shell script in our example. The meaning of each parameter is listed in the comments below:
git clone https://github.com/nyx-fuzz/packer
# Arguments, in order:
#   ./libxml2/xmllint    your target binary
#   /tmp/nyx_libxml2     the Nyx work directory
#   afl                  instruction type
#   instrumentation
#   -args "/tmp/input"   the arguments of the program; each run executes `xmllint /tmp/input`
#   -file "/tmp/input"   the input is written to `/tmp/input`; without `-file`, it is passed through stdin
python3 "./packer/packer/nyx_packer.py" \
    ./libxml2/xmllint \
    /tmp/nyx_libxml2 \
    afl \
    instrumentation \
    -args "/tmp/input" \
    -file "/tmp/input" \
    --fast_reload_mode \
    --purge || exit
Then, you can generate the config file:
python3 ./packer/packer/nyx_config_gen.py /tmp/nyx_libxml2/ Kernel || exit
Standalone fuzzing
In the example fuzzer, you first need to run ./setup_libxml2.sh. It will prepare your target and create your Nyx work directory in /tmp/nyx_libxml2. After that, you can start to write your code.
First, create the NyxHelper:
let share_dir = Path::new("/tmp/nyx_libxml2/");
let cpu_id = 0; // use the first CPU
let parallel_mode = false; // disable parallel mode
// The last parameter is not needed in standalone mode, so we pass None.
let mut helper = NyxHelper::new(share_dir, cpu_id, true, parallel_mode, None).unwrap();
Then, fetch trace_bits, create an observer and the NyxExecutor:
let observer = unsafe { StdMapObserver::from_mut_ptr("trace", helper.trace_bits, helper.map_size) };
let mut executor = NyxExecutor::new(&mut helper, tuple_list!(observer)).unwrap();
Finally, use them normally and pass them into fuzzer.fuzz_loop(&mut stages, &mut executor, &mut state, &mut mgr) to start fuzzing.
Parallel fuzzing
In the example fuzzer, you first need to run ./setup_libxml2.sh as described before.
Parallel fuzzing relies on the Launcher, so the spawn logic should be written in the scope of the anonymous function run_client:
let mut run_client = |state: Option<_>, mut restarting_mgr, _core_id: usize| {}
In run_client, you need to create the NyxHelper first:
let share_dir = Path::new("/tmp/nyx_libxml2/");
let cpu_id = _core_id as u32;
let parallel_mode = true;
let mut helper = NyxHelper::new(
    share_dir,     // Nyx work directory
    cpu_id,        // current CPU id
    true,          // enable snap_mode
    parallel_mode, // enable parallel mode
    // The CPU id of the main instance. There is only one main instance;
    // all other instances are treated as secondaries.
    Some(parent_cpu_id.id as u32),
)
.unwrap();
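To illustrate how the NyxHelper arguments above relate to each other, the following plain-Rust sketch derives them per client from the list of cores. NyxClientSettings and settings_for_core are hypothetical names that do not exist in libafl_nyx, and choosing the first core in the list as the main instance is an assumption made for this sketch.

```rust
/// The per-client values passed to the helper in parallel mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NyxClientSettings {
    pub cpu_id: u32,
    pub parallel_mode: bool,
    pub parent_cpu_id: Option<u32>,
}

impl NyxClientSettings {
    /// The main instance is the one running on the parent CPU itself.
    pub fn is_main(&self) -> bool {
        self.parent_cpu_id == Some(self.cpu_id)
    }
}

/// Derive the settings for the client running on `core_id`.
/// Assumption of this sketch: the first entry in `cores` hosts the main
/// instance, and every other core runs a secondary.
pub fn settings_for_core(cores: &[usize], core_id: usize) -> Option<NyxClientSettings> {
    let parent = *cores.first()?;
    if !cores.contains(&core_id) {
        return None;
    }
    Some(NyxClientSettings {
        cpu_id: core_id as u32,
        parallel_mode: true,
        parent_cpu_id: Some(parent as u32),
    })
}
```

With cores 2, 3 and 5, the client on core 2 is the main instance, while the clients on cores 3 and 5 are secondaries that point back to CPU 2.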
Then you can fetch the trace_bits and create an observer and the NyxExecutor:
let observer = unsafe { StdMapObserver::from_mut_ptr("trace", helper.trace_bits, helper.map_size) };
let mut executor = NyxExecutor::new(&mut helper, tuple_list!(observer)).unwrap();
Finally, build a Launcher as usual to start fuzzing:
match Launcher::builder()
    .shmem_provider(shmem_provider)
    .configuration(EventConfig::from_name("default"))
    .monitor(monitor)
    .run_client(&mut run_client)
    .cores(&cores)
    .broker_port(broker_port)
    // .stdout_file(Some("/dev/null"))
    .build()
    .launch()
{
    Ok(()) => (),
    Err(Error::ShuttingDown) => println!("Fuzzing stopped by user. Good bye."),
    Err(err) => panic!("Failed to run launcher: {err:?}"),
}