Leaf
Concolic execution for Rust through MIR instrumentation.
Welcome to project Leaf, a tool to perform dynamic symbolic execution for Rust programs. Leaf helps you with tracking the constraints put on particular variables of interest during the execution of the program.
Leaf aims to be robust, extensible, and easily-integrable within real world Rust testing stacks.
Please refer to Introduction for more details about the workflow of the tool, and to Getting Started to learn how to use it.
Introduction
(TBD)
Getting Started
Although currently working with the first releases of Leaf, we aim to provide a rather straightforward workflow for the users.
Please follow the instruction below to try Leaf in your environment.
Installing Leaf
Requirements
- Rust (
rustup
) - Python
You can install Leaf using cargo
and by building the source code.
- Clone the repository.
$ git clone https://github.com/sfu-rsl/leaf.git $ cd leaf
- Install Leaf's compiler named as
leafc
using$ cargo install --path ./compiler
Performing Concolic Execution
As an instrumentation-based dynamic analyzer, Leaf instruments programs such that they expose information about their behavior during each execution. Therefore, concolic execution is achieved by compiling the target program and running the executable.
-
Pick a program you want to do concolic execution for. Some are available in
samples
directory of the source tree.fn main() { let x: u8 = 10; if x < 5 { println!("Hello, world!"); } }
-
Mark a variable interest as symbolic.
// samples/hello_world.rs use leaf::annotations::*; fn main() { let x: u8 = 10.mark_symbolic(); if x < 5 { println!("Hello, world!"); } }
-
Compile your target program using
leafc
as you would do withrustc
. (The first compilation takes longer, bear with it!)$ leafc ./hello_world.rs
-
Prepare the environment to observe the execution through the standard error by setting
LEAF_LOG
variable. e.g.,export LEAF_LOG="info"
-
Execute the compiled program.
$ ./hello_world
-
An output similar to the following is expected from the execution.
2024-12-10 00:40:55 INFO leafrt Initializing runtime library 2024-12-10 00:40:55 INFO leafrt::pri::basic::instance Initializing basic backend 2024-12-10 00:40:55 INFO leafrt::backends::basic::outgen Setting up binary output writing to directory: output 2024-12-10 00:40:55 INFO leafrt::pri::basic::instance Basic backend initialized 2024-12-10 00:40:55 INFO leafrt::backends::basic::sym_vars Added a new symbolic variable: <Var1: u8> = 10u8 2024-12-10 00:40:55 INFO leafrt::trace::log Notified about constraint {!(<(<Var1: u8>, 5u8))} at step Def(0:5)[2] 2024-12-10 00:40:55 INFO leafrt::outgen Found a solution: { "1": 0u8, }
The logs include information about the constraints put on the variables marked as symbolic for the path taken in the execution.
-
Given the current default configuration, a folder named
leaf_out
also gets generated which contains alternative values for the symbolic variables that should cause the execution take other paths than the one taken, which we call diverging inputs.$ ls ./leaf_out 0.bin
More details about each step is provided in the rest of the book.
Recipes
(TBD)
Building Cargo Packages
As mentioned before, you can look at leafc
as a wrapper around rustc
and
it should be work in any command using rustc
.
Building packages is no exception. As supported by cargo
you can set
build.rust
configuration key
to override the compiler.
$ export RUSTC=leafc
$ cargo build
Diverging Input Generation
Each instance of concolic execution of a program, records a trace of the constraints put on symbolic variables at each step of the execution. Conditional branches are the major source of these constraints and whether they are held or not determines the target of the branch. Therefore, concolic execution can be used to find concrete values for the symbolic variables such that the execution diverges at conditional branches compared to the previous execution. We refer to these concrete values as diverging inputs, counterexamples, or generally answers.
The current default configuration of Leaf tries to find a diverging input whenever
a new constraint is observed.
For instance, in the following program, the input (x = 10
) does not satisfy
the branch condition (x < 5
), which the backend reports as {!(<(<Var1: u8>, 5u8))}
.
It tries to find an input that would satisfy it (so the execution would diverge at this point),
and reports back value 0u8
as a possible one.
Diverging Standard Input Generation
If all symbolic variables are from u8
, the default configuration puts the found answers
in binary files in which each byte corresponds to a symbolic variable ordered by
when they were marked as symbolic.
This is mainly meant for situations where we want to mark a file symbolic, e.g., standard input.
Although the backend (and execution of the instrumented program) is enough to obtain the diverging standard input, the one-time orchestrator is provided to facilitate this process with further control.
You can install it by running the following command in Leaf's root folder.
leaf$ cargo install --path ./orchestrator
Then you provide the path to the instrumented program and the desired path to put the diverging inputs at. For example,
$ leafo_onetime --program ./hello_world --outdir ./next
next/diverging_0.bin
It runs the target program, and prints the names of the files generated as diverging input.
Pure Concolic Testing
By repeating the above procedure for each generated input, more possible paths in the target program will be covered. We use the term pure concolic testing, as used by SymCC for this method of testing.
If you are interested in finding possible crashes in your program using pure concolic testing,
the project comes with a utility named leaff_pure_conc
, which runs the loop for programs and captures those that cause a crash
in them. Currently, it only supports programs with symbolic standard input (uses the one-time orchestrator).
To run pure concolic testing loop for a target:
- Install the tool.
leaf$ cargo install --path ./integration/libafl/fuzzers/pure_concolic
- Build your program with
leafc
. - Pass the executable path to the tool. e.g.,
$ leaff_pure_conc --conc-program ./hello_world
- The loop stops if no more new distinct input is found to be given to the program (can possibly run forever).
We recommend looking at the options available for tuning the loop by passing --help
.
Fuzzing
One of the use cases of concolic execution, which is demonstrated to be effective, is hybrid fuzzing, in which fuzzing is aided with solver-found inputs generated by symbolic execution to take certain paths inside the program that other techniques are inefficient to find.
Leaf is aimed to be suitable for this purpose and comes with a built-in support for LibAFL,
a customizable fuzzing framework with modern architecture written in Rust.
Crate libafl_leaf
contains facilities for using Leaf-instrumented programs with
the fuzzers written using this library.
Hybrid Fuzzing for libFuzzer
In an abstract manner, hybrid fuzzing for LibAFL-based fuzzers is achievable using a stage that generates diverging inputs from the current test case. This stage should perform the concolic execution using the current test case to derive the diverging inputs and offer them to the fuzzer for evaluation. Thus, the following steps are presumable for an execution-based concolic executor like Leaf.
- Build an executable equivalent to the fuzz target, which is suitable for concolic execution.
- Define a mutator stage that runs the built executable and obtains new inputs.
- Add the stage to the fuzzer.
The mentioned ingredients are provided by Leaf;
leafc
instruments your target program,
leafo_onetime
helps with collecting the diverging inputs,
and libafl_leaf
provides the stage.
As libFuzzer
(through cargo-fuzz
) is one the most-used tools to perform fuzzing
for Rust projects, a rudimentary support is also provided for harnesses
written based on libfuzzer-sys
to upgrade them to a hybrid fuzzer.
It is developed as an extension of LibAFL's implementation of libFuzzer
, so
the same instructions and options
apply.
Recipe
With an understanding of the general procedure above, you can follow the instruction below to upgrade your existing fuzzer to a hybrid one.
- Prerequisites
-
The one-time orchestrator (
leafo_onetime
) is installed in your environment. If not, install it similarly toleafc
using the following command in Leaf's root folder.leaf$ cargo install --path ./orchestrator
-
You have a fuzzer written using
libfuzzer-sys
. We assume it is namedfuzz_target_1
and has the following template.#![allow(unused)] #![no_main] fn main() { use libfuzzer_sys::fuzz_target; fuzz_target!(|data: &[u8]| { // fuzzed code goes here }); }
-
-
Replace
libfuzzer-sys
source to Leaf's implementation inCargo.toml
of your fuzz project.# From libfuzzer-sys = { version = "...", features = ["your", "features", "here"] } # To libfuzzer-sys = { git = "https://github.com/sfu-rsl/leaf.git", package = "libafl_libfuzzer", features = ["your", "features", "here"]}
-
Change
fuzz_target
macro invocation tohybrid_fuzz_target
, and makeno_main
attribute conditional based on compilation withleafc
.#![allow(unused)] #![cfg_attr(not(leafc), no_main)] fn main() { use libfuzzer_sys::hybrid_fuzz_target; hybrid_fuzz_target!(|data: &[u8]| { // fuzzed code goes here }); }
(
hybrid_fuzz_target
additionally writes a program with amain
function that reads the whole standard input, marks it as symbolic, and passes to the closure.) -
Build your fuzz target with
leafc
in a separate cargo target directory like below.fuzz$ RUSTC=leafc cargo build --bin fuzz_target_1 --target-dir ./target/leaf
-
Build your fuzzer normally, e.g.,
fuzz$ cargo fuzz build fuzz_target_1
-
Run your fuzzer with the additional argument
conc_program
which points to the instrumented executable built usingleafc
.fuzz$ cargo fuzz run fuzz_target_1 -- -conc_program=./target/leaf/debug/fuzz_target_1
Please refer to the technical documentation for further details about the components and steps mentioned above.
Configurations
(TBD)