Mutations

RustMizan pairs the dataset with an extensible mutation framework. Every mutation is semantics-preserving: it changes the code's syntax without altering program behavior, so the underlying vulnerability stays intact while its surface form differs.

Mutations serve two purposes. Contamination mutations break token-level memorization to test whether a model recalls a benchmark rather than reasoning about it. Robustness mutations inject misleading cues to test whether a model resists surface-level deception.

For the before/after form of each mutation, see Mutation specification. For the underlying Rust AST tool, see mizan-mut.

Categories

Mutations are grouped into three categories, which map to the dataset variants used on the Leaderboard.

Contamination (benign)

Strip or rewrite surface syntax so memorized snippets no longer match.

Mutation                   Description
remove-comments            Remove all Rust comments
format-compact             Apply compact rustfmt formatting
format-expanded            Apply expanded rustfmt formatting
mizan-mut-for-to-while     Convert for loops to while loops
mizan-mut-while-to-loop    Convert while loops to loop blocks with breaks
mizan-mut-if-else-reorder  Reorder if-else branches by negating conditions
benign-comments            Insert neutral comments around vulnerable lines
benign-blocks              Insert neutral code blocks around vulnerable lines
benign-rename-fn           Rename functions to neutral names (e.g. fn_1_abc123)
benign-rename-var          Rename variables to neutral names (e.g. var_1_xyz789)
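
To make "semantics-preserving" concrete for the loop rewrites, the sketch below hand-writes three equivalent forms of one function, roughly the kind of change mizan-mut-for-to-while and mizan-mut-while-to-loop aim for. The function names and the exact shape of the rewritten loops are assumptions made for illustration; the tool's real output is documented in the Mutation specification.

```rust
// Original form: a `for` loop over an index range.
fn sum_for(values: &[u32]) -> u32 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

// Hypothetical result of a for-to-while rewrite:
// the range becomes an explicit counter and loop condition.
fn sum_while(values: &[u32]) -> u32 {
    let mut total = 0;
    let mut i = 0;
    while i < values.len() {
        total += values[i];
        i += 1;
    }
    total
}

// Hypothetical result of a while-to-loop rewrite:
// the condition becomes a negated check followed by `break`.
fn sum_loop(values: &[u32]) -> u32 {
    let mut total = 0;
    let mut i = 0;
    loop {
        if !(i < values.len()) {
            break;
        }
        total += values[i];
        i += 1;
    }
    total
}
```

All three functions return the same result for any input slice, while sharing few tokens with one another.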

Robustness (malignant)

Inject adversarial cues that falsely suggest the code is safe.

Mutation              Description
malignant-comments    Insert comments falsely suggesting the code is safe
malignant-blocks      Insert code blocks falsely suggesting safety
malignant-rename-fn   Rename functions to safety-implying names (e.g. safe_fn_1)
malignant-rename-var  Rename variables to safety-implying names (e.g. secure_var_1)
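
As a rough illustration of an adversarial cue, the sketch below shows one unchecked read before and after hand-applied versions of malignant-comments, malignant-rename-fn, and malignant-rename-var. The original function, the comment wording, and the placement of the comment are assumptions; only the naming patterns safe_fn_1 and secure_var_1 come from the table above.

```rust
// Original: the index is never checked, so an out-of-range `idx` is
// undefined behavior.
fn read_at(buf: &[u8], idx: usize) -> u8 {
    unsafe { *buf.get_unchecked(idx) }
}

// After the hypothetical malignant mutations. The comment and the names
// claim safety, but the body is unchanged and still performs no bounds check.
// SAFETY: the index is validated before this call, so the read is in bounds.
fn safe_fn_1(secure_var_1: &[u8], secure_var_2: usize) -> u8 {
    unsafe { *secure_var_1.get_unchecked(secure_var_2) }
}
```

A model that relies on the comment or the names would label the second function safe even though both functions behave identically.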

Rust-specific

Structural transformations that leverage Rust syntax, implemented as AST transformations in mizan-mut.

Mutation                       Description
derive-reorder                 Reorder traits in #[derive(...)] attributes
trait-bound-reorder            Reorder trait bounds in where clauses
use-reorder                    Reorder items in use statements
arithmetic-identity            Wrap integer literals with a multiplication identity (N * 1)
explicit-where                 Add an explicit where clause to a signature
explicit-where-to-type-params  Move simple type bounds from a where clause into the type parameters
rename-lifetime                Rename lifetime parameters consistently
impl-trait-to-generic          Convert impl Trait bounds into generic parameters
option-wrap                    Wrap expressions in a redundant Some(...).unwrap()
maybeuninit-wrap               Round-trip a value through MaybeUninit<T>
manuallydrop-wrap              Wrap an owned variable in ManuallyDrop, then unwrap it
explicit-return                Convert implicit returns to explicit return statements
unreachable-panic              Guard a function body with an unreachable panic!() arm
repeated-shadowing             Add redundant repeated shadows for let bindings

See the Mutation specification for before/after examples.
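
For a quick feel of a few of these transformations, the sketch below hand-applies arithmetic-identity, option-wrap, explicit-return, and manuallydrop-wrap to two small functions. The functions and the exact rewritten forms are assumptions for illustration and may differ from what mizan-mut emits.

```rust
use std::mem::ManuallyDrop;

// Original function.
fn scale(x: i32) -> i32 {
    let factor = 4;
    x * factor
}

// Hand-mutated version (hypothetical output shape).
fn scale_mutated(x: i32) -> i32 {
    // arithmetic-identity: the literal 4 becomes 4 * 1.
    let factor = 4 * 1;
    // option-wrap: the expression takes a redundant trip through Some(..).unwrap().
    // explicit-return: the tail expression becomes a `return` statement.
    return Some(x * factor).unwrap();
}

// Original function that takes ownership of its argument.
fn name_len(name: String) -> usize {
    name.len()
}

// manuallydrop-wrap: the owned value is wrapped in ManuallyDrop and then
// unwrapped again, so it is still dropped normally at the end of the function.
fn name_len_mutated(name: String) -> usize {
    let wrapped = ManuallyDrop::new(name);
    let name = ManuallyDrop::into_inner(wrapped);
    name.len()
}
```

Each mutated function returns the same value as its original for every input, which is the property the framework's validation step is meant to protect.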

Mutations prefixed with mizan-mut- and all rename mutations call the mizan-mut binary, which must be installed and on your PATH.

The pipeline

For each sample, the framework backs up the original, applies the mutation, then validates that the result still compiles and that the ground truth is preserved. If any step fails, it rolls back to the backup. Successful mutations are saved; the rest are logged.
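
The control flow described above can be sketched as follows. This is a minimal in-memory model, not the framework's code: the Sample and Outcome types and the run_mutation function are hypothetical, and the validate closure stands in for the real checks (compilation and ground-truth preservation).

```rust
// Hypothetical in-memory sample: source text plus 1-indexed vulnerable lines.
#[derive(Clone, Debug, PartialEq)]
struct Sample {
    source: String,
    vulnerable_lines: Vec<usize>,
}

// What happened to one mutation attempt.
#[derive(Debug, PartialEq)]
enum Outcome {
    Saved,
    RolledBack(String), // reason to log
}

// Back up, mutate, validate, and roll back on any failure.
fn run_mutation<M, V>(sample: &mut Sample, mutate: M, validate: V) -> Outcome
where
    M: Fn(&mut Sample) -> Result<(), String>,
    V: Fn(&Sample, &Sample) -> Result<(), String>,
{
    // 1. Back up the original.
    let backup = sample.clone();

    // 2. Apply the mutation, then 3. validate the result against the backup.
    let result = match mutate(&mut *sample) {
        Ok(()) => validate(&backup, &*sample),
        Err(reason) => Err(reason),
    };

    match result {
        Ok(()) => Outcome::Saved,
        Err(reason) => {
            // 4. Restore the backup so a failed mutation leaves no trace.
            *sample = backup;
            Outcome::RolledBack(reason)
        }
    }
}
```

Taking the backup before the mutation runs means a failure in either the mutation or the validation leaves the sample exactly as it was.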

[Figure: Mutation pipeline]

Ground-truth tracking

Mutations preserve the vulnerability itself but can invalidate the annotations that locate it: renaming a function breaks annotations that reference it by name, and inserting code shifts line numbers. The framework keeps annotations accurate with three mechanisms.

  • Marker tracking. For most mutations, a unique comment marker (e.g. // MIZAN_MARKER_vuln0001) is inserted before each vulnerable line. After the mutation, the marker's new position gives the corrected line number, and the marker is removed.
  • Content-based tracking. AST-based mizan-mut-* mutations parse and regenerate the code, which drops all comments (including markers), so vulnerable lines are tracked by their content instead. If a vulnerable line appears more than once or cannot be found after mutation, that file is excluded and the mutation is re-applied. Such cases are recorded as partial_mutations.
  • Rename tracking. Rename mutations legitimately change line content, so the validator allows content differences for them.
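
The first two mechanisms can be sketched on plain strings as below. This is an illustrative model only: the function names are hypothetical, the marker format follows the MIZAN_MARKER_vuln0001 example above, and matching content after trimming whitespace is an assumption about how lines are compared.

```rust
const MARKER_PREFIX: &str = "// MIZAN_MARKER_";

// Marker tracking, step 1: insert a marker line before each vulnerable line.
// `vulnerable_lines` holds 1-indexed line numbers.
fn insert_markers(source: &str, vulnerable_lines: &[usize]) -> String {
    let mut out = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        let line_no = idx + 1;
        if let Some(pos) = vulnerable_lines.iter().position(|&l| l == line_no) {
            out.push(format!("{}vuln{:04}", MARKER_PREFIX, pos + 1));
        }
        out.push(line.to_string());
    }
    out.join("\n")
}

// Marker tracking, step 2: after the mutation, read the new line numbers
// off the markers and remove the markers from the source.
fn resolve_markers(mutated: &str) -> (String, Vec<usize>) {
    let mut clean: Vec<&str> = Vec::new();
    let mut lines = Vec::new();
    for line in mutated.lines() {
        if line.trim_start().starts_with(MARKER_PREFIX) {
            // The vulnerable line is the next line kept in the clean output.
            lines.push(clean.len() + 1);
        } else {
            clean.push(line);
        }
    }
    (clean.join("\n"), lines)
}

// Content-based tracking: find the one line whose trimmed content matches.
// Returns None when the line is missing or appears more than once.
fn locate_by_content(mutated: &str, content: &str) -> Option<usize> {
    let mut hits = mutated
        .lines()
        .enumerate()
        .filter(|(_, line)| line.trim() == content.trim())
        .map(|(idx, _)| idx + 1);
    match (hits.next(), hits.next()) {
        (Some(line_no), None) => Some(line_no),
        _ => None,
    }
}
```

The None case of locate_by_content corresponds to the situation described above in which a file is excluded and recorded under partial_mutations.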

[Figure: Ground-truth tracking]

Output files

  • Updated mizan.json with corrected vulnerable line numbers.
  • mizan_mutations.json logging mutations_applied, skipped (mutations or samples that were skipped), and partial_mutations.

A "successful" mutation means the process completed without error, not that the code necessarily changed. For example, applying mizan-mut-for-to-while to code with no for loops succeeds without making any changes.

Ordering caveats

Mutations are applied in the order you list them. Be deliberate:

  • Don't run mizan-mut-for-to-while and then mizan-mut-while-to-loop unless you intend to turn for loops into loop blocks.
  • Don't run benign-comments then remove-comments; the inserted comments will be stripped.
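
The second caveat can be reproduced with toy stand-ins. The two functions below are simplified imitations written for this example, not the framework's benign-comments and remove-comments, and apply_in_order is a hypothetical helper that mimics applying mutations in list order.

```rust
// Toy stand-in for benign-comments: put a neutral comment before every line.
fn toy_benign_comments(source: &str) -> String {
    source
        .lines()
        .flat_map(|line| ["// note", line])
        .collect::<Vec<_>>()
        .join("\n")
}

// Toy stand-in for remove-comments: drop whole-line `//` comments only.
fn toy_remove_comments(source: &str) -> String {
    source
        .lines()
        .filter(|line| !line.trim_start().starts_with("//"))
        .collect::<Vec<_>>()
        .join("\n")
}

// Apply each mutation to the output of the previous one, in list order.
fn apply_in_order(source: &str, mutations: &[fn(&str) -> String]) -> String {
    mutations
        .iter()
        .fold(source.to_string(), |current, mutate| {
            (*mutate)(current.as_str())
        })
}
```

With these stand-ins, running the comment insertion first and the removal second returns the input unchanged, while the reverse order keeps the inserted comments.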

To add a new mutation, see Add a mutation.