fix #1, #2, #7 and #9. Propose solutions for #4 #5 and #6 #10

Merged 2 commits on Nov 9, 2024

@jma-qb jma-qb commented Nov 9, 2024

Hello, I backported some patches I made to this version of Lain.
They should fix #1, #2, #7 and #9. They also address my comments on #4, #5 and #6 by favoring some mutations that I think are better.

I ran a small micro-benchmark on a very big structure that I can't share: a main struct containing only a list of "commands", each built from other structs and enums.
I also ran it on some dummy examples to show that the results differ little.

The benchmark looks like this:

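// Imports assumed from the crates this snippet appears to use; they were not in the original post.
use fnv::FnvHashSet;
use lain::prelude::*;
use rand::SeedableRng;
use rand_xoshiro::Xoshiro256StarStar;

const MAX_SIZE: usize = 11;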

fn main() {
    let rng1 = Xoshiro256StarStar::seed_from_u64(123456);
    let mut mutator = Mutator::new(rng1);

    let mut hash_set = FnvHashSet::default();
    let mut collision: usize = 0;
    let mut empty_collision: usize = 0;

    let mut instance = MyStruct::new_fuzzed(&mut mutator, None);

    let mut repartition = [0usize; MAX_SIZE];
    repartition[instance.data.len()] += 1;

    let total_loop = 1_000_000;

    for _ in 0..total_loop {
        instance.mutate(&mut mutator, None);

        let mut serialized_data = Vec::with_capacity(instance.serialized_size());
        instance.binary_serialize::<_, LittleEndian>(&mut serialized_data);
        repartition[instance.data.len()] += 1;

        if serialized_data.is_empty() {
            empty_collision += 1;
        } else if hash_set.contains(&serialized_data) {
            collision += 1;
        } else {
            hash_set.insert(serialized_data);
        }
    }
    println!(
        "collisions: {} ({}%)",
        collision,
        (collision as f64 / total_loop as f64) * 100.0
    );
    println!(
        "collisions empty: {} ({}%)",
        empty_collision,
        (empty_collision as f64 / total_loop as f64) * 100.0
    );
    println!("Size repartition of the main list:");
    for (i, num) in repartition.iter().enumerate() {
        println!("{i}: {}", (*num as f64 / total_loop as f64) * 100.0);
    }
}
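Since the real structure cannot be shared, here is a minimal std-only sketch of the same bookkeeping, for anyone who wants to plug in their own generator. The `Stats` struct and the `tally` function are hypothetical names introduced for illustration; they are not part of Lain.

```rust
use std::collections::HashSet;

/// Counters gathered over one benchmark run (hypothetical helper type).
#[derive(Debug, Default, PartialEq)]
struct Stats {
    collisions: usize,
    empty: usize,
    /// size_counts[n] = number of samples whose list had n elements.
    size_counts: Vec<usize>,
}

/// Tallies duplicate and empty serializations plus a list-length histogram.
/// Each sample is (list length, serialized bytes); `max_len` is the largest
/// list length expected.
fn tally<I>(samples: I, max_len: usize) -> Stats
where
    I: IntoIterator<Item = (usize, Vec<u8>)>,
{
    let mut stats = Stats {
        size_counts: vec![0; max_len + 1],
        ..Stats::default()
    };
    let mut seen: HashSet<Vec<u8>> = HashSet::new();
    for (len, bytes) in samples {
        stats.size_counts[len] += 1;
        if bytes.is_empty() {
            stats.empty += 1;
        } else if !seen.insert(bytes) {
            // `insert` returns false when the value was already present.
            stats.collisions += 1;
        }
    }
    stats
}

fn main() {
    let samples = vec![
        (1, vec![0xAA]),
        (0, vec![]),
        (1, vec![0xAA]), // duplicate of the first sample
        (2, vec![0xAA, 0xBB]),
    ];
    let stats = tally(samples, 2);
    println!("{stats:?}");
}
```

Empty outputs are counted separately from collisions, as in the benchmark, so that repeated empty serializations do not inflate the collision rate.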

With my private structure I get these results:

collisions: 28166 (2.8166%)
collisions with empty binary serialization: 3585 (0.3585%)
Size repartitions of the main list of commands:
0: 0.3585
1: 17.6358
2: 15.2746
3: 10.3906
4: 9.6714
5: 8.2377
6: 8.390400000000001
7: 6.4815
8: 6.7188
9: 4.8241
10: 12.0167

With a dummy structure whose top-level struct also contains only a Vec, like this:

pub struct MyStruct {
    #[lain(min = 0, max = 10)]
    pub data: Vec<MyInnerStruct>,
}
pub struct MyInnerStruct {
    pub tag: u32,
    #[lain(ignore)]
    pub len: u32,
}

The results are pretty similar. There are more collisions because the space of possible inputs is small, and I believe the rate matches the chance of generating a dangerous number:

collisions: 57062 (5.7062%)
collisions with empty binary serialization: 3626 (0.3626%)
Size repartitions of the main list:
0: 0.3626
1: 18.1247
2: 15.4408
3: 9.8559
4: 9.9487
5: 8.8891
6: 8.0154
7: 6.0756
8: 7.108200000000001
9: 4.5399
10: 11.639199999999999

The size distribution is uniform over sizes 0 to 9 (size 10 is never generated) when I replace the instance.mutate line in the loop with let instance = MyStruct::new_fuzzed.

collisions: 42233 (4.2233%)
collisions with empty binary serialization: 99726 (9.9726%)
Size repartitions of the main list:
0: 9.9726
1: 10.0299
2: 10.0398
3: 9.971
4: 9.9712
5: 10.005799999999999
6: 10.0024
7: 10.0438
8: 9.987
9: 9.9766
10: 0
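To put a number on "uniform", one rough check is the largest gap between any observed share and the ideal equal share. The sketch below applies it to the ten non-zero buckets of the new_fuzzed run above; `max_deviation_from_uniform` is a hypothetical helper, not something from Lain.

```rust
/// Largest absolute gap between any observed percentage and the uniform share.
fn max_deviation_from_uniform(percentages: &[f64]) -> f64 {
    let uniform = 100.0 / percentages.len() as f64;
    percentages
        .iter()
        .map(|p| (p - uniform).abs())
        .fold(0.0, f64::max)
}

fn main() {
    // Percentages for list sizes 0 through 9 from the new_fuzzed run above
    // (size 10 never appeared, so it is left out).
    let observed = [
        9.9726, 10.0299, 10.0398, 9.971, 9.9712,
        10.0058, 10.0024, 10.0438, 9.987, 9.9766,
    ];
    println!(
        "max deviation: {:.4} points",
        max_deviation_from_uniform(&observed)
    );
}
```

For these figures the gap stays around 0.04 percentage points, whereas the mutate runs differ from an equal share by several points per bucket.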

Next, I mutated this dummy structure, which looks a bit similar and could correspond to a TLV-encoded message:

pub struct MyStruct {
    pub tag: u32,
    #[lain(ignore)]
    pub len: u32,
    #[lain(min = 0, max = 10)]
    pub data: Vec<u8>,
}

The results are still similar to the first benchmark (note that it's impossible to serialize to something empty in that case):

collisions: 5330 (0.5329999999999999%)
collisions with empty binary serialization: 0 (0%)
Size repartitions of the main list of commands:
0: 0.3669
1: 17.455499999999997
2: 16.6173
3: 9.6759
4: 9.1338
5: 8.708200000000001
6: 8.2139
7: 6.8551
8: 6.7258
9: 4.884399999999999
10: 11.3633
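For reference, here is a hand-rolled sketch of the byte layout such a TLV record would have, which also shows why the serialization can never be empty: the tag and len fields alone take 8 bytes. This is not Lain's serializer; `encode_tlv` is a hypothetical function, and deriving `len` from the payload length is an assumption about how the ignored field would be filled in.

```rust
/// Encodes one record as tag (u32), len (u32), then the payload, little-endian.
fn encode_tlv(tag: u32, data: &[u8]) -> Vec<u8> {
    // Assumption: len is derived from the payload rather than fuzzed.
    let len = data.len() as u32;
    let mut out = Vec::with_capacity(8 + data.len());
    out.extend_from_slice(&tag.to_le_bytes());
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(data);
    out
}

fn main() {
    let empty_payload = encode_tlv(7, &[]);
    let two_bytes = encode_tlv(7, &[0xDE, 0xAD]);
    // Even with no payload, the header bytes are still emitted.
    println!("{} bytes, {} bytes", empty_payload.len(), two_bytes.len());
}
```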

@domenukk domenukk merged commit 68f6657 into AFLplusplus:main Nov 9, 2024
3 checks passed

Successfully merging this pull request may close these issues.

min/max fields are not always applied