Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

YJIT: implement call fuzzer script #9129

Merged
merged 13 commits into from
Dec 11, 2023
Merged

Conversation

maximecb
Copy link
Contributor

@maximecb maximecb commented Dec 5, 2023

Attempt to detect bugs in YJIT call implementation.

This is very basic at this point and so I'll make it a draft PR, but would like some suggestions on where to go next @XrXr

I was thinking I could compute the sum and/or product of the arguments to make sure that the generated methods produce the result we expect, signifying that arguments were passed correctly. I could also pass things like strings or objects that require heap pointers, to make sure that we track types correctly.

I'll try to make it as dynamic as possible, and add more tricky edge cases (suggestions welcome), but somewhat afraid that this script will become complicated and hard to follow fairly quickly 😅

Another potential approach to this would be to fuzz by concatenating random tokens, and then forking both a CRuby interpreter and a YJIT process, and checking that the output from both is the same. This is more inefficient in terms of compute and time

Sample output:

Iteration 997
Defining
"def m(p0, p1, p2, p3, p4, *rest, k0:, k1: 31, k2:, k3:)\nlocal0 = 0\nlocal1 = 1\nlocal2 = 2\nif p0 > 4; end\nif p1 > 4; end\nif p2 > 4; end\nif p3 > 4; end\nif p4 > 4; end\nchecksum = 0\nchecksum  = 1 * arg_val(p0)\nchecksum  = 2 * arg_val(p1)\nchecksum  = 3 * arg_val(p2)\nchecksum  = 4 * arg_val(p3)\nchecksum  = 5 * arg_val(p4)\nchecksum  = 1 * arg_val(k0)\nchecksum  = 2 * arg_val(k1)\nchecksum  = 3 * arg_val(k2)\nchecksum  = 4 * arg_val(k3)\nif block_given?; r = yield; checksum  = arg_val(r); end\nraise 'rest is not array' unless rest.kind_of?(Array)\nraise 'rest size not integer' unless rest.size.kind_of?(Integer)\nchecksum\nend"
Calling
"f.m(97, 75, 64, 91, 51, k0: 6, k1: 28, k2: 11, k3: 3) { 81 }"
checksum=1246

Iteration 998
Defining
"def m(p0, p1, p2, p3, p4, p5, p6, p7, p8 = 17, k0: 7, k1:, k2: 83, k3: 37, k4: 80, **kwrest)\nlocal0 = 0\nlocal1 = 1\nlocal2 = 2\nlocal3 = 3\nlocal4 = 4\nlocal5 = 5\nlocal6 = 6\nlocal7 = 7\nif p0 > 4; end\nif p1 > 4; end\nif p2 > 4; end\nif p3 > 4; end\nif p4 > 4; end\nif p5 > 4; end\nif p6 > 4; end\nif p7 > 4; end\nif p8 > 4; end\nchecksum = 0\nchecksum  = 1 * arg_val(p0)\nchecksum  = 2 * arg_val(p1)\nchecksum  = 3 * arg_val(p2)\nchecksum  = 4 * arg_val(p3)\nchecksum  = 5 * arg_val(p4)\nchecksum  = 6 * arg_val(p5)\nchecksum  = 7 * arg_val(p6)\nchecksum  = 8 * arg_val(p7)\nchecksum  = 9 * arg_val(p8)\nchecksum  = 1 * arg_val(k0)\nchecksum  = 2 * arg_val(k1)\nchecksum  = 3 * arg_val(k2)\nchecksum  = 4 * arg_val(k3)\nchecksum  = 5 * arg_val(k4)\nif block_given?; r = yield; checksum  = arg_val(r); end\nraise 'kwrest is not a hash' unless kwrest.kind_of?(Hash)\nraise 'kwrest size not integer' unless kwrest.size.kind_of?(Integer)\nchecksum\nend"
Calling
"f.m(39, 77, 13, 84, 52, 12, 16, 32, 42, k1: 48, k3: 66) { 22 }"
checksum=2684

Iteration 999
Defining
"def m(k0:, k1: 87, k2:, k3: 7, k4: 37, k5: 6, k6: 62, k7: 10, k8: 28)\nlocal0 = 0\nlocal1 = 1\nlocal2 = 2\nlocal3 = 3\nlocal4 = 4\nlocal5 = 5\nlocal6 = 6\nlocal7 = 7\nchecksum = 0\nchecksum  = 1 * arg_val(k0)\nchecksum  = 2 * arg_val(k1)\nchecksum  = 3 * arg_val(k2)\nchecksum  = 4 * arg_val(k3)\nchecksum  = 5 * arg_val(k4)\nchecksum  = 6 * arg_val(k5)\nchecksum  = 7 * arg_val(k6)\nchecksum  = 8 * arg_val(k7)\nchecksum  = 9 * arg_val(k8)\nif block_given?; r = yield; checksum  = arg_val(r); end\nchecksum\nend"
Calling
"f.m(k0: 94, k1: 75, k2: 82, k6: 15)"
checksum=1176

Code region size: 15,417,344
79.3 iterations/s
285,433 iterations/hour

Attempt to detect bugs in YJIT call implementation.
@maximecb maximecb self-assigned this Dec 5, 2023
@maximecb maximecb marked this pull request as draft December 5, 2023 21:50
@XrXr
Copy link
Member

XrXr commented Dec 6, 2023

This is looking pretty good!

I was thinking I could compute the sum and/or product of the arguments to make sure that the generated methods produce the result we expect, signifying that arguments were passed correctly.

Alternatively, you could make the method return its parameters as an array and assert that. Gives you more flexibility as to what you could pass, as you mentioned you might want to do.

Another potential approach to this would be to fuzz by concatenating random tokens, and then forking both a CRuby interpreter and a YJIT process, and checking that the output from both is the same.

I think the current approach should expose most problems, but if we want to compare agains the interpreter later, it seems easy to add. Running with --yjit-disable, we could do:

result = testbed.call
fork do
  RubyVM::YJIT.enable
  yjit_result = testbed.call
  # send result to parent
end
Process.wait
assert_eq(result, yjit_result)

... and add more tricky edge cases (suggestions welcome)

We've seen number of local variables in a method giving causing bugs, so we could fuzz that by adding unused locals. You can add block related constructs, like defining the method using define_method in block form rather using def, and/or yielding to blocks (has additional semantics on top of normal method parameter lists).

misc/call_fuzzer.rb Outdated Show resolved Hide resolved
@maximecb
Copy link
Contributor Author

maximecb commented Dec 6, 2023

We've seen number of local variables in a method giving causing bugs, so we could fuzz that by adding unused locals. You can add block related constructs, like defining the method using define_method in block form rather using def, and/or yielding to blocks (has additional semantics on top of normal method parameter lists).

Thanks for the suggestions. Making notes and will likely add some of these :)

I'm in the process of refactoring the code to make the call fuzzer more maintainable and powerful.

@maximecb maximecb marked this pull request as ready for review December 7, 2023 21:13
@maximecb maximecb requested a review from k0kubun December 7, 2023 21:13
@maximecb
Copy link
Contributor Author

maximecb commented Dec 7, 2023

I completed a first version of this script. I refactored it in a way that I think makes it easier to understand and extend with new functionality.

For the moment, with the experiments I've done, no errors have been detected, which is encouraging.

Next I would like to add the ability to define methods using define_method in block form as Alan suggested (can you link to some documentation/examples @XrXr ?).

Another thing I'm not yet really testing is rest/splat arguments.

@XrXr
Copy link
Member

XrXr commented Dec 7, 2023

This is the method doc for Module#define_method: https://docs.ruby-lang.org/en/3.2/Module.html#method-i-define_method
For examples you can take a look in test/ruby/test_method.rb and/or git grep "define_method(:". The gist of it is that the parameter list is the same as if you used def, but put between || instead of ().

@maximecb
Copy link
Contributor Author

maximecb commented Dec 7, 2023

Ty Alan!

@k0kubun
Copy link
Member

k0kubun commented Dec 7, 2023

Does this take a lot of time to run? If not, we could run this on CI too.

@maximecb
Copy link
Contributor Author

maximecb commented Dec 8, 2023

Does this take a lot of time to run? If not, we could run this on CI too.

Depends how many iterations you want to run. Right now it runs 80 iterations per second with YJIT. So I guess it would depend what our time budget is in terms of minutes of execution time. The other factor is that the tests are randomly generated, so non-deterministic by default. We could also make them deterministic by setting the random seed if we prefer.

At this time though, this test has not detected any new bugs. I may still try to test more things next week though.

@maximecb maximecb merged commit 3f25c08 into ruby:master Dec 11, 2023
98 of 99 checks passed
@maximecb maximecb deleted the yjit_call_fuzzer branch December 11, 2023 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants