Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizations #96

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Optimizations #96

wants to merge 5 commits into from

Conversation

turol
Copy link
Contributor

@turol turol commented Aug 6, 2024

A few cleanups, a benchmark script and some optimizations.

Disassembling a ~60MB binary on my Intel i7-12700H laptop:

    N           Min           Max        Median           Avg        Stddev
x  20         1.111         1.136         1.118        1.1192  0.0059701009
   20         1.093         1.115           1.1       1.10075  0.0058478516
Difference at 95.0% confidence
	-0.01845  /- 0.00378221
	-1.6485%  /- 0.337939%
	(Student's t, pooled s = 0.00590929)

@turol
Copy link
Contributor Author

turol commented Aug 9, 2024

Triggers UBSAN failures. You should probably have that in your CI...

@turol turol marked this pull request as draft August 9, 2024 11:59
@ianichitei
Copy link
Contributor

ianichitei commented Aug 9, 2024

Yes, the GitHub CI does not run any tests. It is on my TODO list, but haven't got the time to add that yet. Is UBSAN/ASAN triggered only with these changes?

I experimented with this and I'm not sure I'm running it right, or that I'm interpreting the results the right way.

For example, comparing a disasmtool compiled from 8b4cc48 with the one from this PR, I get:

$ ./benchmark.sh ./old ./new in.bin 10
x before.txt
  after.txt
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
|                       x                                                                                                                                                                       |
|x       x              x      x           x          x                                         x   *      x                                                                                    |
|         |________________________________M_____A________|_____________________________|_______________M__A________________________________________________|                                   |
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
    N           Min           Max        Median           Avg        Stddev
x  10         1.406         1.434         1.417        1.4186   0.010297788
   10         1.416         1.456         1.433        1.4339   0.012931014
Difference at 95.0% confidence
        0.0153  /- 0.0109827
        1.07853%  /- 0.774195%
        (Student's t, pooled s = 0.0116888)

Doesn't this mean that the new version is actually slower?

Compiled for release with gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, running on a Ubuntu 22.04.4 LTS (via WSL2 on Windows 11 23H2), with a 13th Gen Intel(R) Core(TM) i7-1370P CPU.

@turol
Copy link
Contributor Author

turol commented Aug 9, 2024

UBSAN was triggered with these changes, the previous code is clean. The branchless sign extension is not OK when the input is already 64-bit.

That output means that it is indeed slower. I was using GCC 13 on Debian testing running on real hardware. It could also still be a measurement artifact. I was using 20 samples.

@turol
Copy link
Contributor Author

turol commented Aug 9, 2024

What is the size of your in.bin?

Do you get the expected null result if old and new are identical binaries?

On an AMD 5950x with Gentoo, GCC 13 and the same ~60 MB binary with 100 samples:
Two identical copies of the old binary

    N           Min           Max        Median           Avg        Stddev
x 100         1.421         1.547         1.465       1.46747   0.020762464
  100         1.422         1.506         1.462       1.46093   0.020405117
Difference at 95.0% confidence
        -0.00654  /- 0.00570575
        -0.445665%  /- 0.388815%
        (Student's t, pooled s = 0.0205846)

Old vs new code

    N           Min           Max        Median           Avg        Stddev
x 100         1.427         1.574         1.493       1.48831   0.023063678
  100         1.386         1.524         1.424       1.42498   0.021953109
Difference at 95.0% confidence
        -0.06333  /- 0.00624091
        -4.25516%  /- 0.419328%
        (Student's t, pooled s = 0.0225152)

The performance characteristics of modern systems are just completely f***** up. Maybe I should configure it for 99% confidence and always do more samples but that takes forever to run.

@ianichitei
Copy link
Contributor

I used a 58M file I obtained by concatenating a bunch of shellcodes together.

If I use the same binary I get timings that are much closer:

$ ./benchmark.sh ./old ./old in.bin 10
    N           Min           Max        Median           Avg        Stddev
x  10         1.456         1.518         1.502        1.4941   0.021855332
   10         1.452         1.527         1.502        1.4937    0.02150478
No difference proven at 95.0% confidence

But this is just a small sample. These kinds of changes are small enough that if we want to accurately measure them we probably need to build some micro benchmarks.

@turol
Copy link
Contributor Author

turol commented Aug 10, 2024

I managed to fix UBSAN. It turns out that sometimes ND_SIGN_EX is called with sz == 0 ...

@turol turol marked this pull request as ready for review August 10, 2024 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants