Optimizations #96

turol · 2024-08-06T16:23:00Z

A few cleanups, a benchmark script and some optimizations.

Disassembling a ~60MB binary on my Intel i7-12700H laptop:

    N           Min           Max        Median           Avg        Stddev
x  20         1.111         1.136         1.118        1.1192  0.0059701009
   20         1.093         1.115           1.1       1.10075  0.0058478516
Difference at 95.0% confidence
	-0.01845  /- 0.00378221
	-1.6485%  /- 0.337939%
	(Student's t, pooled s = 0.00590929)

turol · 2024-08-09T11:44:26Z

Triggers UBSAN failures. You should probably have that in your CI...

ianichitei · 2024-08-09T12:22:01Z

Yes, the GitHub CI does not run any tests. It is on my TODO list, but haven't got the time to add that yet. Is UBSAN/ASAN triggered only with these changes?

I experimented with this and I'm not sure I'm running it right, or that I'm interpreting the results the right way.

For example, comparing a disasmtool compiled from 8b4cc48 with the one from this PR, I get:

$ ./benchmark.sh ./old ./new in.bin 10
x before.txt
  after.txt
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
|                       x                                                                                                                                                                       |
|x       x              x      x           x          x                                         x   *      x                                                                                    |
|         |________________________________M_____A________|_____________________________|_______________M__A________________________________________________|                                   |
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
    N           Min           Max        Median           Avg        Stddev
x  10         1.406         1.434         1.417        1.4186   0.010297788
   10         1.416         1.456         1.433        1.4339   0.012931014
Difference at 95.0% confidence
        0.0153  /- 0.0109827
        1.07853%  /- 0.774195%
        (Student's t, pooled s = 0.0116888)

Doesn't this mean that the new version is actually slower?

Compiled for release with gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, running on a Ubuntu 22.04.4 LTS (via WSL2 on Windows 11 23H2), with a 13th Gen Intel(R) Core(TM) i7-1370P CPU.

turol · 2024-08-09T12:48:44Z

UBSAN was triggered with these changes, the previous code is clean. The branchless sign extension is not OK when the input is already 64-bit.

That output means that it is indeed slower. I was using GCC 13 on Debian testing running on real hardware. It could also still be a measurement artifact. I was using 20 samples.

turol · 2024-08-09T13:16:42Z

What is the size of your in.bin?

Do you get the expected null result if old and new are identical binaries?

On an AMD 5950x with Gentoo, GCC 13 and the same ~60 MB binary with 100 samples:
Two identical copies of the old binary

    N           Min           Max        Median           Avg        Stddev
x 100         1.421         1.547         1.465       1.46747   0.020762464
  100         1.422         1.506         1.462       1.46093   0.020405117
Difference at 95.0% confidence
        -0.00654  /- 0.00570575
        -0.445665%  /- 0.388815%
        (Student's t, pooled s = 0.0205846)

Old vs new code

    N           Min           Max        Median           Avg        Stddev
x 100         1.427         1.574         1.493       1.48831   0.023063678
  100         1.386         1.524         1.424       1.42498   0.021953109
Difference at 95.0% confidence
        -0.06333  /- 0.00624091
        -4.25516%  /- 0.419328%
        (Student's t, pooled s = 0.0225152)

The performance characteristics of modern systems are just completely f***** up. Maybe I should configure it for 99% confidence and always do more samples but that takes forever to run.

ianichitei · 2024-08-09T15:58:27Z

I used a 58M file I obtained by concatenating a bunch of shellcodes together.

If I use the same binary I get timings that are much closer:

$ ./benchmark.sh ./old ./old in.bin 10
    N           Min           Max        Median           Avg        Stddev
x  10         1.456         1.518         1.502        1.4941   0.021855332
   10         1.452         1.527         1.502        1.4937    0.02150478
No difference proven at 95.0% confidence

But this is just a small sample. These kinds of changes are small enough that if we want to accurately measure them we probably need to build some micro benchmarks.

turol · 2024-08-10T15:19:39Z

I managed to fix UBSAN. It turns out that sometimes ND_SIGN_EX is called with sz == 0 ...

turol added 4 commits August 6, 2024 19:13

Fix comment

a6ddf52

Only set bcstSize when it's required

30e3aa8

Add benchmark script

5dab7c7

Jump directly to exit on error and eliminate the stop variable

3c10d54

turol marked this pull request as draft August 9, 2024 11:59

Replace ND_SIGN_EX with less branchy version

0c676b7

turol force-pushed the optimize branch from 5a87ef9 to 0c676b7 Compare August 10, 2024 15:18

turol marked this pull request as ready for review August 10, 2024 15:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimizations #96

Optimizations #96

turol commented Aug 6, 2024

turol commented Aug 9, 2024

ianichitei commented Aug 9, 2024 •

edited

Loading

turol commented Aug 9, 2024

turol commented Aug 9, 2024

ianichitei commented Aug 9, 2024

turol commented Aug 10, 2024

Optimizations #96

Are you sure you want to change the base?

Optimizations #96

Conversation

turol commented Aug 6, 2024

turol commented Aug 9, 2024

ianichitei commented Aug 9, 2024 • edited Loading

turol commented Aug 9, 2024

turol commented Aug 9, 2024

ianichitei commented Aug 9, 2024

turol commented Aug 10, 2024

ianichitei commented Aug 9, 2024 •

edited

Loading