Computation of the weighted geometric mean
For results before Chrome 118, the overall result for the CPU benchmarks was simply the geometric mean of the slowdown factors of an implementation across all benchmarks. The slowdown factor is the duration for the implementation and benchmark divided by the duration of the fastest implementation for that benchmark.
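As a rough illustration of that older scheme, here is a minimal Python sketch. The benchmark names, implementation names and durations are made up for the example, and the helper functions are hypothetical; this is not the project's actual code.

```python
import math


def slowdown_factors(durations):
    """Divide each implementation's duration by the fastest duration for one benchmark."""
    fastest = min(durations.values())
    return {impl: duration / fastest for impl, duration in durations.items()}


def geometric_mean(values):
    """Unweighted geometric mean, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical durations: benchmark -> implementation -> duration
durations = {
    "bench_a": {"impl_x": 10.0, "impl_y": 20.0},
    "bench_b": {"impl_x": 40.0, "impl_y": 10.0},
}

# Slowdown factor per benchmark and implementation
factors = {bench: slowdown_factors(d) for bench, d in durations.items()}

# Overall result per implementation: geometric mean over all benchmarks
overall = {
    impl: geometric_mean(factors[bench][impl] for bench in factors)
    for impl in ("impl_x", "impl_y")
}
```

Here `impl_x` has factors 1 and 4, so its overall result is their geometric mean, 2.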
If you look at the results, you see that the spread of the factors differs between benchmarks: the create row factors lie much closer together than the select row factors, and so on. If we simply take the geometric mean, we emphasize the influence of the benchmarks that have a large spread. (Sadly, those are also the weakest benchmarks in terms of variance and stability...) So it seems like a good idea to use a weighted geometric mean (https://en.wikipedia.org/wiki/Weighted_geometric_mean).
What weights could we use for that purpose? For each benchmark we could take 1/factor of the slowest implementation. Select row would then have a weight of 1/47.6684 and create row a weight of 1/3.267. But what if choo changed its (obviously pretty slow) implementation of select row and, let's say, performed as well as blazor-wasm? Then the weight would rise to 1/24.7354, roughly doubling, which is obviously a big change.
Thus we're using the 90th percentile of the factors instead, which results in a weight of 1/5.16529 for select row and is much less sensitive to a change in a single slow implementation.
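The difference between the two weighting ideas can be sketched in Python as follows. The factor list is invented for illustration, and the `percentile` helper uses linear interpolation between ranks, which is an assumption: the exact percentile method used for the published weights is not stated here.

```python
def percentile(values, p):
    """p-th percentile using linear interpolation between closest ranks (assumed method)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty list is undefined")
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def weight_from_slowest(factors):
    """Weight based on the slowest implementation: 1 / max factor."""
    return 1.0 / max(factors)


def weight_from_percentile(factors, p=90.0):
    """Weight based on the p-th percentile of the factors: 1 / percentile."""
    return 1.0 / percentile(factors, p)


# Hypothetical slowdown factors for one benchmark, with one extreme outlier
before = [1.0, 1.2, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 6.0, 48.0]
# Same benchmark after the slowest implementation became twice as fast
after = [1.0, 1.2, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 6.0, 24.0]

max_based = (weight_from_slowest(before), weight_from_slowest(after))
p90_based = (weight_from_percentile(before), weight_from_percentile(after))
```

With these made-up numbers, the weight based on the slowest implementation doubles when the outlier improves, while the percentile-based weight stays the same.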
These are the current weights (weight = 1 / 90th percentile factor):
benchmark | fastest | 90th percentile | 90th percentile factor | weight | slowest | factor for slowest |
---|---|---|---|---|---|---|
01_run1k | 38.72 | 60.24 | 1.56 | 0.64 | 126.51 | 3.27 |
02_replace1k | 39.15 | 69.82 | 1.78 | 0.56 | 192.16 | 4.91 |
03_update10th1k_x16 | 19.34 | 34.26 | 1.77 | 0.56 | 168.47 | 8.71 |
04_select1k | 3.29 | 17.06 | 5.19 | 0.19 | 156.59 | 47.67 |
05_swap1k | 22.61 | 171.26 | 7.58 | 0.13 | 328.72 | 14.54 |
06_remove-one-1k | 17.72 | 33.58 | 1.89 | 0.53 | 163.77 | 9.24 |
07_create10k | 386.77 | 685.22 | 1.77 | 0.56 | 2415.89 | 6.25 |
08_create1k-after1k_x2 | 40.88 | 74.22 | 1.82 | 0.55 | 160.16 | 3.92 |
09_clear1k_x8 | 13.14 | 31.10 | 2.37 | 0.42 | 53.41 | 4.06 |
These weights might be readjusted at some point in the future.
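For reference, a weighted geometric mean using the weights from the table could be computed along these lines. This is a sketch following the definition in the linked Wikipedia article (weighted log sum divided by the sum of the weights); whether the benchmark's own tooling normalizes in exactly this way is an assumption, and the example factors are hypothetical.

```python
import math

# Weights copied from the "weight" column of the table above
WEIGHTS = {
    "01_run1k": 0.64,
    "02_replace1k": 0.56,
    "03_update10th1k_x16": 0.56,
    "04_select1k": 0.19,
    "05_swap1k": 0.13,
    "06_remove-one-1k": 0.53,
    "07_create10k": 0.56,
    "08_create1k-after1k_x2": 0.55,
    "09_clear1k_x8": 0.42,
}


def weighted_geometric_mean(factors, weights):
    """exp(sum(w_i * ln(x_i)) / sum(w_i)) over the benchmarks present in `factors`."""
    total_weight = sum(weights[name] for name in factors)
    log_sum = sum(weights[name] * math.log(factor) for name, factor in factors.items())
    return math.exp(log_sum / total_weight)


def geometric_mean(factors):
    """Unweighted geometric mean over the same factors, for comparison."""
    return math.exp(sum(math.log(f) for f in factors.values()) / len(factors))


# Hypothetical implementation: fastest everywhere except select row
example_factors = {name: 1.0 for name in WEIGHTS}
example_factors["04_select1k"] = 5.0

weighted = weighted_geometric_mean(example_factors, WEIGHTS)
unweighted = geometric_mean(example_factors)
```

Because select row carries a small weight, the slow select row result pulls the weighted mean up less than it pulls up the unweighted mean.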
If you want to take a closer look, you can experiment with the weights in an Excel file: results.xlsx