-
Notifications
You must be signed in to change notification settings - Fork 9.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENHANCEMENT] PromQL: use Kahan summation for sum() #14074
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This can give a more precise result, by keeping a separate running compensation value to accumulate small errors. See https://en.wikipedia.org/wiki/Kahan_summation_algorithm Signed-off-by: Bryan Boreham <[email protected]>
beorn7
approved these changes
May 15, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much.
One day, we should apply all the Kahan tricks to histograms, too…
Current feeling is the slow-down will not be noticeable compared to overheads such as disk IO. |
beorn7
added a commit
that referenced
this pull request
Jul 4, 2024
This expands the idea implemented in #14074, where Kahan summation was applied to sum(). This comes at the price of not dual-using `floatMean` anymore, because we need both `floatMean` and `floatKahanC` for `avg`. This creates a small overhead, but also makes the Kahan parts easier to read. The trade-off in terms of performance is similar to the one made in `sum` with a following division. Things that work out for `sum` thanks to Kahan summation should equally work out for `avg`. This is reflected in the added tests. Signed-off-by: beorn7 <[email protected]>
beorn7
added a commit
that referenced
this pull request
Jul 4, 2024
This expands the idea implemented in #14074, where Kahan summation was applied to sum(). This comes at the price of not dual-using `floatMean` anymore, because we need both `floatMean` and `floatKahanC` for `avg`. This creates a small overhead, but also makes the Kahan parts easier to read. The trade-off in terms of performance is similar to the one made in essentially syntactic sugar for `sum` with a following division. Things that work out for `sum` thanks to Kahan summation should therefory equally work out for `avg`. This is reflected in the added tests. Signed-off-by: beorn7 <[email protected]>
beorn7
added a commit
that referenced
this pull request
Jul 4, 2024
The basic idea here is that the previous code was always doing incremental calculation of the mean value, which is more costly and can be less precise. It protects against overflows, but in most cases, an overflow doesn't happen anyway. The other idea applied here is to expand on #14074, where Kahan summation was applied to sum(). With this commit, the average is calculated in a conventional way (adding everything up and divide in the end) as long as the sum isn't overflowing float64. This is combined with Kahan summation so that the avg aggregation, in most cases, is really equivalent to the sum aggregation with a following division (which is the user's expectation as avg is supposed to be syntactic sugar for sum with a following divison). If the sum hits ±Inf, the calculation reverts to incremental calculation of the mean value. Kahan summation is also applied here, although it cannot fully compensate for the numerical errors introduced by the incremental mean calculation. (The tests added in this commit would fail if incremental mean calculation was always used.) Signed-off-by: beorn7 <[email protected]>
beorn7
added a commit
that referenced
this pull request
Jul 4, 2024
The basic idea here is that the previous code was always doing incremental calculation of the mean value, which is more costly and can be less precise. It protects against overflows, but in most cases, an overflow doesn't happen anyway. The other idea applied here is to expand on #14074, where Kahan summation was applied to sum(). With this commit, the average is calculated in a conventional way (adding everything up and divide in the end) as long as the sum isn't overflowing float64. This is combined with Kahan summation so that the avg aggregation, in most cases, is really equivalent to the sum aggregation with a following division (which is the user's expectation as avg is supposed to be syntactic sugar for sum with a following divison). If the sum hits ±Inf, the calculation reverts to incremental calculation of the mean value. Kahan summation is also applied here, although it cannot fully compensate for the numerical errors introduced by the incremental mean calculation. (The tests added in this commit would fail if incremental mean calculation was always used.) Signed-off-by: beorn7 <[email protected]>
beorn7
added a commit
that referenced
this pull request
Jul 4, 2024
The basic idea here is that the previous code was always doing incremental calculation of the mean value, which is more costly and can be less precise. It protects against overflows, but in most cases, an overflow doesn't happen anyway. The other idea applied here is to expand on #14074, where Kahan summation was applied to sum(). With this commit, the average is calculated in a conventional way (adding everything up and divide in the end) as long as the sum isn't overflowing float64. This is combined with Kahan summation so that the avg aggregation, in most cases, is really equivalent to the sum aggregation with a following division (which is the user's expectation as avg is supposed to be syntactic sugar for sum with a following divison). If the sum hits ±Inf, the calculation reverts to incremental calculation of the mean value. Kahan summation is also applied here, although it cannot fully compensate for the numerical errors introduced by the incremental mean calculation. (The tests added in this commit would fail if incremental mean calculation was always used.) Signed-off-by: beorn7 <[email protected]>
beorn7
added a commit
that referenced
this pull request
Jul 4, 2024
The basic idea here is that the previous code was always doing incremental calculation of the mean value, which is more costly and can be less precise. It protects against overflows, but in most cases, an overflow doesn't happen anyway. The other idea applied here is to expand on #14074, where Kahan summation was applied to sum(). With this commit, the average is calculated in a conventional way (adding everything up and divide in the end) as long as the sum isn't overflowing float64. This is combined with Kahan summation so that the avg aggregation, in most cases, is really equivalent to the sum aggregation with a following division (which is the user's expectation as avg is supposed to be syntactic sugar for sum with a following divison). If the sum hits ±Inf, the calculation reverts to incremental calculation of the mean value. Kahan summation is also applied here, although it cannot fully compensate for the numerical errors introduced by the incremental mean calculation. (The tests added in this commit would fail if incremental mean calculation was always used.) Signed-off-by: beorn7 <[email protected]>
beorn7
added a commit
that referenced
this pull request
Jul 10, 2024
The basic idea here is that the previous code was always doing incremental calculation of the mean value, which is more costly and can be less precise. It protects against overflows, but in most cases, an overflow doesn't happen anyway. The other idea applied here is to expand on #14074, where Kahan summation was applied to sum(). With this commit, the average is calculated in a conventional way (adding everything up and divide in the end) as long as the sum isn't overflowing float64. This is combined with Kahan summation so that the avg aggregation, in most cases, is really equivalent to the sum aggregation with a following division (which is the user's expectation as avg is supposed to be syntactic sugar for sum with a following divison). If the sum hits ±Inf, the calculation reverts to incremental calculation of the mean value. Kahan summation is also applied here, although it cannot fully compensate for the numerical errors introduced by the incremental mean calculation. (The tests added in this commit would fail if incremental mean calculation was always used.) Signed-off-by: beorn7 <[email protected]>
2 tasks
jhesketh
added a commit
to jhesketh/mimir
that referenced
this pull request
Aug 7, 2024
Implements the improvements from prometheus/prometheus#14074 prometheus/prometheus#14362
4 tasks
jhesketh
added a commit
to grafana/mimir
that referenced
this pull request
Aug 7, 2024
* MQE: Use kahan in sum aggregation Implements the improvements from prometheus/prometheus#14074 prometheus/prometheus#14362 * Update CHANGELOG
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This can give a more precise result, by keeping a separate running compensation value to accumulate small errors.
See https://en.wikipedia.org/wiki/Kahan_summation_algorithm
Possible improvement for #14052 (I won't call it a fix).
I'm re-using
floatMean
which is otherwise unused forsum()
; it would be clearer to add a new field but would cost 8 bytes per output value.Benchmarks show a bigger slow-down than I expected.