-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass fmt::Arguments
by reference to PanicInfo
and PanicMessage
#129491
Conversation
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
@StackOverflowExcept1on: 🔑 Insufficient privileges: not in try users |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Pass `fmt::Arguments` by reference to `PanicInfo` and `PanicMessage` Resolves rust-lang#129330 For some reason after rust-lang#115974 and rust-lang#126732 optimizations applied to panic handler became worse and compiler stopped removing panic locations if they are not used in the panic message. This PR fixes that and maybe we can merge it into beta before rust 1.81 is released. Note: optimization only works with `lto = "fat"`. r? libs-api
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (4d6f35a): comparison URL. Overall result: no relevant changes - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (secondary 2.3%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeResults (secondary -0.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 751.449s -> 751.215s (-0.03%) |
I don't know why bot rejected PR. Maybe it needs some special case like |
That isn't really strong evidence, it's not uncommon for things that shouldn't really affect size much in a release-compiled binary to nonetheless introduce a variance of about 100KB, like here: #129498 |
I'm shocked that this produces the desired optimization. Usually references get significantly worse optimizations than value because the optimizer has to reason about aliasing, so the fact that adding references makes things optimize better seems extremely fragile. |
@saethlin I tried to use Although it's still weird because |
r? @m-ou-se |
Is the impact inflated code size? If so, how much is the change? If it is not negligible (or even if it is), an rmake test would probably be good to make sure it doesn't regress without us noticing. Especially considering this change is all that is required to get it pruned, seems a bit fragile. |
@tgross35 How to run rmake test?
See an example of changing the binary size for I also opened the example from #129330 in the decompiler with and without this PR change. In addition to removing location panics, the compiler generates less code. Before:void __fastcall rust_begin_unwind(__int128 *a1, __int64 a2, __int64 a3, __int64 a4, __int64 a5, __int64 a6)
{
__int128 v6; // xmm0
__int128 v7; // xmm1
_QWORD v8[2]; // [rsp 0h] [rbp-478h] BYREF
_OWORD v9[3]; // [rsp 10h] [rbp-468h] BYREF
_QWORD v10[6]; // [rsp 40h] [rbp-438h] BYREF
unsigned int v11; // [rsp 74h] [rbp-404h] BYREF
_BYTE v12[1024]; // [rsp 78h] [rbp-400h] BYREF
v6 = *a1;
v7 = a1[1];
v9[2] = a1[2];
v9[1] = v7;
v9[0] = v6;
v11 = 0;
v8[0] = v9;
v10[0] = &off_2CC8;
v10[1] = 2LL;
v10[4] = 0LL;
v8[1] = <core::fmt::Arguments as core::fmt::Display>::fmt;
v10[2] = v8;
v10[3] = 1LL;
(core::fmt::Write::write_fmt)(&v11, v10, a3, v8, a5, a6);
gr_panic(v12, v11);
JUMPOUT(0x1C32LL);
} After:void __fastcall rust_begin_unwind(__int64 a1)
{
__int64 v1; // [rsp 8h] [rbp-450h] BYREF
_QWORD v2[2]; // [rsp 10h] [rbp-448h] BYREF
_QWORD v3[6]; // [rsp 20h] [rbp-438h] BYREF
unsigned int v4; // [rsp 54h] [rbp-404h] BYREF
_BYTE v5[1024]; // [rsp 58h] [rbp-400h] BYREF
v1 = a1;
v4 = 0;
v2[0] = &v1;
v3[0] = &off_2C10;
v3[1] = 2LL;
v3[4] = 0LL;
v2[1] = <&T as core::fmt::Display>::fmt;
v3[2] = v2;
v3[3] = 1LL;
core::fmt::Write::write_fmt((__int64)&v4, v3);
gr_panic(v5, v4);
JUMPOUT(0x1B91LL);
}
However, this change has worked for the last few years (only tested nightly rust 2022). |
Please just show me the assembly. |
Pointing to a decompiler's output is actively misleading because it reconstructs the program's information into a language that does not have a 1-1 match to the actual instruction set. So a single line of decompiled C can be 8 assembly instructions, or 8 lines of C can be emitted from a single assembly instruction. We can at least reconstruct a partially-accurate binary size from assembly. |
@workingjubilee I tried to emulate something similar on godbolt and it didn't work. I think it makes sense to just attach 2 ELF files (rename it to Source: https://github.com/StackOverflowExcept1on/rust-regression/blob/master/program/src/lib.rs
// c2fdb3435fe
__int64 init()
{
core::panicking::panic_fmt();
return rust_begin_unwind();
}
// eef00c8be8d
__int64 init()
{
__int128 v1; // [rsp 8h] [rbp-30h] BYREF
__int64 v2; // [rsp 18h] [rbp-20h]
__int128 v3; // [rsp 20h] [rbp-18h]
*(_QWORD *)&v1 = &off_2BE0;
*((_QWORD *)&v1 1) = 1LL;
v2 = 8LL;
v3 = 0LL;
core::panicking::panic_fmt();
return rust_begin_unwind(&v1);
} From the pseudocode of the rust/library/core/src/panicking.rs Line 122 in 748c548
|
Do you have any idea why taking |
@m-ou-se I don't know, maybe because of |
@m-ou-se my best guess why this happens is that When using only references to create rust/library/core/src/panicking.rs Lines 59 to 74 in eb33b43
|
It seems unlikely this will make anything worse, so let's merge this for now. Once we reduce the size of fmt::Arguments, we should re-evaulate all the references to @bors r rollup=never |
☀️ Test successful - checks-actions |
Finished benchmarking commit (aaed38b): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (primary -1.7%, secondary -3.1%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (secondary 5.9%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 765.544s -> 765.7s (0.02%) |
Resolves #129330
For some reason after #115974 and #126732 optimizations applied to panic handler became worse and compiler stopped removing panic locations if they are not used in the panic message. This PR fixes that and maybe we can merge it into beta before rust 1.81 is released.
Note: optimization only works with
lto = "fat"
.r? libs-api