Pass `fmt::Arguments` by reference to `PanicInfo` and `PanicMessage` #129491

StackOverflowExcept1on · 2024-08-23T23:34:16Z

For some reason after #115974 and #126732 optimizations applied to panic handler became worse and compiler stopped removing panic locations if they are not used in the panic message. This PR fixes that and maybe we can merge it into beta before rust 1.81 is released.

Note: optimization only works with lto = "fat".

r? libs-api

StackOverflowExcept1on · 2024-08-23T23:36:31Z

@bors try @rust-timer queue

bors · 2024-08-23T23:36:34Z

@StackOverflowExcept1on: 🔑 Insufficient privileges: not in try users

workingjubilee · 2024-08-24T01:18:55Z

@bors try @rust-timer queue

Pass `fmt::Arguments` by reference to `PanicInfo` and `PanicMessage` Resolves rust-lang#129330 For some reason after rust-lang#115974 and rust-lang#126732 optimizations applied to panic handler became worse and compiler stopped removing panic locations if they are not used in the panic message. This PR fixes that and maybe we can merge it into beta before rust 1.81 is released. Note: optimization only works with `lto = "fat"`. r? libs-api

bors · 2024-08-24T01:20:06Z

⌛ Trying commit c2fdb34 with merge 4d6f35a...

bors · 2024-08-24T03:14:27Z

☀️ Try build successful - checks-actions
Build commit: 4d6f35a (4d6f35af7e092aa393d8058d662ae8613385e3b9)

rust-timer · 2024-08-24T04:41:45Z

Finished benchmarking commit (4d6f35a): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (secondary 2.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.3%	[2.3%, 2.3%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.0%	[-0.0%, -0.0%]	1
All ❌✅ (primary)	-	-	0

Bootstrap: 751.449s -> 751.215s (-0.03%)
Artifact size: 339.00 MiB -> 338.95 MiB (-0.01%)

StackOverflowExcept1on · 2024-08-24T10:20:09Z

I don't know why bot rejected PR. Maybe it needs some special case like crate-type = (c)dylib with lto = "fat" and custom panic handler, but as we can see the binary size of librustc_driver.so has decreased (-99.06 KiB, -0.074%).

workingjubilee · 2024-08-24T19:15:05Z

That isn't really strong evidence, it's not uncommon for things that shouldn't really affect size much in a release-compiled binary to nonetheless introduce a variance of about 100KB, like here: #129498

saethlin · 2024-08-24T22:45:47Z

I'm shocked that this produces the desired optimization. Usually references get significantly worse optimizations than value because the optimizer has to reason about aliasing, so the fact that adding references makes things optimize better seems extremely fragile.

StackOverflowExcept1on · 2024-08-25T11:02:06Z

so the fact that adding references makes things optimize better seems extremely fragile.

@saethlin I tried to use panic_info_message feature with nightly rust, which was released 2 years ago (it uses references) and even on older versions it worked fine. Maybe the reason is size_of::<fmt::Arguments>() = 48, and also it is a Copy type?

Although it's still weird because PanicInfo<'a> has 'a lifetime, and Location::caller() returns &'static Location<'static>, i.e. the compiler will probably infer the type PanicInfo<'static> and still not be able to optimize the code.

Amanieu · 2024-08-26T23:17:20Z

r? @m-ou-se

tgross35 · 2024-08-27T20:49:05Z

Is the impact inflated code size? If so, how much is the change?

If it is not negligible (or even if it is), an rmake test would probably be good to make sure it doesn't regress without us noticing. Especially considering this change is all that is required to get it pruned, seems a bit fragile.

StackOverflowExcept1on · 2024-08-27T21:31:18Z

@tgross35 How to run rmake test?

Is the impact inflated code size? If so, how much is the change?

See an example of changing the binary size for librustc_driver.so: #129491 (comment). I think it depends on length of directory where cargo build is run and number of places with panic!().

I also opened the example from #129330 in the decompiler with and without this PR change. In addition to removing location panics, the compiler generates less code.

Before:

void __fastcall rust_begin_unwind(__int128 *a1, __int64 a2, __int64 a3, __int64 a4, __int64 a5, __int64 a6)
{
  __int128 v6; // xmm0
  __int128 v7; // xmm1
  _QWORD v8[2]; // [rsp 0h] [rbp-478h] BYREF
  _OWORD v9[3]; // [rsp 10h] [rbp-468h] BYREF
  _QWORD v10[6]; // [rsp 40h] [rbp-438h] BYREF
  unsigned int v11; // [rsp 74h] [rbp-404h] BYREF
  _BYTE v12[1024]; // [rsp 78h] [rbp-400h] BYREF

  v6 = *a1;
  v7 = a1[1];
  v9[2] = a1[2];
  v9[1] = v7;
  v9[0] = v6;
  v11 = 0;
  v8[0] = v9;
  v10[0] = &off_2CC8;
  v10[1] = 2LL;
  v10[4] = 0LL;
  v8[1] = <core::fmt::Arguments as core::fmt::Display>::fmt;
  v10[2] = v8;
  v10[3] = 1LL;
  (core::fmt::Write::write_fmt)(&v11, v10, a3, v8, a5, a6);
  gr_panic(v12, v11);
  JUMPOUT(0x1C32LL);
}

After:

void __fastcall rust_begin_unwind(__int64 a1)
{
  __int64 v1; // [rsp 8h] [rbp-450h] BYREF
  _QWORD v2[2]; // [rsp 10h] [rbp-448h] BYREF
  _QWORD v3[6]; // [rsp 20h] [rbp-438h] BYREF
  unsigned int v4; // [rsp 54h] [rbp-404h] BYREF
  _BYTE v5[1024]; // [rsp 58h] [rbp-400h] BYREF

  v1 = a1;
  v4 = 0;
  v2[0] = &v1;
  v3[0] = &off_2C10;
  v3[1] = 2LL;
  v3[4] = 0LL;
  v2[1] = <&T as core::fmt::Display>::fmt;
  v3[2] = v2;
  v3[3] = 1LL;
  core::fmt::Write::write_fmt((__int64)&v4, v3);
  gr_panic(v5, v4);
  JUMPOUT(0x1B91LL);
}

Especially considering this change is all that is required to get it pruned, seems a bit fragile.

However, this change has worked for the last few years (only tested nightly rust 2022).

workingjubilee · 2024-08-28T02:19:52Z

Please just show me the assembly.

workingjubilee · 2024-08-28T02:20:36Z

Pointing to a decompiler's output is actively misleading because it reconstructs the program's information into a language that does not have a 1-1 match to the actual instruction set. So a single line of decompiled C can be 8 assembly instructions, or 8 lines of C can be emitted from a single assembly instruction.

We can at least reconstruct a partially-accurate binary size from assembly.

StackOverflowExcept1on · 2024-08-28T08:59:46Z

@workingjubilee I tried to emulate something similar on godbolt and it didn't work. I think it makes sense to just attach 2 ELF files (rename it to .so). Use objdump or some other tools like IDA Pro. I also attach .c pseudocode and .asm.

Source: https://github.com/StackOverflowExcept1on/rust-regression/blob/master/program/src/lib.rs

Commit c2fdb34
libprogram-stage1-c2fdb3435fe.so (7352 bytes)
libprogram-stage1-c2fdb3435fe.so.asm
libprogram-stage1-c2fdb3435fe.so.c
Previous commit eef00c8
libprogram-stage1-eef00c8be8d.so (7488 bytes)
libprogram-stage1-eef00c8be8d.so.asm
libprogram-stage1-eef00c8be8d.so.c

// c2fdb3435fe
__int64 init()
{
  core::panicking::panic_fmt();
  return rust_begin_unwind();
}

// eef00c8be8d
__int64 init()
{
  __int128 v1; // [rsp 8h] [rbp-30h] BYREF
  __int64 v2; // [rsp 18h] [rbp-20h]
  __int128 v3; // [rsp 20h] [rbp-18h]

  *(_QWORD *)&v1 = &off_2BE0;
  *((_QWORD *)&v1   1) = 1LL;
  v2 = 8LL;
  v3 = 0LL;
  core::panicking::panic_fmt();
  return rust_begin_unwind(&v1);
}

From the pseudocode of the init function you can see that the compiler probably uses comptime instead of runtime.

rust/library/core/src/panicking.rs

Line 122 in 748c548

 super::intrinsics::const_eval_select((fmt, force_no_backtrace), comptime, runtime); 

m-ou-se · 2024-09-05T12:20:11Z

Do you have any idea why taking message by reference would allow for optimizing away unused locations?

StackOverflowExcept1on · 2024-09-05T12:27:46Z

@m-ou-se I don't know, maybe because of const_eval_select. It might also make sense to compile the same binary with and without lto to see what effect it has.

StackOverflowExcept1on · 2024-09-05T20:25:41Z

@m-ou-se my best guess why this happens is that panic_impl is marked as extern "Rust" and the call site of this function is probably less optimized. For example, why did the compiler remove panic locations, but not by default, but only with lto="fat"?. This looks like some optimization issue near FFI boundaries.

When using only references to create PanicInfo, the compiler can probably make more assumptions about the use of panic locations by the panic_impl implementation.

rust/library/core/src/panicking.rs

Lines 59 to 74 in eb33b43

 // NOTE This function never crosses the FFI boundary; it's a Rust-to-Rust call 

 // that gets resolved to the `#[panic_handler]` function. 

 extern "Rust" { 

 #[lang = "panic_impl"] 

 fn panic_impl(pi: &PanicInfo<'_>) -> !; 

 } 

 let pi = PanicInfo::new( 

 fmt, 

 Location::caller(), 

 /* can_unwind */ true, 

 /* force_no_backtrace */ false, 

 ); 

 // SAFETY: `panic_impl` is defined in safe Rust code and thus is safe to call. 

 unsafe { panic_impl(&pi) }

m-ou-se · 2024-09-18T09:47:34Z

It seems unlikely this will make anything worse, so let's merge this for now.

Once we reduce the size of fmt::Arguments, we should re-evaulate all the references to fmt::Arguments whether they should be by value instead.

@bors r rollup=never

bors · 2024-09-18T09:47:36Z

📌 Commit c2fdb34 has been approved by m-ou-se

It is now in the queue for this repository.

bors · 2024-09-18T11:57:35Z

⌛ Testing commit c2fdb34 with merge aaed38b...

bors · 2024-09-18T14:23:57Z

☀️ Test successful - checks-actions
Approved by: m-ou-se
Pushing aaed38b to master...

rust-timer · 2024-09-18T17:06:23Z

Finished benchmarking commit (aaed38b): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary -1.7%, secondary -3.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.7%	[-1.8%, -1.6%]	2
Improvements ✅ (secondary)	-3.1%	[-3.1%, -3.1%]	1
All ❌✅ (primary)	-1.7%	[-1.8%, -1.6%]	2

Cycles

Results (secondary 5.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	5.9%	[5.4%, 6.7%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 765.544s -> 765.7s (0.02%)
Artifact size: 341.31 MiB -> 341.29 MiB (-0.01%)

Pass fmt::Arguments by reference to PanicInfo and PanicMessage

c2fdb34

rustbot assigned Amanieu Aug 23, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 23, 2024

This comment has been minimized.

Sign in to view

StackOverflowExcept1on mentioned this pull request Aug 23, 2024

feat(gstd) use stabilized panic_info_message feature from rust 1.81 gear-tech/gear#4098

Merged

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 24, 2024

This comment has been minimized.

Sign in to view

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 24, 2024

rustbot assigned m-ou-se and unassigned Amanieu Aug 26, 2024

joshtriplett added the I-libs-nominated Nominated for discussion during a libs team meeting. label Aug 28, 2024

Amanieu removed the I-libs-nominated Nominated for discussion during a libs team meeting. label Sep 4, 2024

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 18, 2024

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 18, 2024

bors merged commit aaed38b into rust-lang:master Sep 18, 2024
7 checks passed

rustbot added this to the 1.83.0 milestone Sep 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pass `fmt::Arguments` by reference to `PanicInfo` and `PanicMessage` #129491

Pass `fmt::Arguments` by reference to `PanicInfo` and `PanicMessage` #129491

StackOverflowExcept1on commented Aug 23, 2024 •

edited

Loading

StackOverflowExcept1on commented Aug 23, 2024

This comment has been minimized.

bors commented Aug 23, 2024

workingjubilee commented Aug 24, 2024

This comment has been minimized.

bors commented Aug 24, 2024

bors commented Aug 24, 2024

This comment has been minimized.

rust-timer commented Aug 24, 2024

StackOverflowExcept1on commented Aug 24, 2024 •

edited

Loading

workingjubilee commented Aug 24, 2024 •

edited

Loading

saethlin commented Aug 24, 2024

StackOverflowExcept1on commented Aug 25, 2024 •

edited

Loading

Amanieu commented Aug 26, 2024

tgross35 commented Aug 27, 2024

StackOverflowExcept1on commented Aug 27, 2024 •

edited

Loading

workingjubilee commented Aug 28, 2024

workingjubilee commented Aug 28, 2024 •

edited

Loading

StackOverflowExcept1on commented Aug 28, 2024 •

edited

Loading

m-ou-se commented Sep 5, 2024

StackOverflowExcept1on commented Sep 5, 2024

StackOverflowExcept1on commented Sep 5, 2024 •

edited

Loading

m-ou-se commented Sep 18, 2024

bors commented Sep 18, 2024

bors commented Sep 18, 2024

bors commented Sep 18, 2024

rust-timer commented Sep 18, 2024

Pass fmt::Arguments by reference to PanicInfo and PanicMessage #129491

Pass fmt::Arguments by reference to PanicInfo and PanicMessage #129491

Conversation

StackOverflowExcept1on commented Aug 23, 2024 • edited Loading

StackOverflowExcept1on commented Aug 23, 2024

This comment has been minimized.

bors commented Aug 23, 2024

workingjubilee commented Aug 24, 2024

This comment has been minimized.

bors commented Aug 24, 2024

bors commented Aug 24, 2024

This comment has been minimized.

rust-timer commented Aug 24, 2024

Overall result: no relevant changes - no action needed

StackOverflowExcept1on commented Aug 24, 2024 • edited Loading

workingjubilee commented Aug 24, 2024 • edited Loading

saethlin commented Aug 24, 2024

StackOverflowExcept1on commented Aug 25, 2024 • edited Loading

Amanieu commented Aug 26, 2024

tgross35 commented Aug 27, 2024

StackOverflowExcept1on commented Aug 27, 2024 • edited Loading

workingjubilee commented Aug 28, 2024

workingjubilee commented Aug 28, 2024 • edited Loading

StackOverflowExcept1on commented Aug 28, 2024 • edited Loading

m-ou-se commented Sep 5, 2024

StackOverflowExcept1on commented Sep 5, 2024

StackOverflowExcept1on commented Sep 5, 2024 • edited Loading

m-ou-se commented Sep 18, 2024

bors commented Sep 18, 2024

bors commented Sep 18, 2024

bors commented Sep 18, 2024

rust-timer commented Sep 18, 2024

Overall result: no relevant changes - no action needed

Pass `fmt::Arguments` by reference to `PanicInfo` and `PanicMessage` #129491

Pass `fmt::Arguments` by reference to `PanicInfo` and `PanicMessage` #129491

StackOverflowExcept1on commented Aug 23, 2024 •

edited

Loading

StackOverflowExcept1on commented Aug 24, 2024 •

edited

Loading

workingjubilee commented Aug 24, 2024 •

edited

Loading

StackOverflowExcept1on commented Aug 25, 2024 •

edited

Loading

StackOverflowExcept1on commented Aug 27, 2024 •

edited

Loading

workingjubilee commented Aug 28, 2024 •

edited

Loading

StackOverflowExcept1on commented Aug 28, 2024 •

edited

Loading

StackOverflowExcept1on commented Sep 5, 2024 •

edited

Loading