Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to optimize out branches in drop function in unwind landing pad #87055

Closed
oxalica opened this issue Jul 11, 2021 · 8 comments · Fixed by #92419 or #102099
Closed

Unable to optimize out branches in drop function in unwind landing pad #87055

oxalica opened this issue Jul 11, 2021 · 8 comments · Fixed by #92419 or #102099
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@oxalica
Copy link
Contributor

oxalica commented Jul 11, 2021

In this code:

pub fn foo(g: fn()) {
    let _var: Option<Box<i32>> = None;
    g();
}

Assembly:

core::ptr::drop_in_place<core::option::Option<alloc::boxed::Box<i32>>>:
        mov     rdi, qword ptr [rdi]
        test    rdi, rdi
        je      .LBB0_1
        mov     esi, 4
        mov     edx, 4
        jmp     qword ptr [rip   __rust_dealloc@GOTPCREL]
.LBB0_1:
        ret

example::foo:
        push    rbx
        sub     rsp, 16
        mov     qword ptr [rsp   8], 0
        call    rdi
        add     rsp, 16
        pop     rbx
        ret
        mov     rbx, rax
        lea     rdi, [rsp   8]
        call    core::ptr::drop_in_place<core::option::Option<alloc::boxed::Box<i32>>>
        mov     rdi, rbx
        call    _Unwind_Resume@PLT
        ud2

DW.ref.rust_eh_personality:
        .quad   rust_eh_personality

_var is known to be None so the deallocation is optimized away in the normal flow. But in landing pad for unwinding, it still generates a branch to check if _var is null. But it's a local variable which is not able to be modified in anywhere else and should be known to be null.

I'm expecting the drop of _var should be completely elided so that g() can be a tail call.

@nikic
Copy link
Contributor

nikic commented Jul 11, 2021

I'd say the main problem here is that drop_in_place isn't inlined. If it were, it would be optimized away.

@nikic nikic added I-slow Issue: Problems and improvements with respect to performance of generated code. A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. labels Jul 11, 2021
@tmiasko
Copy link
Contributor

tmiasko commented Jul 11, 2021

#69715 tracks alternative MIR solution which is to track active enum variants during drop elaboration.

@nikic nikic added the A-codegen Area: Code generation label Jul 11, 2021
@nikic
Copy link
Contributor

nikic commented Jul 11, 2021

After taking a closer look, the reason it doesn't get inlined is that we explicitly mark it as noinline here:

bx.do_not_inline(llret);
Generally sensible, but backfires here.

Though argument promotion doesn't happen either, not sure why that is.

@oxalica
Copy link
Contributor Author

oxalica commented Jul 11, 2021

#69715 tracks alternative MIR solution which is to track active enum variants during drop elaboration.

It's aiming at enum variant which doesn't need drop. I'm not sure if it can also eliminate conditional de-allocation inside drop like drop(String::new()).

@nikic
Copy link
Contributor

nikic commented Jul 11, 2021

Though argument promotion doesn't happen either, not sure why that is.

Apparently because a bitcast is sitting before the load.

; not promoted
  %0 = bitcast i32** %_1 to {}**
  %1 = load {}*, {}** %0, align 8
; promoted
  %0 = load i32*, i32** %_1, align 8
  %1 = bitcast i32* %0 to {}*

@tmiasko
Copy link
Contributor

tmiasko commented Jul 11, 2021

#69715 tracks alternative MIR solution which is to track active enum variants during drop elaboration.

It's aiming at enum variant which doesn't need drop. I'm not sure if it can also eliminate conditional de-allocation inside drop like drop(String::new()).

The idea is to solve a data flow problem that answers which enum variants might reach a drop terminator. In the case here, we know that Option doesn't implement Drop, and that the only variant that might reach drop is None which doesn't contain anything that needs to be dropped either. Thus it would be valid to remove the drop terminator.

This is de-facto implemented as part of const validation which statically ensures that there are no live drops. Following const variant already compiles successfully (even though destructors cannot be evaluated at compile-time):

pub const fn foo() {
    let _var: Option<Box<i32>> = None;
    g();
}

pub const fn g() {}

@oxalica
Copy link
Contributor Author

oxalica commented Jul 11, 2021

we explicitly mark it as noinline here

Seems this issue has the same root cause as #46515.

And the LLVM issue of heavily inlining, mentioned in #41696, which let us mark drop in unwinding noinline, is still NOT fixed. https://bugs.llvm.org/show_bug.cgi?id=33072

The idea is to solve a data flow problem that answers which enum variants might reach a drop terminator.

My point is that the MIR approach cannot handle manual drop impl which conditional does nothing. This code cannot have drop eliminated without inlining.

pub fn foo(g: fn()) {
    let _var = String::new(); // This does not alloc.
    g();
}

@bors bors closed this as completed in 4f49627 Jan 1, 2022
@nagisa nagisa reopened this Feb 26, 2022
bors added a commit to rust-lang-ci/rust that referenced this issue Mar 1, 2022
Revert "Auto merge of rust-lang#92419 - erikdesjardins:coldland, r=nagisa"

Should fix (untested) rust-lang#94390

Reopens rust-lang#46515, rust-lang#87055

r? `@ehuss`
@bors bors closed this as completed in da4a392 Mar 27, 2022
@lqd
Copy link
Member

lqd commented Mar 28, 2022

bors also closing this issue in #95338 was unexpected, cc @nagisa @erikdesjardins.

@Noratrieb Noratrieb added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Apr 5, 2023
TaKO8Ki added a commit to TaKO8Ki/rust that referenced this issue Jun 27, 2023
…=nikic

Rebased: Mark drop calls in landing pads cold instead of noinline

I noticed that certain inlining optimizations were missing while staring at some compiled code output. I'd like to see this relanded, so I rebased the PR from ``@erikdesjardins`` (PR rust-lang#94823).

This PR reapplies rust-lang#92419, which was reverted in rust-lang#94402 due to rust-lang#94390.

Fixes rust-lang#46515, fixes rust-lang#87055.

Update: fixes rust-lang#97217.
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 2, 2023
…ikic

Rebased: Mark drop calls in landing pads cold instead of noinline

I noticed that certain inlining optimizations were missing while staring at some compiled code output. I'd like to see this relanded, so I rebased the PR from `@erikdesjardins` (PR rust-lang#94823).

This PR reapplies rust-lang#92419, which was reverted in rust-lang#94402 due to rust-lang#94390.

Fixes rust-lang#46515, fixes rust-lang#87055.

Update: fixes rust-lang#97217.
@bors bors closed this as completed in 2e5a9dd Oct 2, 2023
antoyo pushed a commit to rust-lang/rustc_codegen_gcc that referenced this issue Oct 8, 2023
Rebased: Mark drop calls in landing pads cold instead of noinline

I noticed that certain inlining optimizations were missing while staring at some compiled code output. I'd like to see this relanded, so I rebased the PR from `@erikdesjardins` (PR #94823).

This PR reapplies rust-lang/rust#92419, which was reverted in rust-lang/rust#94402 due to rust-lang/rust#94390.

Fixes rust-lang/rust#46515, fixes rust-lang/rust#87055.

Update: fixes #97217.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
6 participants