Use BOLT in CI to optimize `librustc_driver` #102487

Kobzol · 2022-09-29T21:12:17Z

Based on #94381.

r? @ghost

Kobzol · 2022-09-29T21:12:26Z

@bors try @rust-timer queue

rust-timer · 2022-09-29T21:12:27Z

Awaiting bors try build completion.

@rustbot label: S-waiting-on-perf

bors · 2022-09-29T21:12:35Z

⌛ Trying commit 543c574ff7b86b3e8641a701d73f0b0c90b85a5d with merge 23a2a394654d7fe2fe4228351ed8485dca6aec98...

bors · 2022-09-29T22:51:48Z

💔 Test failed - checks-actions

Kobzol · 2022-09-30T21:54:50Z

@bors try @rust-timer queue

rust-timer · 2022-09-30T21:54:51Z

Awaiting bors try build completion.

@rustbot label: S-waiting-on-perf

bors · 2022-09-30T21:54:59Z

⌛ Trying commit 799dd5fc0c85929b600f677e5a1df07c7df3343c with merge 9f86f5f6789c609097d1af2b3c7fb1762d175dc7...

bors · 2022-09-30T23:45:19Z

☀️ Try build successful - checks-actions
Build commit: 9f86f5f6789c609097d1af2b3c7fb1762d175dc7 (9f86f5f6789c609097d1af2b3c7fb1762d175dc7)

rust-timer · 2022-09-30T23:45:21Z

Queued 9f86f5f6789c609097d1af2b3c7fb1762d175dc7 with parent 877877a, future comparison URL.

rust-timer · 2022-10-01T02:42:32Z

Finished benchmarking commit (9f86f5f6789c609097d1af2b3c7fb1762d175dc7): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: S-waiting-on-review -S-waiting-on-perf perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	2.2%	[0.2%, 6.9%]	13
Regressions ❌ (secondary)	1.8%	[0.2%, 6.2%]	75
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.0%	[-2.0%, -2.0%]	1
All ❌✅ (primary)	2.2%	[0.2%, 6.9%]	13

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	3.1%	[1.0%, 6.4%]	137
Regressions ❌ (secondary)	4.0%	[0.8%, 7.7%]	141
Improvements ✅ (primary)	-3.7%	[-9.7%, -0.8%]	29
Improvements ✅ (secondary)	-5.1%	[-12.6%, -1.5%]	61
All ❌✅ (primary)	1.9%	[-9.7%, 6.4%]	166

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	5.3%	[4.9%, 5.6%]	3
Improvements ✅ (primary)	-2.6%	[-3.8%, -1.5%]	28
Improvements ✅ (secondary)	-3.1%	[-5.8%, -2.2%]	9
All ❌✅ (primary)	-2.6%	[-3.8%, -1.5%]	28

the arithmetic mean of the percent change ↩ ↩² ↩³
number of relevant changes ↩ ↩² ↩³

Kobzol · 2022-10-01T20:08:29Z

The cycle results are not bad, but nothing too impresive sadly. Maybe this isn't really worth it. I think that the Google team that tried this has similar experiences. I wonder if BOLT is less effective for Rust libraries vs C/C in general.

Another reason might be that BOLT is more effective for binaries than for shared libraries. I'll try to statically link rustc and apply BOLT to it, to see what happens.

joshtriplett · 2022-10-11T15:11:26Z

Another reason might be that BOLT is more effective for binaries than for shared libraries. I'll try to statically link rustc and apply BOLT to it, to see what happens.

I'd love to see results for statically linking rustc in general, to see how much benefit that provides.

lqd · 2022-10-11T15:24:36Z

There are results of a few different variants involving static linking in #97154

Kobzol · 2022-10-24T11:53:36Z

@bors try @rust-timer queue

rust-timer · 2022-10-24T11:53:38Z

Awaiting bors try build completion.

@rustbot label: S-waiting-on-perf

bors · 2023-09-24T12:35:15Z

⌛ Trying commit bcff055 with merge 8debfc8...

Use BOLT in CI to optimize `librustc_driver` Based on rust-lang#94381. r? `@ghost`

bors · 2023-09-24T14:44:40Z

☀️ Try build successful - checks-actions
Build commit: 8debfc8 (8debfc87c56cd261062f8dc42431dbc18d6f9472)

rust-timer · 2023-09-24T17:22:48Z

Finished benchmarking commit (8debfc8): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.5%	[-3.5%, -3.5%]	1
Improvements ✅ (secondary)	-1.7%	[-2.0%, -1.4%]	2
All ❌✅ (primary)	-3.5%	[-3.5%, -3.5%]	1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 629.545s -> 630.771s (0.19%)
Artifact size: 317.31 MiB -> 354.81 MiB (11.82%)

Kobzol · 2023-10-02T13:58:49Z

@bors try @rust-timer queue

bors · 2023-10-02T13:59:58Z

⌛ Trying commit 8d13015 with merge 9928903...

Use BOLT in CI to optimize `librustc_driver` Based on rust-lang#94381. r? `@ghost`

bors · 2023-10-02T16:21:12Z

☀️ Try build successful - checks-actions
Build commit: 9928903 (992890374d3eb8d36c30c7d4eea8fe911163d2f6)

rust-timer · 2023-10-02T17:42:51Z

Finished benchmarking commit (9928903): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	5.9%	[5.6%, 6.2%]	4
Regressions ❌ (secondary)	2.2%	[0.3%, 5.4%]	60
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.2%	[-0.2%, -0.2%]	1
All ❌✅ (primary)	5.9%	[5.6%, 6.2%]	4

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.6%	[0.7%, 2.0%]	11
Regressions ❌ (secondary)	2.9%	[2.4%, 3.7%]	13
Improvements ✅ (primary)	-2.7%	[-6.4%, -0.4%]	44
Improvements ✅ (secondary)	-3.5%	[-8.9%, -0.5%]	132
All ❌✅ (primary)	-1.8%	[-6.4%, 2.0%]	55

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.8%	[-2.6%, -0.9%]	61
Improvements ✅ (secondary)	-2.1%	[-2.4%, -1.7%]	9
All ❌✅ (primary)	-1.8%	[-2.6%, -0.9%]	61

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 628.193s -> 626.601s (-0.25%)
Artifact size: 273.37 MiB -> 309.08 MiB (13.06%)

Optimize `librustc_driver.so` with BOLT This PR optimizes `librustc_driver.so` on 64-bit Linux CI with BOLT. ### Code One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to: 1) Only pass it when we actually plan to use BOLT. How to best do that? `config.toml` entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done by `opt-dist`, therefore bootstrap doesn't know about it by default. 2) Only pass it to `librustc_driver.so` (see performance below). Some discussion of this flag already happened on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/Adding.20a.20one-off.20linker.20flag). ### Performance Latest perf. results can be found [here](rust-lang#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C libstd (?). Summary: - ✔️ `-1.8%` mean improvement in cycle counts across many primary benchmarks. - ✔️ `-1.8%` mean Max-RSS improvement. - ✖️ 34 MiB ( 48%) artifact size regression of `librustc_driver.so`. - This is caused by building `librustc_driver.so` with relocations (which are required for BOLT). Hopefully, it will be [fixed](https://discourse.llvm.org/t/bolt-rfc-a-new-mode-to-rewrite-entire-binary/68674) in the future with BOLT improvements, but now trying to reduce this size increase is [tricky](rust-lang#114649). - Note that the size of this file was recently reduced in rust-lang#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year. - ✖️ 1.4 MiB ( 53%) artifact size regression of `rustc`. - This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to `librustc_driver.so` (where they are needed) and for `rustc` (where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstrap `rustc` shim to only apply the relocation flag for the shared library and not for `rustc`. ### CI time CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and `librustc_driver` BOLT profile gathering at the same time (now they are gathered separately for LLVM and `librustc_driver`). r? `@Mark-Simulacrum` Also CC `@onur-ozkan,` primarily for the bootstrap linker flag issue.

…ulacrum Optimize `librustc_driver.so` with BOLT This PR optimizes `librustc_driver.so` on 64-bit Linux CI with BOLT. ### Code One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to: 1) Only pass it when we actually plan to use BOLT. How to best do that? `config.toml` entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done by `opt-dist`, therefore bootstrap doesn't know about it by default. 2) Only pass it to `librustc_driver.so` (see performance below). Some discussion of this flag already happened on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/Adding.20a.20one-off.20linker.20flag). ### Performance Latest perf. results can be found [here](rust-lang#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C libstd (?). Summary: - ✔️ `-1.8%` mean improvement in cycle counts across many primary benchmarks. - ✔️ `-1.8%` mean Max-RSS improvement. - ✖️ 34 MiB ( 48%) artifact size regression of `librustc_driver.so`. - This is caused by building `librustc_driver.so` with relocations (which are required for BOLT). Hopefully, it will be [fixed](https://discourse.llvm.org/t/bolt-rfc-a-new-mode-to-rewrite-entire-binary/68674) in the future with BOLT improvements, but now trying to reduce this size increase is [tricky](rust-lang#114649). - Note that the size of this file was recently reduced in rust-lang#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year. - ✖️ 1.4 MiB ( 53%) artifact size regression of `rustc`. - This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to `librustc_driver.so` (where they are needed) and for `rustc` (where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstrap `rustc` shim to only apply the relocation flag for the shared library and not for `rustc`. ### CI time CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and `librustc_driver` BOLT profile gathering at the same time (now they are gathered separately for LLVM and `librustc_driver`). r? `@Mark-Simulacrum` Also CC `@onur-ozkan,` primarily for the bootstrap linker flag issue.

Kobzol · 2023-10-15T08:34:58Z

Implemented in #116352.

Optimize `librustc_driver.so` with BOLT This PR optimizes `librustc_driver.so` on 64-bit Linux CI with BOLT. ### Code One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to: 1) Only pass it when we actually plan to use BOLT. How to best do that? `config.toml` entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done by `opt-dist`, therefore bootstrap doesn't know about it by default. 2) Only pass it to `librustc_driver.so` (see performance below). Some discussion of this flag already happened on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/Adding.20a.20one-off.20linker.20flag). ### Performance Latest perf. results can be found [here](rust-lang/rust#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C libstd (?). Summary: - ✔️ `-1.8%` mean improvement in cycle counts across many primary benchmarks. - ✔️ `-1.8%` mean Max-RSS improvement. - ✖️ 34 MiB ( 48%) artifact size regression of `librustc_driver.so`. - This is caused by building `librustc_driver.so` with relocations (which are required for BOLT). Hopefully, it will be [fixed](https://discourse.llvm.org/t/bolt-rfc-a-new-mode-to-rewrite-entire-binary/68674) in the future with BOLT improvements, but now trying to reduce this size increase is [tricky](rust-lang/rust#114649). - Note that the size of this file was recently reduced in rust-lang/rust#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year. - ✖️ 1.4 MiB ( 53%) artifact size regression of `rustc`. - This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to `librustc_driver.so` (where they are needed) and for `rustc` (where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstrap `rustc` shim to only apply the relocation flag for the shared library and not for `rustc`. ### CI time CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and `librustc_driver` BOLT profile gathering at the same time (now they are gathered separately for LLVM and `librustc_driver`). r? `@Mark-Simulacrum` Also CC `@onur-ozkan,` primarily for the bootstrap linker flag issue.

rustbot added T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. labels Sep 29, 2022

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 29, 2022

bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Sep 29, 2022

cuviper mentioned this pull request Sep 30, 2022

Slice fill_explicit rust-lang/libs-team#113

Closed

This comment has been minimized.

Sign in to view

Kobzol force-pushed the rustc-bolt branch from 8e8a4e9 to 799dd5f Compare September 30, 2022 21:54

rustbot added perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 1, 2022

Kobzol mentioned this pull request Oct 1, 2022

Use BOLT in CI to optimize LLVM #94381

Merged

8 tasks

Kobzol force-pushed the rustc-bolt branch from 799dd5f to 02bac2d Compare October 24, 2022 11:52

rustbot added the A-testsuite Area: The testsuite used to check the correctness of rustc label Oct 24, 2022

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 24, 2023

Auto merge of rust-lang#102487 - Kobzol:rustc-bolt, r=<try>

8debfc8

Use BOLT in CI to optimize `librustc_driver` Based on rust-lang#94381. r? `@ghost`

This comment has been minimized.

Sign in to view

rustbot removed S-waiting-on-perf Status: Waiting on a perf run to be completed. perf-regression Performance regression. labels Sep 24, 2023

Kobzol added 4 commits October 2, 2023 15:47

Use BOLT to optimize librustc_driver

e8e00f5

Build GCC with support for BOLT

b99761d

Specify directory for BOLT profiles

8bffefa

Optimize librustc_driver with BOLT

8d13015

Kobzol force-pushed the rustc-bolt branch from bcff055 to 8d13015 Compare October 2, 2023 13:55

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 2, 2023

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 2, 2023

Auto merge of rust-lang#102487 - Kobzol:rustc-bolt, r=<try>

9928903

Use BOLT in CI to optimize `librustc_driver` Based on rust-lang#94381. r? `@ghost`

This comment has been minimized.

Sign in to view

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 2, 2023

Kobzol mentioned this pull request Oct 2, 2023

Optimize librustc_driver.so with BOLT #116352

Merged

Kobzol closed this Oct 15, 2023

Kobzol deleted the rustc-bolt branch October 15, 2023 08:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use BOLT in CI to optimize `librustc_driver` #102487

Use BOLT in CI to optimize `librustc_driver` #102487

Kobzol commented Sep 29, 2022

Kobzol commented Sep 29, 2022

rust-timer commented Sep 29, 2022

bors commented Sep 29, 2022

bors commented Sep 29, 2022

This comment has been minimized.

Kobzol commented Sep 30, 2022

rust-timer commented Sep 30, 2022

bors commented Sep 30, 2022

bors commented Sep 30, 2022

rust-timer commented Sep 30, 2022

rust-timer commented Oct 1, 2022

Kobzol commented Oct 1, 2022

joshtriplett commented Oct 11, 2022

lqd commented Oct 11, 2022

Kobzol commented Oct 24, 2022

rust-timer commented Oct 24, 2022

bors commented Sep 24, 2023

bors commented Sep 24, 2023

This comment has been minimized.

rust-timer commented Sep 24, 2023

Kobzol commented Oct 2, 2023

This comment has been minimized.

bors commented Oct 2, 2023

bors commented Oct 2, 2023

This comment has been minimized.

rust-timer commented Oct 2, 2023

Kobzol commented Oct 15, 2023

Use BOLT in CI to optimize librustc_driver #102487

Use BOLT in CI to optimize librustc_driver #102487

Conversation

Kobzol commented Sep 29, 2022

Kobzol commented Sep 29, 2022

rust-timer commented Sep 29, 2022

bors commented Sep 29, 2022

bors commented Sep 29, 2022

This comment has been minimized.

Kobzol commented Sep 30, 2022

rust-timer commented Sep 30, 2022

bors commented Sep 30, 2022

bors commented Sep 30, 2022

rust-timer commented Sep 30, 2022

rust-timer commented Oct 1, 2022

Overall result: ❌ regressions - ACTION NEEDED

Footnotes

Kobzol commented Oct 1, 2022

joshtriplett commented Oct 11, 2022

lqd commented Oct 11, 2022

Kobzol commented Oct 24, 2022

rust-timer commented Oct 24, 2022

bors commented Sep 24, 2023

bors commented Sep 24, 2023

This comment has been minimized.

rust-timer commented Sep 24, 2023

Overall result: no relevant changes - no action needed

Kobzol commented Oct 2, 2023

This comment has been minimized.

bors commented Oct 2, 2023

bors commented Oct 2, 2023

This comment has been minimized.

rust-timer commented Oct 2, 2023

Overall result: ❌ regressions - ACTION NEEDED

Kobzol commented Oct 15, 2023

Use BOLT in CI to optimize `librustc_driver` #102487

Use BOLT in CI to optimize `librustc_driver` #102487