Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include frame pointer for bare metal RISC-V targets #61675

Merged
merged 1 commit into from
Jun 13, 2019

Conversation

fintelia
Copy link
Contributor

@fintelia fintelia commented Jun 8, 2019

This changes the default setting to enable the use of the frame pointer register when targeting RISC-V. On that architecture there is a dedicated frame pointer register which LLVM would otherwise never use so there is no increase in register pressure. Further, since these are bare metal targets, getting backtraces without the frame pointer is considerably more difficult (you can't just ask the OS to load the ELF executable and parse DWARF symbols). It is true that this setting can also be changed with the -C force-frame-pointers flag but that won't impact the compilation of the standard library, meaning that backtraces from, say, a panic handler would be useless.

@rust-highfive
Copy link
Collaborator

r? @matthewjasper

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 8, 2019
@Centril
Copy link
Contributor

Centril commented Jun 10, 2019

r? @nagisa

@nagisa
Copy link
Member

nagisa commented Jun 11, 2019

I’m trying to understand the motivation better here, please bear with me :)

On that architecture there is a dedicated frame pointer register which LLVM would otherwise never use so there is no increase in register pressure.

Looking at the LLVM code (here and here, it seems that LLVM will not reserve x8/s0/fp if a given frame does not need a frame pointer for any reason, which suggests that it will gladly use it for other purposes, unless I’m missing something. Is there any way to prove that LLVM indeed is unable to use x8 for other purposes even if frame pointer is not necessary?

Further, since these are bare metal targets, getting backtraces without the frame pointer is considerably more difficult (you can't just ask the OS to load the ELF executable and parse DWARF symbols).

My understanding and experience in embedded scene suggests that the ideal approach is indeed to load an ELF executable in your debugger and let it handle all sorts of mappings that may be necessary between the loaded ELF and binary code loaded into ROM. Reading this, it looks to me that you’re looking to generate stack traces internally (a-la RUST_BACKTRACE=1). What is the use-case for that in the bare-metal context?

It is true that this setting can also be changed with the -C force-frame-pointers flag but that won't impact the compilation of the standard library, meaning that backtraces from, say, a panic handler would be useless.

On the other hand, won’t landing this increase the size of generated code by default? Do RISC-V (unlike ARM) targets have an excess of storage for machine code?

cc @rkruppe

@fintelia
Copy link
Contributor Author

I'm happy to explain.

On that architecture there is a dedicated frame pointer register which LLVM would otherwise never use so there is no increase in register pressure.

Looking at the LLVM code (here and here, it seems that LLVM will not reserve x8/s0/fp if a given frame does not need a frame pointer for any reason, which suggests that it will gladly use it for other purposes, unless I’m missing something. Is there any way to prove that LLVM indeed is unable to use x8 for other purposes even if frame pointer is not necessary?

I might be completely wrong on this. My claim was based on compiling a bunch of Rust code and observing the output from rustc seemed to never include x8/s0/fp.

My understanding and experience in embedded scene suggests that the ideal approach is indeed to load an ELF executable in your debugger and let it handle all sorts of mappings that may be necessary between the loaded ELF and binary code loaded into ROM. Reading this, it looks to me that you’re looking to generate stack traces internally (a-la RUST_BACKTRACE=1). What is the use-case for that in the bare-metal context?

On the other hand, won’t landing this increase the size of generated code by default? Do RISC-V (unlike ARM) targets have an excess of storage for machine code?

Ah, I think this the source of confusion: You are thinking of tiny embedded chips with a tiny amount of ROM and even smaller amount of RAM, whereas I've been thinking about writing kernels for systems with gigabytes of RAM (like the HiFive Unleashed) which are powerful enough to run a full desktop OS.

@nagisa
Copy link
Member

nagisa commented Jun 12, 2019

Ah, I think this the source of confusion: You are thinking of tiny embedded chips with a tiny amount of ROM and even smaller amount of RAM, whereas I've been thinking about writing kernels for systems with gigabytes of RAM (like the HiFive Unleashed) which are powerful enough to run a full desktop OS.

Fair. It still seems to me that these targets might be used for embedded chips, but I have no idea if those even exist, so @bors r+

@bors
Copy link
Contributor

bors commented Jun 12, 2019

📌 Commit 493d1b4 has been approved by nagisa

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 12, 2019
@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Jun 12, 2019

RV32IM(A)C cores with tiny program memories for embedded applications certainly exist, I think they vastly outnumber the big linux-capable cores both in number of designs available and units shipped. I don't know whether that is reflected in the usage statistics of the Rust triples, but maybe it would be justified to keep frame pointers disabled at least on the smallest (riscv32imc) target? HiFive Unleashed is 64 bit anyway (with no way to run 32 bit code), so if @fintelia's targeting that, they wouldn't be affected even if they didn't use the riscv64gc triple (e.g. riscv64imac to disable hard float).

Centril added a commit to Centril/rust that referenced this pull request Jun 12, 2019
Include frame pointer for bare metal RISC-V targets

This changes the default setting to enable the use of the frame pointer register when targeting RISC-V. On that architecture there is a dedicated frame pointer register which LLVM would otherwise never use so there is no increase in register pressure. Further, since these are bare metal targets, getting backtraces without the frame pointer is considerably more difficult (you can't just ask the OS to load the ELF executable and parse DWARF symbols). It is true that this setting can also be changed with the `-C force-frame-pointers` flag but that won't impact the compilation of the standard library, meaning that backtraces from, say, a panic handler would be useless.
Centril added a commit to Centril/rust that referenced this pull request Jun 12, 2019
Include frame pointer for bare metal RISC-V targets

This changes the default setting to enable the use of the frame pointer register when targeting RISC-V. On that architecture there is a dedicated frame pointer register which LLVM would otherwise never use so there is no increase in register pressure. Further, since these are bare metal targets, getting backtraces without the frame pointer is considerably more difficult (you can't just ask the OS to load the ELF executable and parse DWARF symbols). It is true that this setting can also be changed with the `-C force-frame-pointers` flag but that won't impact the compilation of the standard library, meaning that backtraces from, say, a panic handler would be useless.
bors added a commit that referenced this pull request Jun 13, 2019
Rollup of 9 pull requests

Successful merges:

 - #60376 (Stabilize Option::xor)
 - #61398 (Stabilize copy_within)
 - #61629 (Hygienize macros in the standard library)
 - #61675 (Include frame pointer for bare metal RISC-V targets)
 - #61750 (Fix x.py install)
 - #61761 (Add an alias for x86_64-sun-solaris target tuple)
 - #61762 (rustbuild: fix libtest_stamp)
 - #61763 (ci: fix ci stats upload condition)
 - #61776 (Fix typos in error_codes)

Failed merges:

r? @ghost
@bors bors merged commit 493d1b4 into rust-lang:master Jun 13, 2019
@fintelia
Copy link
Contributor Author

@rkruppe What you are describing seems reasonable. Slightly longer term rust-lang/rfcs#2663 or similar should hopefully resolve the issue of having to pick only one set of compile options for the standard library of each target.

lenary added a commit to lenary/rust that referenced this pull request May 30, 2020
We have been seeing some very inefficient code that went away when using
`-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at
`-Oz` was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In rust-lang#61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in rust-lang#61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that `fp` is always used for the frame pointer. `fp` is
an ABI name for `x8` (aka `s0`), and if no frame pointer is required,
`x8` may be used for other callee-saved values.

This commit does ensure that the standard library is built with unwind
tables, so that users do not need to rebuild the standard library in
order to get a backtrace that includes standard library calls (which is
the original reason for forcing frame pointers).
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 17, 2020
…=hanna-kruppe,Mark-Simulacrum

[RISC-V] Do not force frame pointers

We have been seeing some very inefficient code that went away when using
`-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at
`-Oz` was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In rust-lang#61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in rust-lang#61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that `fp` is always used for the frame pointer. `fp` is
an ABI name for `x8` (aka `s0`), and if no frame pointer is required,
`x8` may be used for other callee-saved values.

---

I am partly posting this to get feedback from @fintelia who introduced the change to require frame pointers, and @hanna-kruppe who had issues with the original PR. I would understand if we wanted to remove this setting on only a subset of RISC-V targets, but my preference would be to remove this setting everywhere.

There are more details on the code size savings seen in Tock here: tock/tock#1660
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Jun 17, 2020
… r=hanna-kruppe,Mark-Simulacrum

[RISC-V] Do not force frame pointers

We have been seeing some very inefficient code that went away when using
`-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at
`-Oz` was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In rust-lang#61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in rust-lang#61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that `fp` is always used for the frame pointer. `fp` is
an ABI name for `x8` (aka `s0`), and if no frame pointer is required,
`x8` may be used for other callee-saved values.

---

I am partly posting this to get feedback from @fintelia who introduced the change to require frame pointers, and @hanna-kruppe who had issues with the original PR. I would understand if we wanted to remove this setting on only a subset of RISC-V targets, but my preference would be to remove this setting everywhere.

There are more details on the code size savings seen in Tock here: tock/tock#1660
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Jun 17, 2020
… r=hanna-kruppe,Mark-Simulacrum

[RISC-V] Do not force frame pointers

We have been seeing some very inefficient code that went away when using
`-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at
`-Oz` was compiled into a function which consisted entirely of saving
registers to the stack, then using the frame pointer to restore the same
registers (without any instructions between the prolog and epilog).

The RISC-V LLVM backend supports frame pointer elimination, so it makes
sense to allow this to happen when using Rust. It's not clear to me that
frame pointers have ever been required in the general case.

In rust-lang#61675 it was pointed out that this made reassembling
stack traces easier, which is true, but there is a code generation
option for forcing frame pointers, and I feel the default should not be
to require frame pointers, given it demonstrably makes code size worse
(around 10% in some embedded applications).

The kinds of targets mentioned in rust-lang#61675 are popular, but
should not dictate that code generation should be worse for all RISC-V
targets, especially as there is a way to use CFI information to
reconstruct the stack when the frame pointer is eliminated. It is also
a misconception that `fp` is always used for the frame pointer. `fp` is
an ABI name for `x8` (aka `s0`), and if no frame pointer is required,
`x8` may be used for other callee-saved values.

---

I am partly posting this to get feedback from @fintelia who introduced the change to require frame pointers, and @hanna-kruppe who had issues with the original PR. I would understand if we wanted to remove this setting on only a subset of RISC-V targets, but my preference would be to remove this setting everywhere.

There are more details on the code size savings seen in Tock here: tock/tock#1660
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants