- Feature Name: used
- Start Date: 2018-04-03
- RFC PR: rust-lang/rfcs#2386
- Rust Issue: rust-lang/rust#40289
Summary
Stabilize the #[used] attribute, which forces the compiler to keep static variables in the output object file even if no other part of the program references them.
Motivation
Bare metal applications, like kernels, bootloaders and other firmware, usually need precise control over the memory layout of the program. These programs usually need to place data structures like vector (interrupt) tables in certain memory locations for the system to operate properly.
The final memory layout of the program is decided by the linker; bare metal applications make use of linker scripts to control the placement of (linker) sections in memory. But for all this to work the vector table must be present in the object files passed to the linker. That’s where the #[used] attribute comes in: without it the compiler will optimize away the vector table, as it’s not directly used by the program, and it will never reach the linker.
It’s possible to work around the lack of the #[used] attribute by declaring the vector table as public:
// public items are exposed in the object file
#[link_section = ".vector_table.exceptions"]
pub static EXCEPTIONS: [extern "C" fn(); 14] = [/* .. */];
But this is brittle because the compiler can still optimize the symbol away when compiling with LTO enabled – with LTO the compiler has global knowledge about the program, and will see that EXCEPTIONS is unused by the program and discard it.
Yet another workaround is to force a volatile load of the vector table in some part of the program, usually before main. The compiler will always keep the vector table in this case, but this alternative incurs the cost of a load operation that will never be optimized away by the compiler.
use core::ptr; // needed for `ptr::read_volatile`

#[link_section = ".vector_table.exceptions"]
static EXCEPTIONS: [extern "C" fn(); 14] = [/* .. */];

// entry point of the firmware
fn reset() -> ! {
    extern "C" {
        // user entry point
        fn main() -> !;
    }

    // this operation will never be optimized away
    unsafe { ptr::read_volatile(&EXCEPTIONS[0]) };

    // calling a foreign function requires `unsafe`
    unsafe { main() }
}
The proper solution is to mark the vector table as a used variable, which forces the compiler to keep it in one of the emitted object files.
#[used] // will be present in the object file
#[link_section = ".vector_table.exceptions"]
static EXCEPTIONS: [extern "C" fn(); 14] = [/* .. */];
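The declaration above elides the table contents. A minimal sketch of a filled-in table follows, under some assumptions that are not part of the RFC: the handler `default_handler` and the counter `CALLS` are hypothetical names, and the section attribute is applied only when `target_os = "none"` (a common bare metal configuration) so that the sketch also builds on a hosted platform.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter, present only so the sketch has observable behavior
// when built for a hosted platform.
static CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical default handler; real firmware would typically park the core
// or trigger a reset here.
extern "C" fn default_handler() {
    CALLS.fetch_add(1, Ordering::SeqCst);
}

// `#[used]` asks the compiler to keep the table in the object file even
// though nothing in the program references it.
// Assumption: the custom section is requested only on bare metal targets
// (`target_os = "none"`), because section naming rules differ between
// hosted platforms.
#[used]
#[cfg_attr(target_os = "none", link_section = ".vector_table.exceptions")]
static EXCEPTIONS: [extern "C" fn(); 14] = [default_handler as extern "C" fn(); 14];

fn main() {
    // Deliberately does not touch EXCEPTIONS: the attribute, not a reference
    // from the program, is what keeps the table in the object file.
}
```

Whether the table ends up at the right address in the final image still depends on the linker script, which this sketch does not cover.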
Guide-level explanation
We can think of the compilation process performed by rustc as a two-stage process. First, rustc compiles a crate (source code) into object files; then rustc invokes the linker on those object files to produce a single executable, or a shared library (e.g. .so) if the crate type was set to “cdylib”.
The #[used] attribute can be applied to static variables to keep them in the object files produced by rustc, even in the presence of LTO. Note that this does not mean that the static variable will make its way into the binary file emitted by the linker, as the linker is free to drop symbols that it deems unused. In other words, the #[used] attribute does not affect the behavior of the linker.
Consider the following program:
#[used]
static FOO: u32 = 0;
static BAR: u32 = 0;
fn main() {}
The variable FOO marked with the #[used] attribute will be kept in the emitted object file regardless of the optimization level. On the other hand, the unused variable BAR is always optimized away.
$ cargo rustc -- --emit=obj # for simplicity incr. comp. has been disabled
$ nm -C $(find target -name '*.o')
(..)
0000000000000000 r foo::FOO
0000000000000000 t foo::main
0000000000000000 T std::rt::lang_start
(..)
$ cargo clean; cargo rustc --release -- --emit=obj
$ nm -C $(find target -name '*.o')
0000000000000000 T main
0000000000000000 r foo::FOO
0000000000000000 t foo::main
0000000000000000 T std::rt::lang_start
(..)
$ cargo clean; cargo rustc --release -- --emit=obj -C lto
$ nm -C $(find target -name '*.o')
(..)
0000000000000000 r foo::FOO
0000000000000000 t foo::main
(..)
FOO never makes it to the final executable because the linker sees that the call graph that stems from the user entry point main never makes use of FOO and discards it.
$ cargo clean; cargo build
$ nm -C target/debug/foo | grep FOO || echo not found
not found
To keep FOO in the final binary, assistance from the linker is required; this usually means writing a linker script.
Consider the following program:
#[used]
#[link_section = ".init_array"]
static FOO: extern "C" fn() = before_main;
extern "C" fn before_main() {
    println!("Hello")
}

fn main() {
    println!("World")
}
When dealing with ELF files the .init_array section will usually be kept in the final binary by the default linker script. If the system supports it, all function pointers stored in the .init_array section will be called before entering main. Thus, the above program prints “Hello” and then “World” to the console when run on a *nix system.
$ cargo run --release
Hello
World
$ nm -C target/release/foo | grep FOO
000000000026b620 t foo::FOO
If the #[used] attribute is removed from the source code then only “World” is printed to the console as the FOO variable will get optimized away by the compiler.
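The same mechanism can be observed without printing from the constructor. The sketch below is a variation on the program above and rests on assumptions that are not part of the RFC: the names `RAN_BEFORE_MAIN`, `INIT` and `ran_before_main` are hypothetical, and the section attribute is applied only on Linux, where the default linker script usually keeps .init_array.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical flag that records whether the constructor ran.
static RAN_BEFORE_MAIN: AtomicBool = AtomicBool::new(false);

extern "C" fn before_main() {
    // Only an atomic store: this early, the standard library may not be
    // fully initialized, so the sketch avoids I/O here.
    RAN_BEFORE_MAIN.store(true, Ordering::SeqCst);
}

// `#[used]` keeps the function pointer in the object file; the section
// attribute asks for it to be placed in `.init_array`.
// Assumption: only applied on Linux (an ELF platform); elsewhere the static
// is kept but never called.
#[used]
#[cfg_attr(target_os = "linux", link_section = ".init_array")]
static INIT: extern "C" fn() = before_main;

// Hypothetical helper reporting whether `before_main` has run.
fn ran_before_main() -> bool {
    RAN_BEFORE_MAIN.load(Ordering::SeqCst)
}

fn main() {
    println!("constructor ran before main: {}", ran_before_main());
}
```

On a Linux system the flag is expected to be set by the time main starts. On other platforms the static is still kept in the object file, but nothing calls it.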
Reference-level explanation
The #[used] attribute can only be used on static variables. Static variables marked with this attribute will be appended to the special @llvm.used global variable when lowered to LLVM IR.
#[used]
static FOO: u32 = 0;
fn main() {}
$ cargo clean; cargo rustc -- --emit=llvm-ir
$ grep llvm.used $(find -name '*.ll')
@llvm.used = appending global [1 x i8*] [i8* getelementptr inbounds (<{ [4 x i8] }>, <{ [4 x i8] }>* @_ZN3foo3FOO17hf0af6b03a826c578E, i32 0, i32 0, i32 0)], section "llvm.metadata"
The semantics of this operation are (quoting LLVM reference):
If a symbol appears in the @llvm.used list, then the compiler, assembler, and linker are required to treat the symbol as if there is a reference to the symbol that it cannot see (which is why they have to be named). For example, if a variable has internal linkage and no references other than that from the @llvm.used list, it cannot be deleted. This is commonly used to represent references from inline asms and other things the compiler cannot “see”, and corresponds to “attribute((used))” in GNU C.
(strikethrough of “and linker” added by the author)
The part about the linker is not true (*): from the point of view of the linker, static variables marked with #[used] look exactly the same as variables that have not been marked with that attribute – those are the implemented LLVM semantics. Also, ELF object files have no mechanism to prevent the linker from dropping their symbols if they are not referenced by other object files.
(*) unless “linker” is actually referring to llvm-link (?)
Drawbacks
This is yet another low level feature that alternative rustc implementations would have to implement to be 100% compatible with the official LLVM based rustc implementation. Also see #[repr(align = "*")], #[repr(*)], #[link_section], etc.
Rationale and alternatives
Chosen design
This design pretty much matches how C compilers implement this feature. See prior art section below.
Not doing this
Not doing this means that people will continue to use the brittle workarounds presented in the motivation section.
Prior art
Most compilers provide a feature with the exact same semantics, usually in the form of a “used” attribute (e.g. __attribute__((used))) that can be applied to static variables.
The following C code is an example from the KEIL toolchain documentation:
static int lose_this = 1;
static int keep_this __attribute__((used)) = 2; // retained in object file
static int keep_this_too __attribute__((used)) = 3; // retained in object file
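For comparison, a rough Rust counterpart of the C snippet could look like the sketch below. The static names simply mirror the C example and are illustrative only.

```rust
// Not referenced and not marked `#[used]`: the compiler is free to drop it.
#[allow(dead_code)]
static LOSE_THIS: i32 = 1;

// Marked `#[used]`: kept in the object file emitted by the compiler.
#[used]
static KEEP_THIS: i32 = 2;

// Marked `#[used]`: kept in the object file emitted by the compiler.
#[used]
static KEEP_THIS_TOO: i32 = 3;

fn main() {}
```

As in the C case, the attribute only affects what the compiler emits; whether the linker keeps the symbols is a separate matter.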
Unresolved questions
None so far.