f16 generates code that uses the incorrect ABI for compiler-rt #123885
Comments
@rustbot label -needs-triage |
sounds like something is not compiling correctly and it's just reading completely wrong data ... |
the full code
#![feature(f16,f128)]
use std::mem::transmute;

#[inline(never)]
pub fn add(a: f16) -> f16 {
    10.0f16 + a
}

#[inline(never)]
pub fn x() -> u16 {
    let a = add(10.0);
    unsafe { transmute::<_, u16>(a) }
}
results in
The 10.0 for |
The LLVM IR contains the exact same constants for both cases...
source_filename = "example.1b362aa86f0e2f27-cgu.0"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define noundef half @_ZN7example3add17hb3d3e297b1262388E(half noundef %a) unnamed_addr #0 {
%_0 = fadd half %a, 0xH4900
ret half %_0
}
define noundef i16 @_ZN7example1x17h714cc7590e57d450E() unnamed_addr #0 {
%a = tail call noundef half @_ZN7example3add17hb3d3e297b1262388E(half noundef 0xH4900)
%_0 = bitcast half %a to i16
ret i16 %_0
}
attributes #0 = { mustprogress nofree noinline norecurse nosync nounwind nonlazybind willreturn memory(none) uwtable "probe-stack"="inline-asm" "target-cpu"="x86-64" }
!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}
!0 = !{i32 8, !"PIC Level", i32 2}
!1 = !{i32 2, !"RtLibUseGOT", i32 1}
!2 = !{!"rustc version 1.79.0-nightly (a07f3eb43 2024-04-11)"} |
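For reference, the `0xH4900` constant in the IR above is the IEEE 754 binary16 encoding of 10.0, so a correct `add(10.0)` should come back as the bits for 20.0, which is `0x4D00`. Here is a minimal sketch for decoding half-precision bit patterns on stable Rust. The helper name `half_bits_to_f32` is made up for illustration; this is not the compiler-rt or compiler-builtins implementation.

```rust
/// Decode an IEEE 754 binary16 bit pattern into an f32.
/// Hypothetical helper for illustration only.
fn half_bits_to_f32(bits: u16) -> f32 {
    let bits = bits as u32;
    let sign = (bits & 0x8000) << 16; // move the sign bit to bit 31
    let exp = (bits >> 10) & 0x1F;    // 5-bit biased exponent (bias 15)
    let frac = bits & 0x03FF;         // 10-bit fraction

    match exp {
        // Zero or subnormal: value = frac * 2^-24 (0x3380_0000 is 2^-24 as f32).
        0 => {
            let magnitude = frac as f32 * f32::from_bits(0x3380_0000);
            if sign != 0 { -magnitude } else { magnitude }
        }
        // Infinity or NaN: keep the payload, use the all-ones f32 exponent.
        0x1F => f32::from_bits(sign | 0x7F80_0000 | (frac << 13)),
        // Normal: rebias the exponent (127 - 15 = 112) and widen the fraction.
        _ => f32::from_bits(sign | ((exp + 112) << 23) | (frac << 13)),
    }
}

fn main() {
    // The constant passed to and added inside `add` in the IR.
    assert_eq!(half_bits_to_f32(0x4900), 10.0);
    // The bit pattern a correct 10.0 + 10.0 would produce.
    assert_eq!(half_bits_to_f32(0x4D00), 20.0);
    println!("0x4900 -> {}", half_bits_to_f32(0x4900));
}
```

Feeding the observed return values through the same helper is a quick way to confirm that they are not merely a rounding of the expected result.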
It seems like the bitcast is causing the weird codegen,
define dso_local half @add(half noundef %a) {
entry:
%a.addr = alloca half, align 2
store half %a, ptr %a.addr, align 2
%0 = load half, ptr %a.addr, align 2
%ext = fpext half %0 to float
%add = fadd float 1.000000e+01, %ext
%unpromotion = fptrunc float %add to half
ret half %unpromotion
}
declare void @llvm.dbg.declare(metadata, metadata, metadata) #1
define dso_local signext i16 @x() {
entry:
%a = alloca half, align 2
%b = alloca half, align 2
store half 0xH4900, ptr %a, align 2
%0 = load half, ptr %a, align 2
%call = call half @add(half noundef %0)
store half %call, ptr %b, align 2
%1 = load i16, ptr %b, align 2
ret i16 %1
}
.LCPI0_0:
.long 0x41200000 # float 10
add: # @add
push rbp
mov rbp, rsp
sub rsp, 16
pextrw eax, xmm0, 0
mov word ptr [rbp - 2], ax
pinsrw xmm0, word ptr [rbp - 2], 0
call __extendhfsf2@PLT
movss xmm1, dword ptr [rip + .LCPI0_0] # xmm1 = [1.0E+1,0.0E+0,0.0E+0,0.0E+0]
addss xmm0, xmm1
call __truncsfhf2@PLT
add rsp, 16
pop rbp
ret
.LCPI1_0:
.short 0x4900 # half 10
x: # @x
push rbp
mov rbp, rsp
sub rsp, 16
pinsrw xmm0, word ptr [rip + .LCPI1_0], 0
pextrw eax, xmm0, 0
mov word ptr [rbp - 2], ax
pinsrw xmm0, word ptr [rbp - 2], 0
call add
pextrw eax, xmm0, 0
mov word ptr [rbp - 4], ax
movsx eax, word ptr [rbp - 4]
add rsp, 16
pop rbp
ret
|
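The clang output above widens the `half` to `float` (`fpext`, lowered to `__extendhfsf2`), adds in single precision, and narrows back (`fptrunc`, lowered to `__truncsfhf2`). A rough software model of the narrowing step may help when checking results by hand. The name `f32_to_half_bits` is hypothetical, and the sketch assumes round-to-nearest-even and flushes results that would be subnormal to zero for brevity, which a real `__truncsfhf2` does not do.

```rust
/// Narrow an f32 to an IEEE 754 binary16 bit pattern, rounding to nearest even.
/// Hypothetical helper; results that would be subnormal are flushed to zero
/// (a simplification of this sketch).
fn f32_to_half_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    let sign = ((bits >> 16) & 0x8000) as u16;
    let exp = ((bits >> 23) & 0xFF) as i32;
    let mant = bits & 0x007F_FFFF;

    // Infinity or NaN: keep NaN-ness by setting the top fraction bit.
    if exp == 0xFF {
        let payload: u16 = if mant != 0 { 0x0200 } else { 0 };
        return sign | 0x7C00 | payload;
    }

    // Rebias the exponent from f32 (127) to binary16 (15).
    let half_exp = exp - 127 + 15;
    if half_exp >= 0x1F {
        return sign | 0x7C00; // too large: overflow to infinity
    }
    if half_exp <= 0 {
        return sign; // too small: flushed to signed zero in this sketch
    }

    // Keep the top 10 fraction bits, then round on the 13 dropped bits.
    let mut half: u32 = ((half_exp as u32) << 10) | (mant >> 13);
    let dropped = mant & 0x1FFF;
    if dropped > 0x1000 || (dropped == 0x1000 && (half & 1) == 1) {
        half += 1; // a carry into the exponent field is the correct behaviour
    }
    sign | half as u16
}

fn main() {
    // Same shape as the clang IR: add in f32, then narrow to half.
    let sum = 10.0f32 + 10.0f32;
    assert_eq!(f32_to_half_bits(sum), 0x4D00);
    println!("10.0 + 10.0 -> {:#06x}", f32_to_half_bits(sum));
}
```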
@rustbot label +A-LLVM |
It's an ABI issue with
#![feature(f16)]

#[inline(never)]
pub fn f16_as_f32_cast(x: f16) -> f32 {
    x as f32
}

#[inline(never)]
pub fn f16_as_f32_intrinsic(x: f16) -> f32 {
    extern "C" {
        fn __extendhfsf2(x: u16) -> f32;
    }
    unsafe { __extendhfsf2(std::mem::transmute(x)) }
}

fn main() {
    dbg!(f16_as_f32_cast(10.0));
    dbg!(f16_as_f32_intrinsic(10.0));
}
|
Oh, yuck, if I am understanding this correctly, it seems like compiler-rt uses a different ABI depending on whether or not SSE2 is enabled (llvm/llvm-project#56854). I guess that Rust must be building compiler-rt without SSE since the integer ABI works, but LLVM is expecting the +SSE rt that uses the float ABI. Seems like there isn't an easy way to force the integer ABI either, running with
Any idea what Rust could do here? Tangentially related #114479 |
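To make the suspected mismatch concrete, here is a toy model of it, assuming the x86-64 System V convention where the first integer argument travels in `edi` and the first floating-point argument in `xmm0`. Nothing here touches real registers: the `Regs` struct, the function names, and the leftover value in `edi` are all hypothetical. The point is only that a callee built for the integer ABI never looks at the register the float-ABI caller wrote to, so it converts whatever happened to be left in `edi`.

```rust
/// Toy register file: only the two registers relevant to this mismatch.
struct Regs {
    edi: u32,   // first integer argument register (System V x86-64)
    xmm0: u128, // first floating-point/vector argument register
}

/// Caller using the float ABI: the half goes in the low 16 bits of xmm0.
fn caller_passes_half_in_xmm0(regs: &mut Regs, half_bits: u16) {
    regs.xmm0 = (regs.xmm0 & !0xFFFF_u128) | half_bits as u128;
}

/// Callee built for the float ABI: reads the argument the caller wrote.
fn callee_reads_half_from_xmm0(regs: &Regs) -> u16 {
    regs.xmm0 as u16
}

/// Callee built for the integer ABI: reads the low 16 bits of edi instead.
fn callee_reads_half_from_edi(regs: &Regs) -> u16 {
    regs.edi as u16
}

fn main() {
    // Hypothetical leftover contents of edi from unrelated earlier code.
    for leftover in [0x0001_E240_u32, 0xDEAD_BEEF, 0x0000_0000] {
        let mut regs = Regs { edi: leftover, xmm0: 0 };
        caller_passes_half_in_xmm0(&mut regs, 0x4900); // 10.0 as binary16

        println!(
            "edi leftover {:#010x}: float-ABI callee sees {:#06x}, integer-ABI callee sees {:#06x}",
            leftover,
            callee_reads_half_from_xmm0(&regs),
            callee_reads_half_from_edi(&regs),
        );
    }
}
```

Under this model the result depends on stale register contents rather than on the argument, which would be consistent with the run-to-run variation reported in this issue.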
Defining the |
I've posted a PR that should fix this at rust-lang/compiler-builtins#593. |
Note that this bug affects stable Rust when C code that uses
// src/main.rs
fn main() {
    extern "C" {
        fn do_cast(ignored: u32, x: u16) -> f32;
    }
    dbg!(unsafe { do_cast(123456, 0x4900 /* 10.0_f16.to_bits() */) });
}
// src/code.c
float do_cast(unsigned int ignored, unsigned short x) {
    union { _Float16 f; unsigned short bits; } u = { .bits = x };
    return (float) u.f;
}
// build.rs
fn main() {
    cc::Build::new().file("src/code.c").compile("code");
}
# Cargo.toml
[package]
name = "example"
version = "0.1.0"
edition = "2021"
[build-dependencies]
cc = "1.0.94"
This example should print 10.0, but doesn't on current stable on x86_64 due to the incorrectly-compiled builtin from Rust's build of |
related issue: #118813 |
Disabling such basic ABI-relevant target features is wildly unsafe anyway and should probably be rejected by rustc. Luckily, on 64-bit targets LLVM catches this rather than generating code with a non-standard ABI. |
It seems to be worse, though, because the non-standard ABI here is the one the library is expecting. |
Title changed from "f16 seems non-deterministic" to "f16 generates code that uses the incorrect ABI for compiler-rt"
@rustbot label -T-libs |
This has been fixed by #125016. |
Amazing, thanks for fixing this! |
On the playground and godbolt, the following is non-deterministic:
Running it several times gives me different values, like 0x9200, 0x06e0, and 0xc000. The generated assembly is always the same:
Not really sure what would be going on here; it seems unlikely to be a bug in extend/trunc.
This shows up on both the playground (nightly 2024-04-12 7942405) and godbolt (2024-04-11), but I cannot reproduce locally (nightly 2024-04-12, aarch64-darwin).
Links: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=b261d218028b8bddfe769658c195f529, https://rust.godbolt.org/z/fT1xcMMWz
@rustbot label +F-f16_and_f128