Rollup of 7 pull requests #131581
Co-authored-by: Josh Stone <[email protected]>
Add intrinsics `fmuladd{f16,f32,f64,f128}`. This computes `(a * b) + c`, to be fused if the code generator determines that (i) the target instruction set has support for a fused operation, and (ii) that the fused operation is more efficient than the equivalent, separate pair of `mul` and `add` instructions.

https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic

Miri support is included for f32 and f64.

The codegen_cranelift uses the `fma` function from libc, which is a correct implementation, but without the desired performance semantics. I think this requires an update to cranelift to expose a suitable instruction in its IR.

I have not tested with codegen_gcc, but it should behave the same way (using `fma` from libc).
For an expression with attributes, like `let _ = (#[inline] || println!("Hello!"));`, the suggestion's span should contain the attributes, or the suggestion will remove them.

Fixes rust-lang#129833
…thlin intrinsics fmuladdf{32,64}: expose llvm.fmuladd.* semantics

Add intrinsics `fmuladd{f32,f64}`. This computes `(a * b) + c`, to be fused if the code generator determines that (i) the target instruction set has support for a fused operation, and (ii) that the fused operation is more efficient than the equivalent, separate pair of `mul` and `add` instructions.

https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic

The codegen_cranelift uses the `fma` function from libc, which is a correct implementation, but without the desired performance semantics. I think this requires an update to cranelift to expose a suitable instruction in its IR.

I have not tested with codegen_gcc, but it should behave the same way (using `fma` from libc).

---

This topic has been discussed a few times on Zulip and was suggested, for example, by `@workingjubilee` in [Effect of fma disabled](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/Effect.20of.20fma.20disabled/near/274179331).
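To make the "may be fused" wording concrete, here is a minimal sketch on stable Rust. It does not call the new intrinsics (they are unstable); instead it compares the two results that `fmuladd` is allowed to produce: the separately rounded `a * b + c`, and the singly rounded result of the existing `f64::mul_add`. The function names `mul_then_add` and `fused` and the chosen inputs are illustrative assumptions, not part of the PR.

```rust
/// Unfused: the product `a * b` is rounded to f64 before `c` is added.
fn mul_then_add(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}

/// Fused: `f64::mul_add` computes `a * b + c` with a single rounding.
fn fused(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

fn main() {
    // eps = 2^-27, so (1 + eps) * (1 - eps) = 1 - 2^-54 exactly.
    // That product is not representable in f64 and rounds to 1.0.
    let eps = 1.0 / (1u64 << 27) as f64;
    let (a, b, c) = (1.0 + eps, 1.0 - eps, -1.0);

    println!("unfused: {:e}", mul_then_add(a, b, c));
    println!("fused:   {:e}", fused(a, b, c));
}
```

With these inputs the unfused form loses the low-order term entirely, while the fused form keeps it. An intrinsic with `llvm.fmuladd` semantics may return either value, depending on what the code generator decides for the target.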
Migrate lib's `&Option<T>` into `Option<&T>`

Trying out my new lint rust-lang/rust-clippy#13336 - according to the [video](https://www.youtube.com/watch?v=6c7pZYP_iIE), this could lead to some performance and memory optimizations.

Basic thoughts expressed in the video that seem to make sense:
* `&Option<T>` in an API breaks encapsulation:
  * caller must own T and move it into an Option to call with it
  * if returned, the owner must store it as Option<T> internally in order to return it
* Performance is subject to compiler optimization, but at the basics, `&Option<T>` points to memory that has `presence` flag + value, whereas `Option<&T>` by specification is always optimized to a single pointer.
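The encapsulation and layout points above can be sketched as follows. The `Config` type, its `name` field, and the helper `describe` are hypothetical names used only for illustration; they do not come from the PR.

```rust
use std::mem::size_of;

/// Hypothetical type that happens to store its value as `Option<String>`.
struct Config {
    name: Option<String>,
}

impl Config {
    /// Before: the return type exposes that the field is an `Option<String>`.
    fn name_ref(&self) -> &Option<String> {
        &self.name
    }

    /// After: callers only learn "there may be a borrowed String".
    fn name(&self) -> Option<&String> {
        self.name.as_ref()
    }
}

/// Taking `Option<&String>` lets the caller pass a borrow of any String,
/// without first moving it into an `Option<String>`.
fn describe(label: Option<&String>) -> usize {
    label.map_or(0, |s| s.len())
}

fn main() {
    let cfg = Config { name: Some(String::from("abc")) };
    let standalone = String::from("hello");

    println!("{}", describe(cfg.name()));
    println!("{}", describe(cfg.name_ref().as_ref()));
    println!("{}", describe(Some(&standalone)));

    // For sized `T`, `Option<&T>` has the same size as `&T`.
    println!("{}", size_of::<Option<&String>>() == size_of::<&String>());
}
```

Converting in the other direction stays cheap: a caller that still holds an `&Option<T>` can call `.as_ref()` to obtain an `Option<&T>`.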
…tgross35 stabilize duration_consts_float Waiting for FCP in rust-lang#72440 to pass. `as_millis_f32` and `as_millis_f64` are not stable at all yet, so I moved their const-stability together with their regular stability (tracked at rust-lang#122451). Fixes rust-lang#72440
Support clobber_abi in MSP430 inline assembly

This supports `clobber_abi` which is one of the requirements of stabilization mentioned in rust-lang#93335.

Refs: Section 3.2 "Register Conventions" in [MSP430 Embedded Application Binary Interface](https://www.ti.com/lit/an/slaa534a/slaa534a.pdf)

cc ``@cr1901``

r? ``@Amanieu``

``@rustbot`` label +O-msp430
Make unused_parens's suggestion considering expr's attributes.

For an expression with attributes, like `let _ = (#[inline] || println!("Hello!"));`, the suggestion's span should contain the attributes, or the suggestion will remove them.

Fixes rust-lang#129833
…r=compiler-errors Remove deprecation note in the `non_local_definitions` lint

This PR removes the edition deprecation note emitted by the `non_local_definitions` lint. Specifically this part:

```
= note: this lint may become deny-by-default in the edition 2024 and higher, see the tracking issue <rust-lang#120363>
```

because it [didn't make the cut](rust-lang#120363 (comment)) for the 2024 edition.

`@rustbot` label +L-non_local_definitions
Flatten redundant test module `run_make_support::diff::tests::tests` This module is already named `tests`, and is already gated by `#[cfg(test)]`, so there's no need for it to also contain `mod tests`. r? jieyouxu
@bors r+ rollup=never p=7

Rollup of 7 pull requests

Successful merges:

- rust-lang#124874 (intrinsics fmuladdf{32,64}: expose llvm.fmuladd.* semantics)
- rust-lang#130962 (Migrate lib's `&Option<T>` into `Option<&T>`)
- rust-lang#131289 (stabilize duration_consts_float)
- rust-lang#131310 (Support clobber_abi in MSP430 inline assembly)
- rust-lang#131546 (Make unused_parens's suggestion considering expr's attributes.)
- rust-lang#131565 (Remove deprecation note in the `non_local_definitions` lint)
- rust-lang#131576 (Flatten redundant test module `run_make_support::diff::tests::tests`)

r? `@ghost`
`@rustbot` modify labels: rollup
💔 Test failed - checks-actions

MSVC failure
@bors retry
☀️ Test successful - checks-actions |
📌 Perf builds for each rolled up PR:
previous master: fb20e4d3b9

In the case of a perf regression, run the following command for each PR you suspect might be the cause:
Finished benchmarking commit (8f8bee4): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.

Next Steps:

@rustbot label: +perf-regression

Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

Max RSS (memory usage)
Results (primary 1.9%, secondary -2.7%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles
Results (secondary 7.7%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size
This benchmark run did not return any relevant results for this metric.

Bootstrap: 781.192s -> 779.421s (-0.23%)