-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracking Issue for RFC 2948: Portable SIMD #86656
Comments
This enables programmers to use a safe alternative to the current `extern "platform-intrinsics"` API for writing portable SIMD code. This is `#![feature(portable_simd)]` as tracked in rust-lang#86656
…crum pub use core::simd; A portable abstraction over SIMD has been a major pursuit in recent years for several programming languages. In Rust, `std::arch` offers explicit SIMD acceleration via compiler intrinsics, but it does so at the cost of having to individually maintain each and every single such API, and is almost completely `unsafe` to use. `core::simd` offers safe abstractions that are resolved to the appropriate SIMD instructions by LLVM during compilation, including scalar instructions if that is all that is available. `core::simd` is enabled by the `#![portable_simd]` nightly feature tracked in rust-lang#86656 and is introduced here by pulling in the https://github.com/rust-lang/portable-simd repository as a subtree. We built the repository out-of-tree to allow faster compilation and a stochastic test suite backed by the proptest crate to verify that different targets, features, and optimizations produce the same result, so that using this library does not introduce any surprises. As these tests are technically non-deterministic, and thus can introduce overly interesting Heisenbugs if included in the rustc CI, they are visible in the commit history of the subtree but do nothing here. Some tests **are** introduced via the documentation, but these use deterministic asserts. There are multiple unsolved problems with the library at the current moment, including a want for better documentation, technical issues with LLVM scalarizing and lowering to libm, room for improvement for the APIs, and so far I have not added the necessary plumbing for allowing the more experimental or libm-dependent APIs to be used. However, I thought it would be prudent to open this for review in its current condition, as it is both usable and it is likely I am going to learn something else needs to be fixed when bors tries this out. The major types are - `core::simd::Simd<T, N>` - `core::simd::Mask<T, N>` There is also the `LaneCount` struct, which, together with the SimdElement and SupportedLaneCount traits, limit the implementation's maximum support to vectors we know will actually compile and provide supporting logic for bitmasks. I'm hoping to simplify at least some of these out of the way as the compiler and library evolve.
i'm sorry if this is the wrong place to ask but im rather new to rust and stumbled upon this issues as my compiler told me to if i want to use this feature as soon as my compiler supports it can i gate it like: #[cfg(feature = "portable_simd")]
use std::simd::Simd; or is that only for feautures regarding my package (set in toml or passed to cargo?) if so what would be the appropriate way to use simd as soon as this issue is resolved? |
The It's a language feature not a cargo feature so it works a little differently. It's unfortunate that they're both just "feature". Rust is often too terse when it counts. |
ok thanks a lot! just to make sure this means there is no (easy*) way to use this language feature if my compiler supports it and fall back to a custom implementation otherwise? *easy as in compile time guards / attribute-like macros or creating a custom wrapper module that either provides rusts simd or my own fallback or something else in that level of skill for anyone else stumbling upon this: language features are (unstable) features you can opt-in when using nightly rust (by putting the specified flag in your library root, the whole project will then be compiled with a compiler that uses this feature) |
This worked for me:
where "from_slice" is the name of my the-other-kind-of-feature, defined in Cargo.toml, that uses portable_simd.
So, I run tests, for example, via |
Is this on the 2024 edition roadmap, or will it be only for after that? I know it’s not related, but gives me a timeline range. |
I don't think anyone has a specific timeline, but we still need to draft a new RFC and go through the approval process, which can take some time. |
Is there a particular reason that |
Like deref into a slice? Usually that's not done because it's a huge performance footgun. |
It may be good to document what that footgun is and why the choice was made because people will ask this again in future. |
So, to add more detail: the problem is that (depending on SIMD used) you can't in general index to a particular lane of a SIMD register. So if you view the SIMD data as a slice and operate on an element of the slice, what the hardware must do is have the CPU stop the current SIMD processing, write the register to the stack, work on the stack value (however the slice is adjusted), and then load that back into a SIMD register. This is, in general, a performance disaster. As usual, the optimizer might be able to cut out this stall in the pipeline, in some cases, depending on circumstances, etc etc. But you should expect that the SIMD handling is totally stalled when trying to treat the data as a slice. |
I figured there was a reason, but I'm not familiar with how SIMD works under the hood. Given that indexing is the problem, why implement |
Oh, uh, well I haven't looked in a while! I guess I'm out of the loop on the current API details. I'm surprised that Index is in if Deref is out. Either both should be in or both should be out, would be my expectation. |
The basic idea is that we want a clear marker of the boundary between SIMD and non-SIMD operations. When using Index ( |
Maybe we should just always make people convert to an array to index elements? |
A while ago we didn't implement Index and we got requests for it, but this is the first time Deref has come up, so I think it's a good compromise. Maybe it's not particularly obvious that Index is the boundary, but Deref is completely invisible without consulting the docs. |
There are certain types of instructions where the output data type is different from the input data type like: _mm256_maddubs_epi16. I don't think there is a way to do that in portable simd without casting first which is slower? Are there any plans to support these instructions. Similar instructions also exist on arch: vdotq_s32 |
Hi, I was wondering if there had been any discussion or consideration of making a dynamically sized api for vector operations. The current api seems to be analogous to arrays, but perhaps a more elegant and convenient solution would be analogous to slices. I learned about this idea when researching risc-v's vector extension. Both this article and this one (fully rendered here) are good references on the motivation, from the perspective of an ISA. While the current api is already much better than traditional simd instructions, it seems to me that the logical conclusion is a runtime sized type; maybe a wrapper around Hopefully this can spark a useful discussion on the best design of simd/vector types and operations. Thank you for your consideration. |
That could be some additional API that lives along aside the fixed sized SIMD types, but for the main CPU arches a fixed sized simd type is what generally works best with optimizations. |
as_simd: fix doc comment to be in line with align_to In rust-lang#121201, the guarantees about `align_offset` and `align_to` were changed. This PR aims to correct the doc comment of `as_simd` to be in line with the new `align_to`. Tagging rust-lang#86656 for good measure.
Rollup merge of rust-lang#127422 - greaka:master, r=workingjubilee as_simd: fix doc comment to be in line with align_to In rust-lang#121201, the guarantees about `align_offset` and `align_to` were changed. This PR aims to correct the doc comment of `as_simd` to be in line with the new `align_to`. Tagging rust-lang#86656 for good measure.
Curious how ARM SVE and RISC-V V are meant to be used in Rust. The fixed-length abstraction is a nice one, and it's what .NET is going with in .NET 9 ( |
RISC-V offers extensions like Edit: fix extension name |
Would it it make sense to add a family of functions like "load_base_*" that take a slice and an isize index? It would account for buffer underflow as well as overflow. With this you can write things such as convolution with nice clean loops that don't have account for edge cases.
|
Is it possible to move This will help to write generic code that works for different primitive types. Got this idea while writing a fixed index map data structure that is expected to work with unsigned integer keys regardless of the width. Without the trait bound for |
I think you should probably be able to do what you want: fn generic<T>(v: Simd<T, 4>, m: Mask<T::Mask, 4>) -> bool
where
T: SimdElement + Default,
Simd<T, 4>: SimdPartialEq<Mask = Mask<T::Mask, 4>>,
{
(v.simd_eq(Simd::splat(Default::default())) ^ m).all()
} However, it would be nice if there were an easier way to do this without requiring that extra bound. |
@calebzulawski , thank you! It worked along with a couple of bounds from Maybe an example with generic code will be a useful demo of bounds usage. |
Feature gate:
#![feature(portable_simd)]
This is a tracking issue for the future feature chartered in RFC 2977, with the intent of creating something akin to the design in RFC 2948 (rust-lang/rfcs#2948): a portable SIMD library (
std::simd
).Portable SIMD project group: https://github.com/rust-lang/project-portable-simd
Implementation: https://github.com/rust-lang/portable-simd
More discussion can be found in the #project-portable-simd zulip stream.
Steps
Unresolved Questions
Implementation History
[T]::as_simd(_mut)
#91479The text was updated successfully, but these errors were encountered: