RFC: Alignment niches for references types. #3204
This RFC does not explain how references to the reference would be handled. That is, the following code must be well formed:
pub fn banana(val: &mut E) -> Option<&mut &u16> {
    match val {
        E::A(ref mut val) => Some(val),
        _ => None
    }
}
let reference: &mut &u16 = banana(&mut e).unwrap(); // `e` is some value of type `E`
static SOME_REF: u16 = 42;
*reference = &SOME_REF; // How does the compiler know it needs to copy over / remove the discriminant from alignment bits?
Right now the value returned by |
Under my proposal, either we have a It works exactly the same way as the null pointer optimization (take your example and replace |
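For comparison, here is a minimal sketch of how the same pattern already behaves today under the null pointer optimization, with `Option<&u16>` standing in for the RFC's enum. The names `inner_ref`, `overwrite_payload`, `ONE`, and `SOME_REF` are made up for illustration. The caller receives a plain `&mut &u16` pointing into the `Option`, and writing a new reference through it can never produce the `None` bit pattern, because every valid `&u16` is non-null.

```rust
static ONE: u16 = 1;
static SOME_REF: u16 = 42;

/// Hands out a mutable reference to the `&u16` stored inside the `Option`.
/// No discriminant has to be copied or masked: the payload *is* the whole value.
pub fn inner_ref<'a>(val: &'a mut Option<&'static u16>) -> Option<&'a mut &'static u16> {
    match val {
        Some(r) => Some(r),
        None => None,
    }
}

/// Overwrites the payload through the inner reference and returns the result.
pub fn overwrite_payload() -> Option<&'static u16> {
    let mut slot = Some(&ONE);
    if let Some(r) = inner_ref(&mut slot) {
        // Any valid `&u16` is non-null, so this write cannot turn `slot` into `None`.
        *r = &SOME_REF;
    }
    slot
}
```

The argument made in the thread is that alignment niches would work the same way: a valid `&u16` is always 2-aligned, so a write through `&mut &u16` could not produce a bit pattern reserved for another variant.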
- Should the `Aligned<T>` type be part of `stdlib`'s public API? This would allow library authors (like `hashbrown`) to also exploit these niches.
- "size niches" could also be introduced: e.g. `&[u8; 100]` has no alignment niches, but has size niches above `100.wrapping_neg()` (thanks scottmcm for the idea!)
- This somewhat complicates the validity invariant of `Aligned<T>`, is it acceptable?
- This forces `Weak<T>` to drop back to using a `NonNull<T>`, because the sentinel value is no longer a valid `Aligned<T>`.
`Weak<T>` could just be:
pub struct Weak<T>(WeakInner<T>);
enum WeakInner<T> {
    Sentinel = -1,
    Ptr(Aligned<T>),
}
sidestepping that issue.
I'd go so far as to say `Weak` is a perfect use case for this feature!
Oh, I didn't know this was possible. (but `WeakInner<T>` should wrap a `RcBox<T>`)
However, this relies on the fact that `-1` is always an invalid `Aligned<RcBox<T>>` value, whatever `T` is. This is true if "size niches" are implemented, but how does the compiler derive this automatically?
Put simply: if an explicit discriminant falls in the niche of the other value(s), the layout pass can niche it into that value.
(And it'd be `enum WeakRepr { Dangling = -1, Real(Aligned<RcBox<T>>) }` to ensure it uses the `RcBox` alignment. (I still wish we could unify the `RcBox` and `ArcInner` names to use the same naming convention.))
Additionally, we don't guarantee the value of the dangling weak sentinel, and I don't think we guarantee that `None::<Weak<_>>` is null, so the discriminant doesn't have to be specified, so long as a niche remains for `Option` to take.
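To make the trade-off concrete, here is a minimal sketch of the manual-sentinel approach that the thread wants to replace with a compiler-chosen niche. `ManualWeak` and `option_is_free` are hypothetical names for illustration only; this is not how `std::rc::Weak` is actually written, and it assumes `T: Sized` so the pointer is thin.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical weak-pointer stand-in that encodes "dangling" by hand.
pub struct ManualWeak<T> {
    ptr: NonNull<T>,
}

impl<T> ManualWeak<T> {
    /// Builds the dangling state from the all-ones address, used as a sentinel.
    pub fn dangling() -> Self {
        let addr = usize::MAX as *mut T;
        ManualWeak {
            ptr: NonNull::new(addr).expect("usize::MAX is non-null"),
        }
    }

    /// Wraps a real address.
    pub fn from_ref(r: &T) -> Self {
        ManualWeak { ptr: NonNull::from(r) }
    }

    /// The sentinel check has to be written and maintained by hand.
    pub fn is_dangling(&self) -> bool {
        self.ptr.as_ptr() as usize == usize::MAX
    }
}

/// `Option` still gets the null niche, since `NonNull` only excludes address 0.
pub fn option_is_free<T>() -> bool {
    size_of::<Option<ManualWeak<T>>>() == size_of::<ManualWeak<T>>()
}
```

With an `Aligned<RcBox<T>>` payload and an enum wrapper, the `is_dangling` check would become an ordinary `match`, and the layout pass would be responsible for picking a bit pattern that no aligned pointer can have.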
## `Aligned<T>`

To handle other smart pointers (`Arc<T>`, `Rc<T>`, `Weak<T>`), a new `std`-internal type `std::ptr::Aligned<T>` is introduced.
This really shouldn't be `std`-internal: there are lots of use cases for this in the ecosystem.
I agree in principle, but adding a new public type grows the scope of this RFC, because now we have to design an API to go with it. As long as `Aligned<T>` is internal, we can limit the API to the strict minimum.
To handle other smart pointers (`Arc<T>`, `Rc<T>`, `Weak<T>`), a new `std`-internal type `std::ptr::Aligned<T>` is introduced.

This type behaves mostly like `NonNull<T>`, but also requires that the wrapped pointer is well-aligned, enabling the use of the attributes presented in the previous section.
Pondering: Change `NonNull<T>` to be `NonNull<T, ALIGN = 1>`. So then you'd have `type Aligned<T> = NonNull<T, ALIGN = align_of::<T>()>;`.
Hmm, maybe that can't work, though, since there's already `NonNull<T>: From<&T>`, which would overlap with `Aligned<T>: From<&T>` for 1-aligned stuff...
I'm just thinking that this could be nice for other stuff, too, since it could have a `.read` method that would be able to be smart about `read` or `read_unaligned`, for example. (Or even just become a new intrinsic that would pass the actual alignment to the LLVM load/stores, rather than needing to pick between the "I promise it's normally-aligned" and "you have to treat it as 1-aligned" versions -- so you could still have an `Aligned<u8, 16>`, for example, if you got it from casting a simd address.)
This would be nice, to avoid duplicating the whole `NonNull<T>` API in a new `Aligned<T>` type.
Could the `From` overlap be resolved by having `NonNull<T, 0>: From<&T>`, and `Aligned<T>: From<&T>`?
If "size niches" exist, only `ALIGN=1` (and greater) would have them, to not break current `NonNull<T>` usage that uses `-1` as a manual sentinel value.
Now, does the compiler support the required features to do this in std? (We need an unstable defaulted const parameter, and some sort of const specialization to support both `From` implementations.)
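As a rough illustration of the invariant being discussed, a user-space approximation of `Aligned<T>` could look like the sketch below. It only checks alignment at construction time and does not give the compiler any extra niches, which is the part that needs language support. The method names (`new`, `as_ptr`, `spare_low_bits`) and the `Block` helper type are assumptions made for this example, not the RFC's API.

```rust
use std::mem::align_of;
use std::ptr::NonNull;

/// Sketch of a non-null *and* well-aligned pointer wrapper (no niche support).
pub struct Aligned<T> {
    ptr: NonNull<T>,
}

impl<T> Aligned<T> {
    /// Returns `None` if `ptr` is null or not a multiple of `align_of::<T>()`.
    pub fn new(ptr: *mut T) -> Option<Self> {
        let ptr = NonNull::new(ptr)?;
        if (ptr.as_ptr() as usize) % align_of::<T>() == 0 {
            Some(Aligned { ptr })
        } else {
            None
        }
    }

    /// Returns the wrapped raw pointer.
    pub fn as_ptr(&self) -> *mut T {
        self.ptr.as_ptr()
    }

    /// Number of low address bits that are always zero for a valid value.
    /// These are the bits an alignment niche could reuse.
    pub fn spare_low_bits() -> u32 {
        align_of::<T>().trailing_zeros()
    }
}

impl<'a, T> From<&'a T> for Aligned<T> {
    /// References are always non-null and aligned, so this conversion cannot fail.
    fn from(r: &'a T) -> Self {
        Aligned { ptr: NonNull::from(r) }
    }
}

/// Helper type with a fixed, platform-independent alignment of 8.
#[repr(align(8))]
pub struct Block(pub [u8; 8]);
```

For `Block`, `spare_low_bits()` is 3, meaning addresses 1 through 7 (and any other non-multiple of 8) are bit patterns that a valid `Aligned<Block>` can never hold.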
one possible way to spell |
That could actually be pretty elegant; it would remove the need for defining the new niches via custom attributes. |
With respect to So how about having a single |
Once you set whatever "well formed" means as Stable you can't add more requirements later or you potentially invalidate old code. Even new platforms should likely not add new requirements or the old code will not actually be portable to those platforms and it will silently break. You could gate the effects of WellFormed by edition, and then rustfix would know that it can't touch that code and just give an error that it needs to be adjusted manually to re-check invariants. Other than that it's a fine plan. |
Good point. But it also applies to modifying the ABI of regular references, which this proposal does without much restraint. In practice I wouldn’t expect |
This isn't possible without a breaking change; for example, the following code is currently sound on all targets, independently of any reserved pointer ranges: let ptr = 0xFEDCBA9876543210 as usize as *const ();
let zst: &() = unsafe { &* ptr }; This fact is also required for It may be enough to special-case references to ZSTs to not have these requirements, but this seems pretty ad-hoc. Another issue is that checking if a |
I don’t think this should be considered sound. If applied to ZSTs used as proof tokens, it would render them toothless, especially if they are More philosophically, you should not be allowed to pull references out of thin air just because you don’t need to actually access any memory when dereferencing them. To claim the existence of a value at a given address you have to have sound knowledge that memory has been allocated and a value has been placed there. I don’t think it makes any difference that the extent of the allocation is empty and ‘placing’ the value happens to be a no-op at the machine code level. Someone must have had put it there, at least conceptually. For such a claim to be valid, it has to point to a place where values can ever possibly reside. To put it differently, even if ZSTs can reside anywhere in the address space, it doesn’t mean the address space has to extend throughout the entire range of the Though on the other hand, singling out ZSTs wouldn’t necessarily be so unprincipled, given that we had proposals like #2040 and #2754. I suppose As for the simplicity argument, adding a zero page reservation would add only one parameter as well, the beginning of the address space: this would change |
A concern I'd have with Or perhaps not... In truth, I think I see far more uses for this than I see for |
You can pull a reference to a zero-sized Copy data type with a public constructor out of thin air (assuming it's at an aligned address). If the constructor isn't public, or the type isn't Copy, then you're potentially breaking someone's invariants and that's gonna be a bad time. |
@fstirlitz to be clear, conjuring a ZST via dereferencing an arbitrary nonnull well-aligned pointer is sound, as far as the language is concerned. This isn't changing. However, it is semantically identical to Where the unsoundness enters, is that it's Wrong ("library unsound") to violate privacy and create a type not through the APIs provided to construct it. If someone writes Back on the topic of I'd love to be proven wrong in the future, but I don't really see all that much value for a compiler-known, niche-optimized pointer type between "known nonnull and well aligned" ( |
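A small sketch of the "conjuring" pattern under discussion, restricted to the case described above as unproblematic: a zero-sized `Copy` type with a public constructor. `Marker` and `conjure_marker` are hypothetical names. The pointer comes from `NonNull::dangling()`, which is non-null and well-aligned but does not point at any allocation.

```rust
use std::ptr::NonNull;

/// Zero-sized, `Copy`, and publicly constructible, so no invariant is attached to it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Marker;

/// Produces a `&'static Marker` without any backing allocation.
pub fn conjure_marker() -> &'static Marker {
    // SAFETY: `Marker` is zero-sized, so the dereference reads no memory;
    // the dangling pointer is non-null and aligned for `Marker`.
    unsafe { NonNull::<Marker>::dangling().as_ref() }
}
```

Code like this is why restricting the valid address range of references to ZSTs would be a breaking change, and why a ZST carve-out comes up in the thread.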
There's certainly some value in having "all the invariants of a reference but stop bugging me about the lifetime". In other words, it's always aligned, and if the target location is still live then it does contain initialized data in a valid state. |
Last time I tried this, making the layout computation of |
I tried something similar last year, but instead of trying to make it depend on specific Ts I tried to implement
By doing a conservative layout analysis of T to see if we could be certain that it has a non-zero size (e.g. it being an aggregate that contains at least one scalar field, an enum with at least one such variant etc. etc.). Anything that would require another query was considered as potentially zero-sized. I got as far as compiling stage-1 I didn't investigate further whether that was a bug in LLVM or my attempt was generating nonsensical IR. I have no guess how easy or difficult it would be to fix that issue. A similar conservative approach might be possible for alignments by trying to determine the minimum knowable alignment for a type instead of the exact alignment. |
After some thinking, this actually seems a workable solution: add the
These three properties are enough to exploit alignment and size niches. One advantage of
The only issue is
I must admit I didn't think that far yet. 😃 I have never worked on |
Minor nit, |
I don't actually know for sure, but from what I've heard about how rustc's internals are structured, I'd expect the issue to not be that pub enum MyType {
A(WellFormed<MyType>),
B,
C,
} For |
As an extreme example, let's assume that we're on a 16-bit architecture (pointers are 16-bits) for simplicity: pub enum MyEnum1<'a> {
Ref(&'a MyEnum2<'a>),
Z = 0,
P1 = 1,
N1 = -1,
N2 = -2,
N3 = -3,
}
pub enum MyEnum2<'a> {
Ref(&'a MyEnum1<'a>),
Z = 0,
P1 = 1,
N1 = -1,
N2 = -2,
N3 = -3,
} There are two possible equally valid sets of layouts:
|
@programmerjake's example holds for size niches, but can a similar example be created for alignment niches? I don't think so... though this would still require splitting alignment and size calculation, so I think it could be strictly ordered alignment→niches→size, if only alignment and not size niches are considered. Even in the case of recursive requirements, it's solvable, if not optimally. Niche optimization is an optimization, so it's fine to miss an optimization in some cases (if not ideal). "Just" (yes it's more complicated) break cycles by picking some types and not giving their references any size/align niches. (In order to assure the computation is deterministic and gives the same answer no matter what part of the dependency cycle is queried first, it should be sufficient to pessimize types in a total ordering, or even just to pessimize the whole cycle.) This wouldn't ever give worse results than current, without the extra niches. |
They might as well be both 2 bytes: with variant |
As for |
As currently speced, the niche is only The example as given exhibits dual-minima with simple continuous range niching. The situation still exists with full alignment niche exploitation, even if it's significantly harder to hit -- just add a lot more variants -- and as such needs to be handled. |
I'd like to add a comment here about my experiments on the efficiency of niches. It all started because I wanted to implement the use of niches for enums with more data members (what is mentioned in the "Future possibilities" of this RFC). The PR was rust-lang/rust#70477, but the benchmarking saw some perf regressions in the compiler, and then I did not have the time to continue this experiment. Anyway, what's relevant for this RFC is that before enabling more niches, one should probably make sure that the niche optimization actually generates decent code (I summarized some possible improvements in rust-lang/rust#75866 (comment)).
Added a section on the thin DST issue, with a few possibilities on how to resolve it:
|
Making Custom DSTs, though, are useful because the language knows how to interact with them (mediated by the custom code of course). In order to be able to have e.g. This isn't the thread for custom potentially thin DSTs, so we needn't hash out how they're supposed to work here. I just want to note that treating custom thin DSTs like extern types removes any benefit of the feature. The option that's worth mentioning is requiring layout extraction to work even after the object is dropped. It's not the whole solution, but what interacts with |
Requiring layout extraction to work after drop isn't enough to make the layout extraction functions safe, it would also have to work after deallocation for the functions to be safe (as in not marked with |
If you have a |
What would be the correct place to discuss this? Should I open a new issue somewhere?
Consider the following:
fn noop_write<T: ?Sized>(v: &mut T) {
    let size = std::mem::size_of_val(v);
    let p = v as *mut T as *mut u8;
    // Safety:
    // - we have exclusive access, so writes are fine
    // - we leave the object in the same state we've found it.
    unsafe { std::ptr::copy(p, p, size) }
}
fn data_race(v: Arc<Mutex<ThinDst>>) {
    let v1 = v.clone();
    let _ = std::thread::spawn(move || {
        // Get exclusive access, and do a non-synchronized write to the whole value...
        let v: &mut ThinDst = &mut *v1.lock().unwrap();
        noop_write(v);
    });
    let _ = std::thread::spawn(move || {
        // ...but the `size_of_val` call races with the write by accessing the data inside the mutex!
        let mutex: &Mutex<ThinDst> = &*v;
        let _ = std::mem::size_of_val(mutex);
    });
}
hmm, I'm inclined to say that the code that created the |
I now remember the issue I had last time I tried to make use of alignment for niches. Consider the following function: unsafe fn transmute_opt_references<'a, T: 'a, U: 'a>(val: Option<&'a T>) -> &'a U {
std::mem::transmute(val)
} This currently typechecks and builds because the layout of Now, if we make I can't see a way to just split the |
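The property that this kind of generic code relies on today can be checked directly. The sketch below uses a hypothetical helper, `ref_sizes`, to show that the size of `Option<&T>` does not currently depend on `T` for sized types, since the only niche is the null address.

```rust
use std::mem::size_of;

/// Returns `(size_of::<&T>(), size_of::<Option<&T>>())` for a sized `T`.
pub fn ref_sizes<T: 'static>() -> (usize, usize) {
    (size_of::<&'static T>(), size_of::<Option<&'static T>>())
}
```

On current compilers `ref_sizes::<u8>()` and `ref_sizes::<u64>()` return the same pair. The null niche alone would keep that true, but once the niches available in `&T` depend on `align_of::<T>()`, the layout of an enum wrapping `&T` can no longer be computed without knowing `T`.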
You're in luck, we have this for However, that might not be enough: today you can replace But that might be covered by the usual "don't |
Just to be sure I understand correctly, @eddyb so you mean that |
@nox Yeah, so that it works for |
IMHO, this is entirely expected; after all, under this proposal we have:
enum E<T> {
    A(T),
    B,
    C,
}
assert_eq!(std::mem::size_of::<E<&u8>>(), 16);
assert_eq!(std::mem::size_of::<E<&u16>>(), 8);
So of course you can't transmute |
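For contrast with the sizes quoted above, this sketch measures what current compilers do (without any reference-niche option enabled). The helper name `sizes_today` is made up. Today `&u16` offers a single niche (null), which is enough for `Option` but not for an enum with two dataless variants, so `E<&u16>` needs a separate tag.

```rust
use std::mem::size_of;

pub enum E<T> {
    A(T),
    B,
    C,
}

/// Returns the sizes of `&u16`, `Option<&u16>`, and `E<&u16>` on the current target.
pub fn sizes_today() -> (usize, usize, usize) {
    (
        size_of::<&'static u16>(),
        size_of::<Option<&'static u16>>(),
        size_of::<E<&'static u16>>(),
    )
}
```

On a 64-bit target this gives 8, 8, and 16 today; the proposal would bring the last number down to 8 for `&u16`, because addresses 0 and 1 are both unusable for a 2-aligned reference.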
I think we still need to split some queries and not just layout and align. Consider this: #[repr(C)]
struct Foo<'a, T: 'a> {
some_int: usize,
some_ref: &'a T,
} We should be able to transmute Btw you mentioned |
I see, are you suggesting We should try messing with all of this just enough to do a crater run, like pretend (Arguably generic parameters are fine if only used inside e.g.
My intuition was that Or at least, I might be wrong today because the only way I can think of to have "niche vs tag" bump the alignment in the tag case is manually specified discriminants that force a larger-than-byte tag, but interfering with discriminants at all disables niche-based optimizations today IIRC (mostly to keep codegen simple). Either way, if we have |
I agree I can't think of one in current rust. I've pondered adding something like an Whether that's something we should have I don't know, but I think it's a thing where the "we have an underestimate of alignment and size" would work for the purposes of picking how big a niche to offer, even if computing the exact alignment is hard. |
- For fat pointers, the metadata is valid (as described in `mem::{size, align}_of_val_raw`).
- The pointer points to a region of memory that has suitable size and alignment (it must fit the layout returned by `Layout::from_value_raw`). There is no constraint placed upon the "liveness" of this memory; it may be deallocated while the `WellFormed<T>` exists, or already deallocated when the `WellFormed` is created.

### Thin DSTs
Quickly noting: I believe we should allow references to `extern` types to be dangling, such that external libraries can freely tag pointers.
I haven't read the details of the RFC, but building on the idea that `extern` types are special, perhaps we could just disallow alignment niches for them entirely?
This is what this section talks about, yes (even if it only mentions extern types in passing, I lump them together with thin DSTs because they are very similar).
In particular, I believe your proposition corresponds to the second "resolution":
- Make `WellFormed::{size, align}_of_val` unsafe, and treat thin DSTs as having `size: 0, align: 1` for the purposes of the `WellFormed` invariant.
Prototype: Add unstable `-Z reference-niches` option

MCP: rust-lang/compiler-team#641
Relevant RFC: rust-lang/rfcs#3204

This prototype adds a new `-Z reference-niches` option, controlling the range of valid bit-patterns for reference types (`&T` and `&mut T`), thereby enabling new enum niching opportunities. Like `-Z randomize-layout`, this setting is crate-local; as such, references to built-in types (primitives, tuples, ...) are not affected.

The possible settings are (here, `MAX` denotes the all-1 bit-pattern):

| `-Z reference-niches=` | Valid range |
|:---:|:---:|
| `null` (the default) | `1..=MAX` |
| `size` | `1..=(MAX-size)` |
| `align` | `align..=MAX.align_down_to(align)` |
| `size,align` | `align..=(MAX-size).align_down_to(align)` |

------

This is very WIP, and I'm not sure the approach I've taken here is the best one, but stage 1 tests pass locally; I believe this is in a good enough state to unleash this upon unsuspecting 3rd-party code, and see what breaks.
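The valid ranges listed for the `-Z reference-niches` settings can be worked through with a small calculation. The sketch below is not the compiler's implementation: `valid_range` and `align_down_to` are illustrative helpers, the two boolean flags stand in for the `size` and `align` settings, and addresses are assumed to be 64 bits wide.

```rust
use std::ops::RangeInclusive;

/// All-ones bit pattern, assuming 64-bit addresses.
const MAX: u64 = u64::MAX;

/// Rounds `addr` down to a multiple of `align` (which must be a power of two).
fn align_down_to(addr: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    addr & !(align - 1)
}

/// Computes the range of valid addresses for a reference to a type with the
/// given size and alignment, following the four rows of the settings table.
pub fn valid_range(size: u64, align: u64, use_size: bool, use_align: bool) -> RangeInclusive<u64> {
    // Lowest valid address: `align` when alignment niches are on, otherwise just non-null.
    let start = if use_align { align } else { 1 };
    // Highest valid address: leave room for `size` bytes when size niches are on.
    let mut end = if use_size { MAX - size } else { MAX };
    if use_align {
        end = align_down_to(end, align);
    }
    start..=end
}
```

For a `u16`-like type (size 2, alignment 2) with both settings on, the range is `2..=MAX - 3`, so addresses 0 and 1 at the bottom and the last three addresses at the top fall outside it, along with every odd address in between (which a contiguous range cannot express).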
what's the rationale for introducing a new std-internal type instead of just adding an additional validity constraint to the existing internal |
The optimization should be applicable to |
another alternative to the Aligned type is a rustc perma-unstable attribute, similar to the ones that power the NonZero and NonNull types. this way no library code needs adjusting, just add the attribute to every pointer type that must be aligned. |
Regarding the
This example is not strictly true - even only knowing that a type T has alignment 2, but without knowing its size, you can still know that I like the "opaqueness" of WellFormed (ie. that it makes no guarantees about what niches specifically are available) but I'm not so much a fan of the Another concern I had is the proliferation of pointer-like types - |
A type can have a size smaller than its alignment (only) if it is a zero-sized type. E.g. |
You're right, I missed that case - still "is a ZST" is easily known along with alignment using an algorithm that does not require computing the layout of any other types. |
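The ZST case is easy to check. In the sketch below, `AlignedZst`, `size_and_align`, and `size_below_align` are illustrative names. Because a type's size is always a multiple of its alignment, the size can only be smaller than the alignment when it is zero.

```rust
use std::mem::{align_of, size_of};

/// A zero-sized type with an explicitly raised alignment.
#[repr(align(8))]
pub struct AlignedZst;

/// Returns `(size, alignment)` of `T`.
pub fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

/// True only for zero-sized types, since size is a multiple of alignment.
pub fn size_below_align<T>() -> bool {
    size_of::<T>() < align_of::<T>()
}
```

`size_and_align::<AlignedZst>()` is `(0, 8)`, so a reference to it has alignment niches but no size niches.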
This RFC proposes to add niches to reference-like types by exploiting unused bit patterns caused by alignment requirements.
Rendered
Previous discussion on IRLO