Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate uninitialized in favor of a new MaybeUninit type #1892

Merged
merged 20 commits into from
Aug 19, 2018

Conversation

canndrew
Copy link
Contributor

@canndrew canndrew commented Feb 9, 2017

My thoughts on what to do with uninitialized and !.

Rendered

Tracking issue

Edit: This has been updated to instead recommend deprecating uninitialized entirely. The old Inhabited trait proposal is now listed as an alternative.

Edit: FCP Proposal is #1892 (comment) (in the collapsed-by-default part).

@canndrew
Copy link
Contributor Author

canndrew commented Feb 9, 2017

cc @nikomatsakis, @arielb1, @eddyb. I know y'all have some thoughts on what to do with all this.

@strega-nil
Copy link

Calling uninitialized is already extremely dangerous. This just seems like unnecessary complication. If you want to create a !, for some reason, whatever you're doing is probably UB anyways. At most, we might want to have a warning in the docs.

@canndrew
Copy link
Contributor Author

canndrew commented Feb 9, 2017

@ubsan, what about the catch_unwind example in the RFC? That's based on a compiler bug that got created when feature(never_type) got switched on in the standard library.

Without this, any (otherwise correct) code that uses uninitialized::<T> is landmine waiting to go off when someone tries to set T = !.

@nagisa
Copy link
Member

nagisa commented Feb 9, 2017

The compiler bug happened mostly because previously the T would become () due to ! not existing. I personally think this PR is going in the right direction (a-la Sized).


This trait is automatically implemented for all inhabited types.

Change the type of `uninitialized` to:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conflicts somewhat with Inhabited being an auto-trait (a-la Sized), or rather T is already Inhabited unless specified otherwise (i.e. T: ?Inhabited)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Does ! impl Sized? Does that even make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mark-i-m Yes, size_of::<!>() == 0, More generally though, for any given size every element of ! has that size. Similarly, any two elements of ! have the same size. These are both trivially true since ! has no elements.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are defining size_of::<T>() == n iff for all values v of T that v takes n bits to express, and since there are no values of T = !, this vacuously true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, yeah. There's two subtly different definitions you could give of Sized but ! satisfies both of them vacuously.

This could be a rather large breaking change depending on how many people are
currently calling `uninitialized::<T>` with a generic `T`. However all such
code is already somewhat future-incompatible as it will malfunction (or panic)
if used with `!`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Inhabited is auto-trait, like Sized, I do not see how that’s problematic in any way.

The author of the crate may expect this change to be private and its effects
contained to within the crate. But in making this change they've also stopped
exporting the `Inhabited` impl, causing potential breakages for downstream
users.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, that’s pretty much the same story with Sized. I.e. you cannot change pub struct Sized { a: [u8; 42] } to pub struct Sized { a: [u8] }.

@arielb1
Copy link
Contributor

arielb1 commented Feb 9, 2017

We pretty much decided in the design sprint to deprecate mem::uninitialized in favor of using a MaybeUninitialized<T> union in the standard library.

@canndrew
Copy link
Contributor Author

canndrew commented Feb 9, 2017

@arielb1 Ah okay. I figured that might too radical for now so I intended this RFC as a (possibly temporary) middle ground.

@canndrew
Copy link
Contributor Author

canndrew commented Feb 9, 2017

@nagisa Sorry if my wording is confusing. Inhabited isn't intended to be an auto-trait like Sized in the sense that you have to explicitly opt-out with ?Inhabited. That would create unnecessary restrictions on using uninhabited types. ! and Void should be usable pretty much anywhere, it's only really uninitialized which is problematic.

@nagisa
Copy link
Member

nagisa commented Feb 9, 2017

it's only really uninitialized which is problematic.

That’s not true. ptr::read (and probably a number of other functions) has the same problem as uninitialized and we ain’t deprecating that, so an auto-trait is still worthwhile IMO:

fn main() {
let x: *const ! = &0 as *const _ as *const _; // imagine this comes as an generic argument from somewhere
let z: ! = unsafe {
    ::std::mem::read(x) // boom bam blamma
};
}

@canndrew
Copy link
Contributor Author

canndrew commented Feb 9, 2017

Ah true, I hadn't thought of ptr::read. transmute has the same problem but I've made a seperate RFC for that. I just went through the list of intrinsics and I couldn't find any others which return a generic T without also taking one as an argument.

I still think ?Inhabited would make uninhabited types almost unusable unless people added the ?Inhabited bound everywhere. It seems simpler and a lot less restrictive to just put T: Inhabited on ptr::read as well.

match std::panic::catch_unwind(|| {
let val = f();
unsafe {
(*foo_ref).value = val;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is broken for types with Drop impls. It should be ptr::write(&mut (*foo_ref).value, val)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I thought we settled on different rules for overwriting union fields for some reason. Thanks for the catch!

@aturon aturon added the T-lang Relevant to the language team, which will review and decide on the RFC. label Feb 9, 2017
```

Yet calling this function does not diverge! It just breaks everything then eats
your laundry instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow, this seems preferable to folding my laundry at the moment...

if used with `!`.

Another drawback is that the `Inhabited` trait leaks private information about
types. Consider a type with the following definition:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced this is more serious than already-existing leaks... for example, you can already find out the size of a type. Is there any fundamental difference with this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, it's the same as Sized in this regard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that makes sense 😄

Ideally, Rust's type system should have a way of talking about initializedness
statically. In the past there have been proposals for new pointer types which
could safely handle uninitialized data. We should seriously consider pursuing
one of these proposals.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I totally agree! This would make static muts much more powerful while still being safe.

@ranma42
Copy link
Contributor

ranma42 commented Feb 10, 2017

@canndrew could you add the alternative from the design sprint? (add MaybeUninit<T> and deprecate mem::uninitialized)

I find it compelling, as it removes some magic from the compiler (no special checks or automatic traits for size/inhabitedness) and lets the type system just do "its thing".

@ranma42
Copy link
Contributor

ranma42 commented Feb 10, 2017

Sorry, I just realised the option of completely deprecating mem::uninitialized is mentioned as a future direction. While it is certainly possible to make this change in smaller steps, I think it might be very interesting to assess the potential disruption of the "direct" change (would a crater run be sufficient for this?). If deemed possible, I think that would be the best course of action.

@ranma42
Copy link
Contributor

ranma42 commented Feb 10, 2017

Are there any plans to do the same to mem::zeroed, either through this RFC or through another one? I found no relevant results searching through the RFCs repo.

@ghost
Copy link

ghost commented Feb 10, 2017

I still think ?Inhabited would make uninhabited types almost unusable unless people added the ?Inhabited bound everywhere. It seems simpler and a lot less restrictive to just put T: Inhabited on ptr::read as well.

I agree with what you're saying about ?Inhabited. I don't want that either. But then I think we'd need some kind of read_or_unreachable function without a bound anyway, otherwise Vec<T> will need an Inhabited bound (at least for remove and into_iter), which will infect a LOT of code. It's genuinely unreachable code to get to ptr::read from Vec<!> anyway because the only way to do it would be to push an uninhabited value onto it in the first place. Vec<!> is harmless, so having an Inhabited bound on it would be pointless.

I'd be (more) okay with Inhabited as a regular (compiler implemented) trait but it's completely unworkable to add it onto these functions as it would just break too much code, uninitialized, zeroed and read are all stable and there's probably a lot of code out there that calls them generically. I think the best we could do to make Inhabited as a trait work (though I don't think it will be used much, this is if you want the Inhabited trait in general rather than just for these cases) is to deprecate everything which should have the bound and then provide better alternatives: mem::MaybeInitialized<T> for mem::uninitialized and mem::zeroed. ptr::read_or_unreachable<T> and ptr::read_inhabited<T: Inhabited> for ptr::read/ptr::read_volatile/ptr::read_unaligned (those last two will become pretty verbose :(). Adding the bound (unless it's ?Inhabited) onto the functions is, unfortunately, a no-go because it will cause a lot of breakage.

Myself, I think the best route is:

  • Don't have an Inhabited trait, nor ?Inhabited. I don't think T: Inhabited is the best default to enforce because I think that cases where T = ! will cause issues are rare. It will lead to a lot of unnecessary restrictions on ! or will lead to ?Inhabited everywhere. Neither of those are ideal. Similarly, I don't think there are many cases where knowing that a type is inhabited is useful, but I'm not sure.
  • Deprecate uninitialized and zeroed and leave the signatures as they are. We could, perhaps, throw an error or lint at trans time if T = !, because it's highly likely to be a bug. I don't know if that's possible, but I think it's what is (was?) done for mem::transmute.
  • Add MaybeInitialized<T> to core::mem. Give it a zeroed constructor which calls memset on it. (maybe one day a const constructor by using [0u8; size_of::<T>()], but that's later). In the deprecation message for uninitialized and zeroed, direct them here.
  • Add a note onto ptr::read's docs that if T in uninhabited then the call must be unreachable.

@djzin
Copy link

djzin commented Feb 12, 2017

Could this issue be solved with ! having no size? If the size of a type is defined to be ceil(log(n)) where n is the number of possible representations, then ! should have undefined size, or, if you like, "negative infinity" size. The main issue with this is that empty enums are already defined as being sized... and also "negative infinity" is not representable in a usize... but just a thought I had ;)

@mark-i-m
Copy link
Member

mark-i-m commented Feb 12, 2017 via email

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 11, 2018

@ubsan That makes sense. I guess one would need to drop the object instead, the storage after the drop should be uninit, so there is no need to write uninit again to it, although I guess one could do so.

@RalfJung
Copy link
Member

RalfJung commented Aug 17, 2018

honestly I still do not understand why creating &mut to uninitialized memory is automatically undefined behaviour and not reading from it

Well, there are other people saying the exact opposite. ;) We have to make a choice either way, but for both cases we have people arguing that this is clearly the more intuitive option.

I don't see how we could allow &T to point to uninitialized memory without making a pretty big backwards incompatible language change: all code that expects a &T and doesn't guard against T potentially being uninitialized would be at risk at best. Could it become broken or incorrect? No idea.

No, there would be no incompatible change. As @ubsan mentioned, we have two invariants at play here: Which assumptions the compiler can make for its optimizations, and which assumptions safe code can make for values it sees. There is no a priori reason to think these are the same, and in fact I think it is rather impossible to make them the same. I have a blogpost upcoming for this that should hopefully be done no later than Monday...

But just one example: A safe higher-order function which takes an argument f: fn(&i32) -> &i32 can assume that f can be called with any shared reference. So following your "violating safe code assumptions is insta-UB", it would be UB to have a function which does not have this property. On the other hand, many libraries have private functions not marked unsafe that actually are de-facto unsafe because they make extra assumptions that are guaranteed by the surrounding module. So your proposal makes all those libraries UB.

There are other problems as well. For example, the invariant that may be assumed by safe code is impossible to check for because it is frequently not computable (as in, would require solving the halting problem). I think we should have a definition of UB that can, at least in principle, be checked.

japaric added a commit to rust-embedded/heapless that referenced this pull request Aug 19, 2018
bors bot added a commit to rust-embedded/heapless that referenced this pull request Aug 19, 2018
55: internally use MaybeUninit r=japaric a=japaric

which has been proposed in rust-lang/rfcs#1892

Co-authored-by: Jorge Aparicio <[email protected]>
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Aug 19, 2018
@rfcbot
Copy link
Collaborator

rfcbot commented Aug 19, 2018

The final comment period, with a disposition to merge, as per the review above, is now complete.

@Centril Centril merged commit 21f887f into rust-lang:master Aug 19, 2018
@Centril
Copy link
Contributor

Centril commented Aug 19, 2018

Huzzah! This RFC has been merged!

Tracking issue: rust-lang/rust#53491

XOSplicer pushed a commit to XOSplicer/heapless that referenced this pull request Aug 22, 2018
bors added a commit to rust-lang/rust that referenced this pull request May 20, 2019
stabilize core parts of MaybeUninit

and deprecate mem::uninitialized in the future (1.40.0). This is part of implementing rust-lang/rfcs#1892.

Also expand the documentation a bit.

This type is currently primarily useful when dealing with partially initialized arrays. In libstd, it is used e.g. in `BTreeMap` (with some unstable APIs that however can all be replaced, less ergonomically, by stable ones). What we stabilize should also be enough for `SmallVec` (Cc @bluss).

Making this useful for structs requires rust-lang/rfcs#2582 or a commitment that references to uninitialized data are not insta-UB.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.