Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Self-referential generators vs dereferenceable #381

Open
RalfJung opened this issue Dec 3, 2022 · 4 comments
Open

Self-referential generators vs dereferenceable #381

RalfJung opened this issue Dec 3, 2022 · 4 comments
Labels
A-dereferenceable Topic: when exactly does a reference need to point to regular dereferenceable memory?

Comments

@RalfJung
Copy link
Member

RalfJung commented Dec 3, 2022

Miri recently started checking the dereferenceable attribute of &mut !Unpin references by doing "fake reads" when such functions start executing. Interestingly, that makes this safe piece of code fail.

This makes sense, I think, for the same reason that we cannot do "fake reads" of &!Freeze references: when doing such reads, we might invalidate other aliasing references. In the case of this future, what happens is that

  • a reference to the field where Delay::new(1) is stored gets created
  • next time poll gets called, we do a "fake read" of the entire future, which invalidates the previously created reference (a noalias reference doesn't like other pointers being used for reads, that's kind of the point)
  • then we access the reference, causing UB

What I do not fully understand yet is why something like this is not sufficient to cause the issue.

How could we solve this? I am honestly not entirely sure, but see two avenues worth pursuing:

  • Figure out whether these "fake reads" are really needed to model dereferenceable. I think they are, because otherwise I don't think we get the right interaction between noalias and dereferenceable, but I asked the LLVM folks about this.
  • Remove dereferenceable from &mut !Unpin references. This is obviously correct but codegen/optimization people will probably be unhappy...

The latter point might be what we have to do, and somewhat mirrors the fact that we remove dereferenceable from &!Freeze references, but not really -- for shared references we thought dereferenceable_on_entry would still be a sound attribute to add, but with this problem it looks like that might not be the case (neither for &mut !Unpin nor for &!Freeze), at least if the "fake read" model of dereferenceable prevails.

The problematic code condenses to something like this:

struct NotUnpinType {
  delay: usize,
  delay_ref: &mut usize, // points to `delay`
}

fn poll(self: &mut NotUnpinType) {
  let fake_read = *self;
  self.delay_ref -= 1; // self.delay_ref is a noalias reference
  // so we should be able to reorder the last two lines, but that changes the value stored in fake_read.
}
@RalfJung
Copy link
Member Author

RalfJung commented Dec 3, 2022

I'll make Miri not complain about this, to avoid blocking other people on this (rust-lang/miri#2713) -- so playground might not reproduce these errors if you are trying this in the future.

@RalfJung
Copy link
Member Author

RalfJung commented Dec 3, 2022

What I do not fully understand yet is why something like this is not sufficient to cause the issue.

Interestingly, this is enough to trigger the problem. Somehow wrapping the reference in Pin actually adds more UB? Very strange.

EDIT: Any wrapper struct will do, it doesn't have to be Pin.

@RalfJung
Copy link
Member Author

RalfJung commented Dec 4, 2022

I think the fact that a wrapper struct is needed is due to the fact that add_retags "gives up" on direct assignments to places like (((*_8) as variant#3).1: &mut i32), where it is not sure that the place still points to the same thing after the assignment is done.

RalfJung pushed a commit to RalfJung/rust that referenced this issue Dec 5, 2022
for now, do not do fake reads on non-Unpin mutable references

Work-around for rust-lang/unsafe-code-guidelines#381, needed to make the new test pass. Undoes parts of rust-lang/miri#2694.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 5, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 5, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 8, 2022
make retagging work even with 'unstable' places

This is based on top of rust-lang#105301. Only the last two commits are new.

While investigating rust-lang/unsafe-code-guidelines#381 I realized that we would have caught this issue much earlier if the add_retag pass wouldn't bail out on assignments of the form `*ptr = ...`.

So this PR changes our retag strategy:
- When a new reference is created via `Rvalue::Ref` (or a raw ptr via `Rvalue::AddressOf`), we do the retagging as part of just executing that address-taking operation.
- For everything else, we still insert retags -- these retags basically serve to ensure that references stored in local variables (and their fields) are always freshly tagged, so skipping this for assignments like `*ptr = ...` is less egregious.
r? `@oli-obk`
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 8, 2022
make retagging work even with 'unstable' places

This is based on top of rust-lang#105301. Only the last two commits are new.

While investigating rust-lang/unsafe-code-guidelines#381 I realized that we would have caught this issue much earlier if the add_retag pass wouldn't bail out on assignments of the form `*ptr = ...`.

So this PR changes our retag strategy:
- When a new reference is created via `Rvalue::Ref` (or a raw ptr via `Rvalue::AddressOf`), we do the retagging as part of just executing that address-taking operation.
- For everything else, we still insert retags -- these retags basically serve to ensure that references stored in local variables (and their fields) are always freshly tagged, so skipping this for assignments like `*ptr = ...` is less egregious.
r? ``@oli-obk``
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 8, 2022
make retagging work even with 'unstable' places

This is based on top of rust-lang#105301. Only the last two commits are new.

While investigating rust-lang/unsafe-code-guidelines#381 I realized that we would have caught this issue much earlier if the add_retag pass wouldn't bail out on assignments of the form `*ptr = ...`.

So this PR changes our retag strategy:
- When a new reference is created via `Rvalue::Ref` (or a raw ptr via `Rvalue::AddressOf`), we do the retagging as part of just executing that address-taking operation.
- For everything else, we still insert retags -- these retags basically serve to ensure that references stored in local variables (and their fields) are always freshly tagged, so skipping this for assignments like `*ptr = ...` is less egregious.
r? ```@oli-obk```
RalfJung pushed a commit to RalfJung/miri that referenced this issue Dec 9, 2022
RalfJung pushed a commit to RalfJung/miri that referenced this issue Dec 9, 2022
make retagging work even with 'unstable' places

This is based on top of rust-lang/rust#105301. Only the last two commits are new.

While investigating rust-lang/unsafe-code-guidelines#381 I realized that we would have caught this issue much earlier if the add_retag pass wouldn't bail out on assignments of the form `*ptr = ...`.

So this PR changes our retag strategy:
- When a new reference is created via `Rvalue::Ref` (or a raw ptr via `Rvalue::AddressOf`), we do the retagging as part of just executing that address-taking operation.
- For everything else, we still insert retags -- these retags basically serve to ensure that references stored in local variables (and their fields) are always freshly tagged, so skipping this for assignments like `*ptr = ...` is less egregious.
r? ```@oli-obk```
@JakobDegen
Copy link
Contributor

Remove dereferenceable from &mut !Unpin references. This is obviously correct but codegen/optimization people will probably be unhappy...

I'm not 100% sure I understand what's going on here correctly, but I think this is the right thing to do. If LLVM dereferencable requires inserting a let _x = *p; then it's probably wrong to emit that for any SRW pointers.

In other words, our semantics after the recent revert seem fine to me. This isn't even particularly optimization unfriendly, LLVM is just failing to represent the right things...

bors pushed a commit to rust-lang-ci/rust that referenced this issue Feb 7, 2023
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 7, 2023
… r=nbdd0121

make &mut !Unpin not dereferenceable, and Box<!Unpin> not noalias

See rust-lang/unsafe-code-guidelines#381 and [this LLVM discussion](https://discourse.llvm.org/t/interaction-of-noalias-and-dereferenceable/66979). The exact semantics of how `noalias` and `dereferenceable` interact are unclear, and `@comex` found a case of LLVM actually exploiting that ambiguity for optimizations. I think for now we should treat LLVM `dereferenceable` as implying a "fake read" to happen immediately at the top of the function (standing in for the spurious reads that LLVM might introduce), and that fake read is subject to all the usual `noalias` restrictions. This means we cannot put `dereferenceable` on `&mut !Unpin` references as those references can alias with other references that are being read and written inside the function (e.g. for self-referential generators), meaning the fake read introduces aliasing conflicts with those other accesses.

For `&` this is already not a problem due to rust-lang#98017 which removed the `dereferenceable` attribute for other reasons.

Regular `&mut Unpin` references are unaffected, so I hope the impact of this is going to be tiny.

The first commit does some refactoring of the `PointerKind` enum since I found the old code very confusing each time I had to touch it. It doesn't change behavior.

Fixes rust-lang/miri#2714

EDIT: Turns out our `Box<!Unpin>` treatment was incorrect, too, so the PR also fixes that now (in codegen and Miri): we do not put `noalias` on these boxes any more.
bors pushed a commit to rust-lang/miri that referenced this issue Feb 7, 2023
bors added a commit to rust-lang/miri that referenced this issue Feb 7, 2023
make &mut !Unpin not dereferenceable, and Box<!Unpin> not noalias

See rust-lang/unsafe-code-guidelines#381 and [this LLVM discussion](https://discourse.llvm.org/t/interaction-of-noalias-and-dereferenceable/66979). The exact semantics of how `noalias` and `dereferenceable` interact are unclear, and `@comex` found a case of LLVM actually exploiting that ambiguity for optimizations. I think for now we should treat LLVM `dereferenceable` as implying a "fake read" to happen immediately at the top of the function (standing in for the spurious reads that LLVM might introduce), and that fake read is subject to all the usual `noalias` restrictions. This means we cannot put `dereferenceable` on `&mut !Unpin` references as those references can alias with other references that are being read and written inside the function (e.g. for self-referential generators), meaning the fake read introduces aliasing conflicts with those other accesses.

For `&` this is already not a problem due to rust-lang/rust#98017 which removed the `dereferenceable` attribute for other reasons.

Regular `&mut Unpin` references are unaffected, so I hope the impact of this is going to be tiny.

The first commit does some refactoring of the `PointerKind` enum since I found the old code very confusing each time I had to touch it. It doesn't change behavior.

Fixes #2714

EDIT: Turns out our `Box<!Unpin>` treatment was incorrect, too, so the PR also fixes that now (in codegen and Miri): we do not put `noalias` on these boxes any more.
@RalfJung RalfJung added the A-dereferenceable Topic: when exactly does a reference need to point to regular dereferenceable memory? label Jun 6, 2023
spikespaz pushed a commit to spikespaz/dotwalk-rs that referenced this issue Aug 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-dereferenceable Topic: when exactly does a reference need to point to regular dereferenceable memory?
Projects
None yet
Development

No branches or pull requests

2 participants