Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double Drop on Rust beta #73137

Closed
pranjalssh opened this issue Jun 8, 2020 · 24 comments
Closed

Double Drop on Rust beta #73137

pranjalssh opened this issue Jun 8, 2020 · 24 comments
Labels
C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness ICEBreaker-Cleanup-Crew Helping to "clean up" bugs with minimal examples and bisections P-critical Critical priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@pranjalssh
Copy link

pranjalssh commented Jun 8, 2020

Getting double Drop/free errors on Rust. Sounds very dangerous. It started on 23 May rust nightly. Fails on beta as well, which is the next release.

A minimal repro https://bit.ly/2A52CJe
Dropping is printed twice. Flipping the order of elements in struct, makes it work and then drop is only called once.

Cc @asm89

@pranjalssh pranjalssh added the C-bug Category: This is a bug. label Jun 8, 2020
@jonas-schievink jonas-schievink added E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 8, 2020
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jun 8, 2020
@jonas-schievink jonas-schievink added the E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc label Jun 8, 2020
@jonas-schievink
Copy link
Contributor

@rustbot ping cleanup

@rustbot rustbot added the ICEBreaker-Cleanup-Crew Helping to "clean up" bugs with minimal examples and bisections label Jun 8, 2020
@rustbot
Copy link
Collaborator

rustbot commented Jun 8, 2020

Hey Cleanup Crew ICE-breakers! This bug has been identified as a good
"Cleanup ICE-breaking candidate". In case it's useful, here are some
instructions for tackling these sorts of bugs. Maybe take a look?
Thanks! <3

cc @AminArria @camelid @chrissimpkins @contrun @DutchGhost @elshize @ethanboxx @h-michael @HallerPatrick @hdhoang @hellow554 @imtsuki @kanru @KarlK90 @LeSeulArtichaut @MAdrianMattocks @matheus-consoli @mental32 @nmccarty @Noah-Kennedy @pard68 @PeytonT @pierreN @Redblueflame @RobbieClarken @RobertoSnap @robjtede @SarthakSingh31 @senden9 @shekohex @sinato @spastorino @turboladen @woshilapin @yerke

@pranjalssh
Copy link
Author

Possibly related to #71956 ?

@Darksonn
Copy link
Contributor

Darksonn commented Jun 8, 2020

Here is a simpler test case:

struct DD {
    a: String,
}

// DD is dropped twice
impl Drop for DD {
    fn drop(&mut self) {
        println!("Dropping!");
    }
}

struct Yolo {
    a: DD,
    b: Option<DD>,
}

async fn gg(x: &Yolo) {}

fn main() {
    futures::executor::block_on(async {
        let d2 = async { DD { a: "a".to_owned() } };

        // Swap the order here and it works
        let action = Yolo {
            b: None,
            a: d2.await,
        };
        gg(&action).await;
    });
}

playground

@lcnr
Copy link
Contributor

lcnr commented Jun 8, 2020

Note that the underlying problem is not a double free. We instead copy the value of a both into b and a, which changes b to Some(Yolo { .. }) which is then correctly dropped.

An example illustrating this is

struct Foo {
    a: usize,
    b: &'static u32,
}

fn main() {
    futures::executor::block_on(async {
        // Swap the order here and it works
        let action = Foo {
            b: &42,
            a: async { 0 }.await,
        };
        
        println!("{:p}", action.b);
        // ^ This prints `0x0`

        async {}.await;
    });
}

@pranjalssh
Copy link
Author

This would explain why my debugger replied with cannot access memory on action.b

@jonas-schievink jonas-schievink removed the E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example label Jun 8, 2020
@LeSeulArtichaut LeSeulArtichaut added P-critical Critical priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jun 8, 2020
@LeSeulArtichaut
Copy link
Contributor

Assigning P-critical as discussed as part of the Prioritization WG procedure.

@Aaron1011
Copy link
Member

Aaron1011 commented Jun 8, 2020

A self-contained version of @lcnr's example:

#![feature(wake_trait)]
use std::future::Future;
use std::task::{Waker, Wake, Context};
use std::sync::Arc;


struct DummyWaker;
impl Wake for DummyWaker {
    fn wake(self: Arc<Self>) {}
}

struct Foo {
    a: usize,
    b: &'static u32,
}

fn main() {
    let mut fut = Box::pin(async {
        // Swap the order here and it works
        let action = Foo {
            b: &42,
            a: async { 0 }.await,
        };
        
        println!("{:p}", action.b);
        // ^ This prints `0x0`

        async {}.await;
    });
    let waker = Waker::from(Arc::new(DummyWaker));
    let mut cx = Context::from_waker(&waker);
    fut.as_mut().poll(&mut cx);
}

@dtolnay
Copy link
Member

dtolnay commented Jun 8, 2020

FYI @ecstatic-morse @tmandry because #71956 was in the right commit range and seems related.

@Aaron1011
Copy link
Member

Aaron1011 commented Jun 8, 2020

The MIR for the example I posted contains the following two lines:

        ((((*(_1.0: &mut [static generator@src/main.rs:18:34: 29:6 for<'r> {std::future::ResumeTy, u32, &'r u32, [static generator@src/main.rs:22:22: 22:27 {}], impl std::future::Future, (), Foo, [static generator@src/main.rs:28:15: 28:17 {}], impl std::future::Future}])) as variant#4).0: Foo).0: usize) = move _4; // scope 0 at src/main.rs:20:22: 23:10
        ((((*(_1.0: &mut [static generator@src/main.rs:18:34: 29:6 for<'r> {std::future::ResumeTy, u32, &'r u32, [static generator@src/main.rs:22:22: 22:27 {}], impl std::future::Future, (), Foo, [static generator@src/main.rs:28:15: 28:17 {}], impl std::future::Future}])) as variant#4).0: Foo).1: &u32) = move (((*(_1.0: &mut [static generator@src/main.rs:18:34: 29:6 for<'r> {std::future::ResumeTy, u32, &'r u32, [static generator@src/main.rs:22:22: 22:27 {}], impl std::future::Future, (), Foo, [static generator@src/main.rs:28:15: 28:17 {}], impl std::future::Future}])) as variant#3).0: &u32); // scope 0 at src/main.rs:20:22: 23:10

The first line treans the generator enum as being in state variant4, but the second line attempts to read from variant3.

EDIT: Accesses using the 'wrong' variant appear to be intentional:

// Note that if a field is included in multiple variants, we will
// just use the first one here. That's fine; fields do not move
// around inside generators, so it doesn't matter which variant
// index we access them by.
remap.entry(locals[saved_local]).or_insert((tys[saved_local], variant_index, idx));

I suspect that the computed variant layouts are not actually compatible in this case.

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Jun 8, 2020

Bisected and confirmed this was introduced in #71956.

For background, that PR changed the dataflow analysis we use to mark which locals require storage in the generator state machine. I don't know yet whether the new approach was simply incorrect or it violated some assumption made by the MIR transformation later on.

In either case, the fact that this took 17 days to find is concerning. I think we should revert #71956 on beta even if we can identify a proper fix in the short-term.

I'm continuing to investigate, but help would be appreciated.

@ecstatic-morse ecstatic-morse removed the E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc label Jun 8, 2020
ecstatic-morse added a commit to ecstatic-morse/rust that referenced this issue Jun 8, 2020
@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Jun 9, 2020

After #71956, the generator transform no longer marks locals _3 (action) and _4 (&42) as live at the same time. Before this, they were marked as conflicting at the following MIR statement. I'm not sure which is correct in the abstract. Clearly we need to move out _4 before _3 becomes initialized, but I don't think MIR semantics for boundary conditions such as this one were ever formalized.

_3 = Foo { a: move _7, b: move _4 };

In any case, it seems the generator transform relied on this being a conflict. Because both _3 and _4 need to be saved in the generator's state machine since they live across yield points, the above statement is lowered to the following by the generator transform.

(((*(_1.0: &mut [static generator])) as variant#4].0: Foo) = Foo { a: move _7, b: move (((*(_1.0: &mut [static generator])) as variant#3).0: &u32) };

The place on the left-hand side ((generator as variant#4).0) and the one on the right-hand side ((generator as variant#3).0) are the fields reserved for _3 and _4 in the state machine. After #71956, these fields are allowed to overlap in the layout of the state machine. I believe assignments that alias in this way are ill-formed MIR: we're careful not to generate statements like *x = *x for example. Presumably this ultimately results in the miscompilation.

@tmandry, @jonas-schievink does this assessment seem correct? I'm less familiar with MIR lowering of generators than either one of you. If so, I think the correct solution once the revert lands is to introduce a temporary when lowering an assignment of one generator-saved local to another.

@tmandry
Copy link
Member

tmandry commented Jun 9, 2020

Yep, that sounds correct to me. Off the top of my head I can't say that the second example is definitely ill-formed MIR, but it seems very likely.

Using a temporary in these cases, as you suggested, ought to work. The other approach is to go back to being more conservative with the analysis (which seems less desirable, and possibly harder to implement after the changes from #71956.)

bors added a commit to rust-lang-ci/rust that referenced this issue Jun 9, 2020
Revert rust-lang#71956

...since it caused unsoundness in rust-lang#73137. Also adds a reduced version of rust-lang#73137 to the test suite. The addition of the `MaybeInitializedLocals` dataflow analysis has not been reverted, but it is no longer used.

Presumably there is a more targeted fix, but I'm worried that other bugs may be lurking. I'm not yet sure what the root cause of rust-lang#73137 is.

This will need to get backported to beta.

r? @tmandry
@jonas-schievink
Copy link
Contributor

Indeed, MIR constructs generally require its operands to not alias, and assignment statements are one of them. I plan to have them all documented and checked by the validation pass (-Zvalidate-mir), but it doesn't handle this case yet.

(I ran into similar issues with destination propagation, so this is somewhat reassuring – we might be able to share code that tells you about aliasing constraints of MIR constructs between destination propagation, the generator transform, and the validator this way)

@ecstatic-morse
Copy link
Contributor

The revert has landed, so this should be fixed on the latest mater.

This still needs a beta backport, however, so I'm leaving the issue open until one lands.

Thank you for reporting this @pranjalssh!

@Elinvynia Elinvynia added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jun 9, 2020
@LeSeulArtichaut LeSeulArtichaut removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jun 9, 2020
@RalfJung
Copy link
Member

Indeed, MIR constructs generally require its operands to not alias, and assignment statements are one of them. I plan to have them all documented and checked by the validation pass (-Zvalidate-mir), but it doesn't handle this case yet.

If this is the official stanza on MIR semantics, we should also make sure that Miri can catch this as UB. A static check will never be able to properly reflect dynamic properties such as "operands must not alias". Cc #68364

@spastorino
Copy link
Member

@ecstatic-morse given that this has been already reverted, I guess we may want to remove P-critical, right? and leave the issue open, probably with a different priority to find a better proper fix.

@ecstatic-morse
Copy link
Contributor

@spastorino I find it strange that the priority of an issue can change based on whether a fix is being worked on. If (god forbid) I get hit by a bus tomorrow and no one thinks to take on the work of backporting #73153, we would still need to block the release, no?

ecstatic-morse added a commit to ecstatic-morse/rust that referenced this issue Jun 11, 2020
This is to prevent the miscompilation in rust-lang#73137 from reappearing.
Only runs with `-Zvalidate-mir`.
@spastorino
Copy link
Member

@spastorino I find it strange that the priority of an issue can change based on whether a fix is being worked on. If (god forbid) I get hit by a bus tomorrow and no one thinks to take on the work of backporting #73153, we would still need to block the release, no?

@ecstatic-morse, maybe I didn't properly explain myself or I may be overlooking something. What I meant is that given that this problem was fixed by reverting the PR that has caused this regression, we can lower the priority of the issue now. My understanding is that after the revert the problem is fixed but we want to keep the issue open to find a better solution. Am I right? if so, this is why I consider that this is not P-critical anymore, given that the underlying problem is solved.

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Jun 11, 2020

If nothing were done from this point forward (i.e. no one backported #73153), this miscompilation would land on the next stable release. I didn't come up with the current labelling system, but that seems critical to me.

@Mark-Simulacrum
Copy link
Member

I do think it makes some sense perhaps to keep this P-critical while the backport isn't accepted; I'm not sure if @pnkfelix hasn't had time to accept it yet or if we need to discuss next T-compiler meeting or what.

@spastorino
Copy link
Member

@ecstatic-morse the revert won't hit stable because it was beta-nominated and it was accepted for backport during today's meeting. Anyway, we have also discussed if we should change the priority or not but I don't think we have clearly concluded anything.

In my opinion, we should remove the P-critical as soon as the PR that fixes the issue lands and the PR is also beta nominated, because by doing that we have a way to track it and won't be missed. It will be accepted and it will be backported. The downsides of not doing so is having more P-critical issues in our process, we need to review them frequently, add them to the agenda, discuss them during the meeting, etc but those are unactionable and it also makes the list of P-critical issues bigger.

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Jun 11, 2020

I don't have anything to add to my comments above besides the fact that I'm generally suspicious of open-source projects who change issue priority based on things besides severity. Ultimately it's up to y'all (meaning WG-prioritization) on how to handle release blocking issues that have been fixed on master but not beta.

Mark-Simulacrum pushed a commit to Mark-Simulacrum/rust that referenced this issue Jun 13, 2020
@ecstatic-morse
Copy link
Contributor

Fixed on beta in #73326

ecstatic-morse added a commit to ecstatic-morse/rust that referenced this issue Jun 19, 2020
This is to prevent the miscompilation in rust-lang#73137 from reappearing.
Only runs with `-Zvalidate-mir`.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Jun 23, 2020
…ir, r=tmandry

Check for assignments between non-conflicting generator saved locals

This is to prevent future changes to the generator transform from reintroducing the problem that caused rust-lang#73137. Namely, a store between two generator saved locals whose storage does not conflict.

My ultimate goal is to introduce a modified version of rust-lang#71956 that handles this case properly.

r? @tmandry
Manishearth added a commit to Manishearth/rust that referenced this issue Jun 23, 2020
…ir, r=tmandry

Check for assignments between non-conflicting generator saved locals

This is to prevent future changes to the generator transform from reintroducing the problem that caused rust-lang#73137. Namely, a store between two generator saved locals whose storage does not conflict.

My ultimate goal is to introduce a modified version of rust-lang#71956 that handles this case properly.

r? @tmandry
Manishearth added a commit to Manishearth/rust that referenced this issue Jun 23, 2020
…ir, r=tmandry

Check for assignments between non-conflicting generator saved locals

This is to prevent future changes to the generator transform from reintroducing the problem that caused rust-lang#73137. Namely, a store between two generator saved locals whose storage does not conflict.

My ultimate goal is to introduce a modified version of rust-lang#71956 that handles this case properly.

r? @tmandry
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness ICEBreaker-Cleanup-Crew Helping to "clean up" bugs with minimal examples and bisections P-critical Critical priority regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests