Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing destructors leak resources #14875

Closed
alexcrichton opened this issue Jun 13, 2014 · 35 comments
Closed

Failing destructors leak resources #14875

alexcrichton opened this issue Jun 13, 2014 · 35 comments
Labels
A-codegen Area: Code generation A-destructors Area: destructors (Drop, ..) A-mir Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. P-low Low priority

Comments

@alexcrichton
Copy link
Member

In this program, the local variable _a is leaked:

struct Foo;

impl Drop for Foo {
    fn drop(&mut self) { fail!(); }
}

fn main() {
    let _a = box 1;
    let _b = Foo;
}
@alexcrichton
Copy link
Member Author

Nominating, I feel uncertain about our story of failing destructors.

@thehydroimpulse
Copy link
Contributor

This was one of the motivations for the original algebraic effects systems (and the RFC I wrote). But that was a much bigger abstraction than merely fixing this scenario. I guess it was somewhat similar to C++'s nofail exception guarantee.

@brson
Copy link
Contributor

brson commented Jun 17, 2014

I'm a little shocked that this problem still exists.

@brson brson added P-high and removed I-nominated labels Jun 19, 2014
@brson
Copy link
Contributor

brson commented Jun 19, 2014

P-high, not 1.0

@ghost
Copy link

ghost commented Jun 29, 2014

So:

fn main() {
    let b = Bar;
    let f = Foo;
}

compiles down to:

define internal void @_ZN4main20hd8096a747d8d732eIaa4v0.0E() unnamed_addr #0 {
entry-block:
  %b = alloca %"struct.Bar<[]>"
  %f = alloca %"struct.Foo<[]>"
  %0 = getelementptr inbounds %"struct.Bar<[]>"* %b, i32 0, i32 0
  store i1 true, i1* %0
  %1 = getelementptr inbounds %"struct.Foo<[]>"* %f, i32 0, i32 0
  store i1 true, i1* %1
  call void @_ZN3Foo14glue_drop.114317h8e9a26d4f7ff5149E(%"struct.Foo<[]>"* %f)
  call void @_ZN3Bar14glue_drop.114517hb6d883694ad7fcd5E(%"struct.Bar<[]>"* %b)
  ret void
}

To me it looks like at least all but the last one should be Invoke instructions with a landing pad.

@nrc
Copy link
Member

nrc commented Sep 24, 2014

So the desired behaviour if we fail in a destructor is to return and execute any other destructors and then continue to fail? If that sounds right, this should just be a matter of adding landing pads for the dtor calls as @jakub- suggests, right? I assume that if one of the other destructors fails, then we will go down the double-fail path automatically?

@alexcrichton
Copy link
Member Author

That's what I would expect at least! We don't support more than double-failure at the library level, so we can probably avoid any more nesting after the first landing pad.

@thestinger
Copy link
Contributor

It would be memory unsafe to run nested destructors, so there's not much else that can be done.

@nrc nrc self-assigned this Oct 5, 2014
@nrc
Copy link
Member

nrc commented Oct 8, 2014

This is more complex than I thought. If we have n objects with cleanup, we need to introduce n landing pads, because any of the dtor calls might fail. For dtor i we need to do the remaining n-i cleanups, but any dtor calls don't have to be invokes. We are basically turning any scope into n scopes - one scope per object with cleanup. In the cleanup case we don't need to do this, since this first fail will be a double fail and crash - this is the n-i case above, but with i = 0, I guess.

This seems like a lot of extra code being generated :-(. I wonder if we should treat fail in a dtor like a double fail and just crash.

If we do want to try a bit harder, perhaps an implementation strategy is to have cleanup treat dtor calls (on normal exit of a scope) as belonging to their own scopes. I guess we could do some kind of transformation of the CleanupScope data structure.

@alexcrichton
Copy link
Member Author

One thing I've noticed is that our existing notion of landing pads in the compiler isn't quite what you want when you execute the drop glue. Right now a landing pad will never ever rejoin the original code path (they're completely divergent), but when invoking drop glue for local variables we want both the unwinding and the normal case to fall through to the next instruction. I think that this would inflate the debuginfo metadata, but I don't think that this would incur a runtime cost.

As in, I think that each local variable's destructor should be its own basic block, and then each basic block is terminated with an invoke where both edges go to the next destructor (and the final basic block at the end just has a return). The runtime would continue to hard abort on double failure, of course.

@thestinger
Copy link
Contributor

It would be memory unsafe to continue invoking the inner drop glue after failure. It is often assumed that the outer destructors succeeded. A lot of redesign would need to be done to cope with it and it would have a performance hit. Splitting into basic blocks will also have a performance hit because LLVM can't optimize across them well.

@alexcrichton
Copy link
Member Author

@thestinger, this is the second time you have said this, but can you give a concrete example as to where it would be memory unsafe?

@thestinger
Copy link
Contributor

When a type does something like setting vec.cap = 0 to prevent destruction after an operation that could fail. There are numerous examples in the standard library. There's already a lot of memory unsafety from failure under the current implementation but this would bring more problems. It doesn't seem to be something that's considered in most of the unsafe code.

@nrc
Copy link
Member

nrc commented Oct 8, 2014

I'm not sure we're talking about the same thing - it sounds like @thestinger is talking about nested dtors, which we would continue to abandon, whilst @alexcrichton and I are talking about 'sibling' dtors - e.g., if you have two variables in a scope and the dtor for the first fails, then continue to execute the dtor for the second. I guess this could be memory safe if the second has references to the first, but they can't be owning refs so I don't think the dtor should rely on them being consistent. Perhaps that is not strong enough though.

I did have the performance worry, but I also worry about code bloat.

@nrc nrc added the I-nominated label Oct 8, 2014
@nrc
Copy link
Member

nrc commented Oct 9, 2014

nominating since if we abort on fail in dtors that will be a backwards incompatible change

@nrc
Copy link
Member

nrc commented Oct 9, 2014

I discussed this a bit on irc with @alexcrichton, @pcwalton, and @sfackler. To summarise, there are two options really - either we terminate the process if we fail in a dtor or we treat each variable as having its own scope for destructors, as outlined above by @alexcrichton and I. To expand on the latter a bit - each dtor call would invoke rather than call but the normal and landing pad branches the normal return executes the next dtor. The landing pad would set a flag and then continue with the next dtor. A double fail would be handled only by the runtime, not in the landing pads. After executing all the cleanups we would check the flag mentioned before to see if we need to resume or do a normal return.

The cost of the properly failing dtors would be a bunch more basic blocks (may or may not be a cost, depending on LLVM optimisation), a branch at the end of any function with cleanups and a stack slot for a flag on these functions, and updating that flag after a failed dtor call.

The terminate on fail method has no runtime cost and is much simpler to implement, however, failing in a dtor is common enough that it might cause problems.

#8861 should prevent dtors seeing other objects in an inconsistent state, even after a failure.

@pcwalton
Copy link
Contributor

pcwalton commented Oct 9, 2014

I don't believe that aborting on failure in destructors is P-backcompat-lang since it causes havoc at the moment anyway, so we'd be within our rights to change it post 1.0.

@thestinger
Copy link
Contributor

It compiles and runs fine, so it's a backwards compatibility issue. Code that leaks resources still works fine in production (just look at Firefox and Chromium) and crashing instead would be a break of the contract with user code.

@pnkfelix
Copy link
Member

pnkfelix commented Oct 9, 2014

leaving at P-high, not 1.0.

We are choosing to leave open the option of adopting the "abort on fail! from within a dtor", because code that is faiing from within a dtor is breaking a linguistic contract. However, we are also unlikely to actually implement the "abort on fail! from within a dtor" option. We just are leaving that door open.

@brson brson added E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. and removed I-wrong P-medium Medium priority labels Aug 25, 2016
@briansmith
Copy link
Contributor

It compiles and runs fine, so it's a backwards compatibility issue. Code that leaks resources still works fine in production (just look at Firefox and Chromium) and crashing instead would be a break of the contract with user code.

I think people are aware of this, but: It depends on what the resource is, right? If the resource is a mutex or equivalent then deadlocks or worse could definitely occur.

Also, keep in mind that some code in a destructor (Drop implementation`) might assume that if it is reached, then some previously-dropped item's destructor ran successfully--e.g. nulling a pointer. Thus, running a destructor after another destructor failed could result in all kinds of badness including potentially even UaF.

@nagisa
Copy link
Member

nagisa commented Aug 26, 2016

With MIR, a panic inside your destructor will not prevent other destructors in the scope from running anymore. Since MIR is the default and AST based translator is removed, this is technically fixed now.

@eddyb eddyb added E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. and removed E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. I-needs-decision Issue: In need of a decision. labels Aug 26, 2016
cristicbz added a commit to cristicbz/rust that referenced this issue Aug 30, 2016
sophiajt pushed a commit to sophiajt/rust that referenced this issue Aug 30, 2016
Add test for rust-lang#14875

You can check this out in the playground https://is.gd/oVKC2T . It will fail on stable, but pass on nightly as @nagisa suggested on the issue.

Fixes rust-lang#14875
sophiajt pushed a commit to sophiajt/rust that referenced this issue Aug 30, 2016
Add test for rust-lang#14875

You can check this out in the playground https://is.gd/oVKC2T . It will fail on stable, but pass on nightly as @nagisa suggested on the issue.

Fixes rust-lang#14875
sophiajt pushed a commit to sophiajt/rust that referenced this issue Aug 31, 2016
Add test for rust-lang#14875

You can check this out in the playground https://is.gd/oVKC2T . It will fail on stable, but pass on nightly as @nagisa suggested on the issue.

Fixes rust-lang#14875
sophiajt pushed a commit to sophiajt/rust that referenced this issue Aug 31, 2016
Add test for rust-lang#14875

You can check this out in the playground https://is.gd/oVKC2T . It will fail on stable, but pass on nightly as @nagisa suggested on the issue.

Fixes rust-lang#14875
@NXTangl
Copy link

NXTangl commented Oct 10, 2016

Just want to add, for the record, that allowing failing destructors to leak other destructors can and will cause nasty use-after-free bugs for scoped threads, since they rely on finalization code to join the child threads before the borrowed stack frame expires.

@nagisa
Copy link
Member

nagisa commented Oct 10, 2016

@NXTangl it was been agreed quite a while ago that relying on destructors to run for safety is not safe. There’s a reason mem::forget() is safe and safe mem::forget() is in turn a reason to not rely on destructors for safety.

@SimonSapin
Copy link
Contributor

This is also why destructor-based scoped threads were remove from std: #24292. Crossbeam now uses closures to make sure to join threads after the closure returns.

@NXTangl
Copy link

NXTangl commented Oct 28, 2016

Yeah, I know that...but, AFAIK, we rely on being able to use the destructors of stack objects for finalization, no? What happens if the closure throws an exception? Or am I completely misunderstanding how it works?

@sfackler
Copy link
Member

Destructors are run when unwinding from a panic.

@NXTangl
Copy link

NXTangl commented Oct 29, 2016

Which is my point. This says that current behavior means that panicking in a destructor during a panic can cause those destructors to be leaked, thereby causing UB in the child threads.

@jdm
Copy link
Contributor

jdm commented Oct 29, 2016

Leaking isn't undefined behaviour; it may yield unexpected behaviour, which is (I believe) what you mean.

@gnzlbg
Copy link
Contributor

gnzlbg commented May 7, 2019

If Drop::drop panics one has to make sure the value is leaked. Ootherwise some fields might be double-dropped when the destructor is invoked a second time.

Is this behavior documented anywhere? I've only found this issue, no RFC, internals / mailing list post, no nothing. I've filled #60611 for tracking documenting this, so if someone has any sources about the current drop behavior it would be nice to post them there.

lnicola pushed a commit to lnicola/rust that referenced this issue Jun 19, 2023
…do-not-transform-lifetimes, r=Veykril

fix: implemeted lifetime transformation fot assits

A part of rust-lang/rust-analyzer#13363
I expect to implement transformation of const params in a separate PR

Other assists and a completion affected:
- `generate_function` currently just ignores lifetimes and, consequently, is not affected
- `inline_call` and `replace_derive_with...` don't seem to need lifetime transformation
- `trait_impl` (a completion) is fixed and tested
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-destructors Area: destructors (Drop, ..) A-mir Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. P-low Low priority
Projects
None yet
Development

No branches or pull requests