-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Stabilize catch_panic #1236
RFC: Stabilize catch_panic #1236
Conversation
I'm on board with this, but Maybe |
I personally feel that it's more appropriate in That being said I'm not 100% comfortable with |
Big 👎 on removing the bounds. I think I've made my position clear numerous times on this one in other RFCs. The drawback (watering down Rust's exception handling story) is a big drawback. Stabilizing it with the bounds does not make it backwards incompatible to remove them later (or if it does, it's in a pretty subtle way). And yes, before you respond, I'm well aware that Rust already has exception unsafety. But it only really has to be dealt with in destructors and unsafe code right now (plus some misuses of global state). The current variant with the (On an unrelated note, I do wish that thread local sharing could be mitigated as well, but I don't think it can be done backwards compatibly). Finally, I'd add that stabilizing this has very serious consequences around allowing panics to turn into aborts (which I definitely want). This makes it far too easy to write code that visibly changes behavior under the two options. If the idea is that by accepting this RFC in any form we would be closing off that ability, I can't in good conscience say the RFC should be accepted at all. (Please don't respond to me that you can already do this with unsafe code--I know that. Virtually nobody does it in practice). Edit: I see that the abort thing was specifically addressed in the RFC. But I don't think I found the way it was phrased ("people will know that we plan to do this eventually and therefore won't misuse panics") too comforting. |
I very much want to see catch_panic stabilized. I see arguments to both sides about the bounds. What if we stabilize a bounded version and an unsafe, unbounded version? This does nothing to address the TLS subverting concerns, but could plausibly help in some cases. Of course, offering two functions is mental overhead (and requires a name for the unbounded & unsafe function). std::thread doesn't seem like the right place. I'd almost want to see an std::panic, but I'm not sure what else would belong there. |
Commenting as someone who has lots of production C++ experience (7.5 years Google, 2.5 years Sandstorm.io / Cap'n Proto) and would very much like to switch to Rust. Here's the real-world facts that need to be dealt with:
Therefore: We need a way to isolate failures just to the individual event and its dependents, without harming unrelated events. We can do this with catch_panic today, but it will force every event callback to act like it's in a separate thread, using redundant synchronization when accessing shared data structures. The synchronization is redundant if the event loop is single-threaded or otherwise guarantees that events targeting the same object will be handled in only one thread at a time. Normally this is one of the big advantages of "actor model" programming: that events are effectively atomic and need not synchronize. Forcing synchronization not only forces redundant boilerplate on programmers but also hurts performance, as those atomic ops aren't cheap even when not contended. So for Sandstorm and Cap'n Proto, we really need this -- or some other as-yet unknown solution to the above problems. |
My thoughts on this matter: First and foremost, the bounds on catch_panic are about stronger exception safety than memory safety. All rust programs must already be sufficiently exception safe that the inclusion of an "anything goes" catch_panic is trivially memory safe, because you can arbitrarily emulate the "observing bad state" consequences of catch_panic with destructors further down the stack. Therefore no matter what we do catch_panic should never be Second, my current model of Rust's error handling story is that everyone should endeavour to never panic if correctly written, but also be ready for everyone else to panic (modulo unsafe code which may need to assume certain operations don't panic). From this perspective, catch_panic simply makes this story less painful. It gives you a new tool to be ready for panics with, without having to spawn a whole OS thread. Once you've removed the OS thread, I do not want to see a scheme where we encourage people to unsafely transmute their types to be 'static just to handle panics properly. I do not believe catch_panic will make panics into a proper try/catch error-handling scheme for several reasons:
|
Big 👍 on removing the bounds.
|
FWIW, C++'s behavior of turning double/destructor-exceptions into aborts is a huge thorn in the side of fault-tolerant programming. If a destructor panics during unwind, the right thing to do (in the name of fault tolerance) is merge the two panics into one panic and keep unwinding. C++ can't really do that because its exceptions are typed and intended to be caught by type, making it unclear how to "merge" them. Rust panics are not typed and not intended to be machine-decoded at all, so you can define whatever arbitrary merge function you want, e.g. "just concatenate the description strings". I think having a compiler flag that turns panics into aborts is a great idea, though. It would arguably make sense to use such a compiler flag for batch jobs (e.g. MapReduce) but not use it for interactive servers. |
And we've already had issues with The reason for the existing bounds was that anything you can see through |
It would be helpful to me to at least copy/paste a few of your thoughts here, and perhaps even word them in the context of this RFC. Some discussion at lunch today made me realize that the wording around the rationale here was a little light, so I've pushed an update which expands a bit on why I believe the bounds should be removed.
As stated in the RFC, however, this isn't actually quite true. Broken invariants can be exposed through types like
Can you elaborate a bit on why you're worried here? I also very much want to have a panic-to-abort option to the compiler, and I'm under the impression that this RFC does not affect this design much. The only consequence I can think of is that
This is certainly a possibility, but I've tried to articulate in the RFC why I believe
@gankro explained this as well, but because exception safety doesn't directly cause memory safety without otherwise having unsafe blocks it goes against the design principles of the standard library to mark this function unsafe. Unsafe functions in the standard library are only annotated when they can directly cause memory unsafety in otherwise safe situations, and this isn't quite the case for
There's actually some possibility for perhaps filling out
That being said we could always put it in
Thanks for weighing in! It's definitely helpful to get more use cases for either having or removing the bounds.
Excellent point! I forgot to include this in the text originally but I've now pushed an update which mentions this. |
Catchable exceptions requires coding transactionally to preserve strong exception safety. Graydon talked about this quite a bit about this as well when he was maintaining Rust. I do not believe programmers are capable of coding transactionally in any useful way, at least not consistently. I also believe that adding ("with a caveat around thread locals") doesn't materially damage this proposition for normal programming purposes, because of the ways people tend to use thread locals.
I've been trying to forestall responses like these (but apparently not well enough). Suffice it to say that I'm well aware of the precise guarantees Rust provides and I think they are largely sufficient for everyday programming. I'm not convinced that's the case with an unbounded
The problem is that if people do start using
I don't see any new text. |
As noted in the RFC, it's important to consider this in the perspective of idiomatic Rust. One of the key parts of exception safety is observing a broken invariant, and all the methods of doing so today aren't necessarily idiomatic in everyday code. The RFC also explains how it's unlikely that
Gah, sorry! Now updated. |
I somewhat agree with point (1), but I don't think conventions this early in a programming language's life will necessarily have that much importance going forward. I disagree with (2) because I think many Rust programmers simply aren't aware of this. If
Like I said, I don't think the thread pool scenario having to use unsafe code is a big deal; I expect fundamental libraries like that to have to use a certain amount of unsafe. The argument against |
Correct me if I'm wrong, but I think you're making the oft-repeated assertion that exceptions may leave shared data structures in an inconsistent state, that it's hard to make sure that doesn't happen, and that it's dangerous to continue under such circumstances. This sounds logical in theory, but doesn't match with my real-world experience. In practice, I've found that the vast majority of the time, this just isn't actually a problem. The vast majority of actual exceptions occurring in our C++ system (which liberally throws exceptions in any unexpected circumstances) have no lasting harm other than erroring the request in which they were thrown. Off the top of my head I remember only one case where servers were left in a bad state after a certain kind of exception and manual cleanup was required -- and the bad state was in the kernel, so even restarting the process wouldn't have helped. But even if it would have helped, it would absolutely not be worth it for us to throw away the fault tolerance benefits we get from our exception strategy in order to avoid one-a-year persistent bad state bugs. On the contrary, having more common aborts -- or trying to sweep problems under the rug to avoid aborting -- would make production problems far more common for us. |
@kentonv At this risk of turning this into he-said-she-said, I have to disagree with you here. Web servers are only one class of software, and one that Rust is not even specifically aimed at, and you're clearly talking about a server / daemon of some sort, because in other programs dissecting things into a single request / response wouldn't really work. Either way, I don't see how preserving the |
@pythonesque Yes, my experience is focused on distributed systems serving live traffic (but not just web servers -- also their back-ends, etc.). I don't mean to claim that my requirements are universal. If Rust doesn't want to cover this use case, that's fair -- but sad for me. That said, it does seem to me that the arguments I'm making could also apply to interactive client apps too. User clicks on button. App fires off some processing in response. Something goes wrong. It's perhaps nicer to the user to cancel the one task initiated by the button press rather than take down the whole program and lose all their state. But yeah, I don't have much experience with such apps.
Well... In C++, Cap'n Proto's event framework is based on promises. When you arrange for some event callback to be called later and return a value of type I'm not sure to what extent Rust's lifetime checking is expressive enough to support that. The alternative is to fall back to refcounting everything, in which case I guess to be honest I don't understand what the |
The FWIW, atomic operations don't hurt much if they're uncontended--on x86, a relaxed operation is pretty much free except that it doesn't allow coalescing or reordering multiple operations on the same memory address, and unordered (which is "atomic enough" for Rust) is the Java default. There could be an OIBIT for poisoning that allowed you to forego atomic operations altogether, but people seem to be against a proliferation of OIBITs. What about a safe OIBIT I would say this: if the bounds are going to change, I would prefer the bounds to change and then stabilization, rather than doing both at once. Right now we have no experience with how |
We only had an issue with Rc, and even that issue was ridiculous. As in patching it was basically a political move, and had little technical merit. No reasonable program would ever run into this behaviour. RefCell never had an issue (usize::MAX is the value for borrow_mut, so the whole thing just locks up until you drop one of your Refs). Our mistake was in trying to say that mem::forget was unsafe, which meant we didn't think through the consequences of destructor leaks in our designs. This lead to issues in thread::scoped and Vec::drain but these were known when we opted to make mem::forget safe. By marking unbounded catch_panic safe, I like to think we'll avoid getting similarly reckless. Marking it unsafe will make us think "oh no this is unsafe so I don't need to care about it". |
Seems like FFI code could just use an unrestricted unsafe |
It's important to remember though (as I'm sure you do) that I'd be hesitant to add a one-off OIBIT which isn't as core to the language as |
👍 for keeping the The RFC states that exception-unsafe code can't be memory unsafe. But that's assuming that the unsafe code that the program is using indirectly is very well written and carefully auditted, like the stdlib is. I expect very few libraries that contain unsafe code to be exception safe. In addition to this, the Rust language and stdlib have had several decisions that aim to reduce the number of logical mistakes that the programmer could make. For example variables being immutable by default, the hash map using the In all these examples the default way is the "proper way", and you have to explicitely opt-in to do something potentially erroneous. Even if the potentially erroneous code is still memory safe. I think that this principle ("the proper way by default") should be applied to exception safety as well. Forbid the user from observing any poisonned state, unless they explicitely opt-in to see it. |
@tomaka I agree that I expect some wild exception-unsafety, but I imagine it's of the vanilla variety (as in, it's not safe even without catch-panic). Also I believe this RFC doesn't change the proper default, which is just "don't catch panics". This is largely just needed for FFI and high-reliability systems. |
I'm definitely going to use 1 - In an HTTP server to return a 505 error to the user and send an email to the webmaster in case of panic. 2 - In my game to actually show an error to the user. Games are usually not run through the terminal, so in case of panic you'd just see the window magically disappear. I can't use Of course that's really subjective, but I don't see |
IMO panics should never be used as a form of exception throwing. If you're doing so, you're doing it wrong. We should not encourage library writers to use panic (which is what this proposal indirectly does by providing stabilizing a way to catch these panics). The original reasoning behind renaming Failable (and also recoverable) methods should return |
I strongly believe that if this should be stabilized it should be marked with unsafe and/or intrinsics. As @graydon said |
@ticki Take a look at the motivation section of the RFC. If something like
We are far past that point. Any library that uses, e.g., |
@BurntSushi I know the major selling point of this is FFI, and, sure, that's a problem for languages having exceptions etc., but it should be avoided in pure Rust code, hence giving it a unsafe would be ideal. |
Today, If |
What? |
@BurntSushi Type violation (invalid enums, for example) are (in most cases) not memory unsafe, but still illegal.
True. Indexing is panicable, because indexing is in most cases unrecoverable. Handling OoB is something that often leads to bugs in C++ and Java. But encouraging people to use panics while still being able to recover them would mean that they'd be abused for error handling, which is out of the scope of panics (hence the name). There's a big difference between using panics as abnormal, unrecoverable failure and as fixable exceptional cases. |
"unsafe" is not synonymous with "illegal."
See the RFC with
Of course they are. If I'm running a web server and a bug is triggered during a request that causes a panic, then it sure would be nice to not bring down the entire web server. This use case is part of the motivation in this RFC, albeit with "thread pools." |
Okay, fair enough, let me rephrase: I think that catching panics outside |
In most cases, but not in all. As an illustrative, but unrealistic example, imagine there was a way to decide, given a pointer, whether that pointer is still valid at the given type - then we could decide that pointer validity is not part of the contract that safe code relies on. Converting an integer to a shared borrow would be safe. And every pointer access would have to do the validity check to prevent crashes. As a more realistic example, we could decide for invalid enum discriminants to be safe. However, that would mean that safe code cannot rely on enum discriminants being valid, and hence every single match statement has to check for that case. That's not the decision that has been made; match statements in safe code assume that the enum discriminant is valid. As a consequence, code that could leave behind invalid enum discriminants must be marked The question that is discussed here is different. There is no property (that we know of) that safe code could rely on, that holds unless Note that the |
That's just another way of saying, "I think we should change I'm not sure why exactly you're stuck on this. Your concern is that people will abuse this for standard error handling. I think almost everyone has that same concern. We are trying to address that concern with |
I'm not exactly sure how you'd define PanicSafes (Is it a attribute? What restrictions are there? Can I have "no panic" blocks?), but the idea is interesting. Is there a RFC for that thing? I'm not against panics, though. I do not think panics should be considered |
I've linked to it a few times already. It has been merged. #1323 |
@BurntSushi Cool. I must have missed it. |
Shall we also provide a safe mechanism to subvert trait coherence? After all, trait coherence is not directly implicated in memory safety. As expressed by @graydon and @aturon above, I believe the fundamental requirement for a safe/ |
I am pretty sure that in combination with associated types, trait coherence is needed for memory safety. I know this is the case in Haskell, where type families have to be coherent, otherwise the program can crash. I don't have the time right now to come up with an example, though. |
Good point. But even so. |
I'm of the school of thought that Rust has one way of handling recoverable errors-- There are two principal reasons why I'd prefer we not make it easier or more idiomatic to catch panics:
RAII plus destructors plus lexical-scope drop is just awesomely powerful to use so long as it's easy to determine when your destructors run. Rust is sooo close to giving you exactly what you expect that it's a shame there's this odd corner where what it can give you is rather complicated to reason about. FFI Counterargument I'm going to cheaply lean on the C defense here--C libraries have plenty of abort() type invariants in practice. To say they're widely used is obviously an understatement. Yes, unwinding is indeed terrible if you have non-rust frames in your stack. Abort instead! Responding to the "but you could recover.." or "if you're in e.g. a desktop app, don't punish the user.." Do you really know for sure nothing has gone wrong when you unwound? Maybe you'll try again and that transaction that didn't abort correctly will deadlock you. Basically... I'm not totally against catching panics because Rust is all about power and if you Know What You're Doing, by all means... If you're convinced it's the right call for your use case and you're pretty darn sure everything that needed to be undone was. Maybe Servo is one of those codebases, for example. Everyone hates browsers crashing, and the I imagine the Servo devs have a shot at getting early drop pervasively right. But:
|
I should admit that I'm sure my bias is in play here. I use rust for backend storage/database systems where a crash isn't a huge deal (the systems have to be highly redundant anyway) but incorrect behavior is a very big deal (customer data loss). And queuing/gradual degradation is a far more insidious problem than fast failures/restarts. I'm sure the world looks very different if you're using Rust for an application running on an end-user's computer or something. Or if you're hosting collections of somewhat unknown code in a kind of general application server and you can't always enforce the applications will follow a predefined set of rules. |
I agree with @jamwt, that it would be appropriate to make this unsafe. Unsafes are more than memory unsafety. They're a general barrier to avoid users doing things which they should only do if they know exactly what they are doing and why. |
@jamwt Catching panics is a necessary evil for ffi: It's UB to unwind across the boundary. A correct rust library exposing a rust function using the C ABI must catch panics at the boundary, otherwise unpredictable results happen. This is an area where Rust is not safe by default, instead you have to remember to catch panics, so in that sense we need to make it easy to do it correctly. |
One tiny perspective on this long discussion: Panics in libraries are making a policy decision; catching them is required in order to prevent them from being global policy decisions. Abort on panic means your third party dependencies get to decide when your application aborts, but third parties do not know your application and what constraints it's working with. There are exactly three choices: (1) libraries must not use panics (APIs will be polluted with Results used for bugs instead of error conditions), (2) third party libraries that panic may not be used at all, or (3) panics must be catchable. The point about policy decisions having transitivity is isomorphic to the situation where Rust is being used across an ffi; but ffi is not the only situation. Abort on panic is not the right behaviour for all applications. It's probably the wrong behaviour for most applications that aren't factored as compositions of small restartable processes; most applications are event-driven, whether servers or UIs. Tearing down anything beyond a simplistic single-threaded server is bad policy as previously discussed (one might consider tainting the process and killing it once all outstanding requests complete, but this still requires catching panics), and disappearing UI is a poor user experience - you'll want to best-effort save state and restore on restart, at a minimum. My experience does not match @jamwt - exceptions do work well in servers written in GC languages. In practice it's mostly a non-issue, because there are far fewer things that need careful coding when you have a rock-solid guarantee of memory safety. Without memory safety, every memory allocation is hazardous; and in a language like C++ with copy constructors, destructors etc. mere function calls are fraught with danger. But this is mostly a social property of lack of memory safety, not of exceptions. |
A buggy library can cause problems in other ways than panicking - e.g. it can infinite-loop, or allocate too much memory and get killed, or it can corrupt data structures it has access to. A "segmentation fault" abort is often not the desired thing to do with a GUI program. That is a slightly different issue than "catch_panic" - here you want to handle the panic, show a proper error message, and then abort the process. |
So an interesting thought here is, in the case of a Rust library and non-Rust main program, would it be a good best practice to export a symbol which is a wrapper around |
Stabilize
std::thread::catch_panic
after removing theSend
and'static
bounds from the closure parameter.Rendered