Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for Cell::update #50186

Open
ghost opened this issue Apr 23, 2018 · 67 comments
Open

Tracking issue for Cell::update #50186

ghost opened this issue Apr 23, 2018 · 67 comments
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Small Libs issues that are considered "small" or self-contained Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@ghost
Copy link

ghost commented Apr 23, 2018

This issue tracks the cell_update feature.

The feature adds Cell::update, which allows one to more easily modify the inner value.
For example, instead of c.set(c.get() + 1) now you can just do c.update(|x| x + 1):

let c = Cell::new(5);
let new = c.update(|x| x + 1);

assert_eq!(new, 6);
assert_eq!(c.get(), 6);
@ghost ghost mentioned this issue Apr 23, 2018
@sfackler sfackler added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Apr 24, 2018
@oli-obk
Copy link
Contributor

oli-obk commented Apr 24, 2018

Not a blocking issue for merging this as an unstable feature, so I'm registering my concern here:

The update function returning its new value feels very ++i-ish. Similar to += returning the result of the operation (which rust luckily does not!). I don't think we should be doing this. Options I see:

  • return () (that was the suggestion in the PR)
  • make the closure argument a &mut T, and allow the closure to return any value.
    • if users want the "return new/old value" behaviour, they can now encode it themselves via |x| {*x += 1; *y }

Maybe there are other options?

Data point: the volatile crate's update method returns () (https://github.com/embed-rs/volatile/blob/master/src/lib.rs#L134)

@ghost
Copy link
Author

ghost commented Apr 24, 2018

@oli-obk

I like the second option (taking a &mut T in the closure). There's an interesting variation on this option - instead of requiring T: Copy we could require T: Default to support updates like this one:

let c = Cell::new(Vec::new());
c.update(|v| v.push("foo"));

Perhaps we could have two methods with different names that clearly reflect how they internally work?

fn get_update<F>(&self, f: F) where T: Copy, F: FnMut(&mut T);
fn take_update<F>(&self, f: F) where T: Default, F: FnMut(&mut T);

@ghost ghost mentioned this issue May 2, 2018
@ghost
Copy link
Author

ghost commented May 2, 2018

What do you think, @SimonSapin? Would two such methods, get_update and take_update, be a more appropriate interface?

@SimonSapin
Copy link
Contributor

The T: Default flavor feels kinda awkward but I don’t know how to put that in more concrete terms, sorry :/

It looks like there is a number of slightly different possible APIs for this. I don’t have a strong opinion on which is (or are) preferable.

bors added a commit that referenced this issue Jul 23, 2018
Implement rfc 1789: Conversions from `&mut T` to `&Cell<T>`

I'm surprised that RFC 1789 has not been implemented for several months. Tracking issue: #43038

Please note: when I was writing tests for `&Cell<[i32]>`, I found it is not easy to get the length of the contained slice. So I designed a `get_with` method which might be useful for similar cases. This method is not designed in the RFC, and it certainly needs to be reviewed by core team. I think it has some connections with `Cell::update` #50186 , which is also in design phase.
@CodeSandwich
Copy link

CodeSandwich commented Aug 6, 2018

Taking &mut in this method would be awesome, I would love to use it.

Unfortunately it seems that it would have to be unsafe, because AFAIK there is no way in Rust to restrict usage like this:

let c = Cell::new(vec![123]);
c.update(|vec| {
    let x = &mut vec[0];            // x references 123
    c.update(|vec| vec.clear());    // x references uninitialised memory
    *x = 456;                       // BOOM! undefined behavior
})

I hope, I'm wrong.

@SimonSapin
Copy link
Contributor

@CodeSandwich it is a fundamental restriction of Cell<T> that a reference &T or &mut T (or anything else inside of T) must not be given to the value inside the cell. It’s what makes it sound. (This is the difference with RefCell.)

We could make an API that takes a closure that takes &mut T, but that reference would not be to the inside of the cell. It would be to value that has been copied (with T: Copy) or moved (and temporarily replaced with some other value, with T: Default) onto the stack and will be copied/moved back when the closure returns. So code like your example would be arguably buggy but still sound: the inner update() would operate on a temporary value that would be overwritten at the end of the outer one.

@CodeSandwich
Copy link

CodeSandwich commented Aug 13, 2018

I think, that a modifier method might be safe if we just forbid usage of reference to Cell's self inside. Rust does not have mechanism for forbidding usage of a SPECIFIC reference, but it does have mechanism for forbidding usage of ALL references: 'static closures.

I've created a simple implementation of such method and I couldn't find any hole in it. It's a limited solution, but it's enough for many use cases including simple number and collections modifications.

@SimonSapin
Copy link
Contributor

A 'static closure could still reach the Cell, for example by owning a Rc.

@CodeSandwich
Copy link

That's absolutely right :(

@XAMPPRocky XAMPPRocky added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Oct 2, 2018
@earthengine
Copy link

earthengine commented Oct 3, 2018

It seems like, the semantic are a bit different between Default and Copy: the later is more light weighted, as it does not have to fill the hole behind with some temporary rubbish.

What shall we name those variants? update_move for the Default case?

Also there is another variation: update_unsafe which will be unsafe and does not require a trait bound: it only moves the value and leave uninitialized value behind, before the method returns. This will make any nested calls UB and so have to be marked as unsafe.

pub unsafe fn update_unsafe<F>(&self, f: F) -> T where F: FnOnce(T) -> T;

@SimonSapin
Copy link
Contributor

if you’re gonna use unsafe anyway you can use .get() and manipulate the raw pointer. I don’t think we should facilitate it more than that.

@yuriks
Copy link
Contributor

yuriks commented Oct 15, 2018

IMO, closure receiving a &mut to a stack copy isn't that advantageous ergonomically in terms of offering flexibility. You still need to jump through hoops to get the old value if you want to. Consider:

let id = next_id.update(|x| { let y = *x; *x += 1; y });

I think the most universally applicable thing to do would be to offer all of update, get_and_update and update_and_get:

things.update(|v| v.push(4)); // returns ()
// Alternative with v by-move. I'd guess this one isn't as useful, and you can easily use
// mem::swap or mem::replace in the by-ref version instead if you need it.
things.update(|v| { v.push(4); v });
let id = next_id.get_and_update(|x| x + 1);
let now = current_time.update_and_get(|x| x + 50);

These are 3 additional names, but I think they'll make the function more useful. A by-mut-ref update would only be useful for complex types where you're likely to do in-place updates, but I find myself doing the .get() -> update/keep value -> .set() pattern a lot with Cell.

@peterjoel
Copy link
Contributor

Related conversation on a similar PR (for RefCell): #38741

@matthew-mcallister
Copy link
Contributor

matthew-mcallister commented Jan 6, 2020

Has it been considered to use an arbitrary return type on top of updating the value? That is (naming aside),

fn update1<F>(&self, f: F) where F: FnOnce(T) -> T;
fn update2<F, R>(&self, f: F) -> R where F: FnOnce(T) -> (T, R);

The second method has a very functional feel to it and appears both flexible and sound (i.e. you can't return &T). You could get either ++i or i++ behavior out of it. Also it is the same whether T: Copy or not. Though ergonomics are of course debatable.

EDIT: Something to chew on:

fn dup<T: Copy>(x: T) -> (T, T) { (x, x) }
let i = i.update2(|i| (i + 1, i)); // i++
let i = i.update2(|i| dup(i + 1)); // ++i

@Lucretiel
Copy link
Contributor

Lucretiel commented Jan 17, 2020

+1 for a T version, as opposed to &mut T. It's easy to construct demo code that is more expressive with either version (ie, x + 1 vs vec.push(..)), but unless we add both, I think the more general version is preferable, since under the hood they both require a swap-out swap-in for soundness. Also +1 for an extra version with a return value.

Finally, I think Default is preferable to Copy. My gut is that it covers a wider range of cases; that is, it's more likely for a type to implement Default but not Copy than the other way around, and hopefully for simple cases the compiler should be able to optimize out the extra writes anyway.

@camsteffen
Copy link
Contributor

It sounds like accepting &mut T isn't a viable option.

I'm sort of combining previous ideas here. How about this?

fn update<F>(&self, f: F) where T: Copy, F: FnOnce(T) -> T;
fn update_map<F, U>(&self, f: F) -> U where T: Copy, F: FnOnce(T) -> (T, U);
fn take_update<F>(&self, f: F) where T: Default, F: FnOnce(T) -> T;
fn take_update_map<F, U>(&self, f: F) -> U where T: Default, F: FnOnce(T) -> (T, U);

@m-ou-se
Copy link
Member

m-ou-se commented Feb 29, 2020

A completely different option could be to not take a function as argument, but return a 'smart pointer' object that contains the taken/copied value and writes it back to the cell when it is destructed.

Then you could write

    a.take_update().push(1);

    *b.copy_update() += 1;

[Demo implementation]

@fogti
Copy link
Contributor

fogti commented Mar 2, 2020

@m-ou-se I think that would be unsafe suprising because multiple instances of the 'smart pointer' / guard could exist simultaneously.

        let mut cu1 = b.copy_update();
        let mut cu2 = b.copy_update();
        *cu1 += 1;
        *cu2 += 1;
        *cu1 += *cu2;

https://play.rust-lang.org/?version=beta&mode=debug&edition=2018&gist=f3fc67425f1b7f1dba1220fb961ad7d9

@Lucretiel
Copy link
Contributor

Lucretiel commented Mar 2, 2020

The proposed implementation appears to move the value out of the cell and into the smart pointer; pointer in this case is a misnomer. This will lead to the old c++ auto_ptr problem: not unsound, but surprising in nontrivial use cases (ie, if two exist at the same time, one will receive a garbage value)

@m-ou-se
Copy link
Member

m-ou-se commented Mar 2, 2020

None of the proposed solutions (with a T -> T lambda, with a &mut T -> () lambda, or with a 'smart pointer'/RaiiGuard/younameit, etc.) are unsafe. All of them can be implemented using only safe code.

All of them have this 'update while updating' problem. That's not inherent to a specific solution, but inherent to Cell. There's just no way to 'lock' access to a Cell. That's the entire point of a Cell. Rust has a built-in solution for this problem in general: only allowing mutation through a provably uniquely borrowed &mut. Cell is the way to explicitly disable that feature.

With the current (unstable) update() function: [Demo]

    let a = Cell::new(0);
    a.update(|x| {
        a.update(|x| x + 1);
        x + 1
    });
    assert_eq!(a.get(), 1); // not 2

With a &mut:

    let a = Cell::new(0);
    a.update(|x| {
        a.update(|x| *x += 1);
        *x += 1;
    });
    assert_eq!(a.get(), 1); // not 2

With a copy-containing 'smart pointer':

    let a = Cell::new(0);
    {
        let mut x = a.copy_update();
        *a.copy_update() += 1;
        *x += 1;
    }
    assert_eq!(a.get(), 1); // not 2

There is no way to avoid this problem. Every possible solution we come up with will have this problem. All we can do is try to make it easier/less verbose to write the commonly used .get() + .set() (or .take() + .set()) combination.

@psionic12
Copy link

psionic12 commented May 2, 2022

Currently, the implementation use Cell::get() to get the original value, which makes a copy, is it reasonable to use MaybeUninit to replace the old value, since a new value will be set back after all?

@danielhenrymantilla
Copy link
Contributor

@psionic12 perf-wise, a mem::replace cannot be less expensive than a Copy (the original copy is still there; and there is potentially a second write unless elided out). API-wise, remember that .set_with() can be re-entrant, so the nested call still needs to see a well-formed value 😬. So, a MaybeUninit-based approach, on top of requiring a change to all Cells so that they wrap a MaybeUninit for soundness, would lead to a problematic situation (need to panic?) when facing re-entrancy like this, which seems quite at odds with Cell's design philosophy 🙂

@psionic12
Copy link

psionic12 commented May 2, 2022

@danielhenrymantilla I'm sorry I forgot the "re-entrant" thing, if someone somehow needs to access the cell again in f(), bad things happens :(, thanks for pointing out. Design a stable API is really harder than I think.

@matklad
Copy link
Member

matklad commented May 20, 2023

To add one more color:

fn swap_with<F>(&self, f: F) where T: Default, F: FnMut(T) -> T;

"swap" in the name gives intuition about what happens under the hood

@pierzchalski
Copy link
Contributor

So in the long term, we'd like to have Cell::<T: Copy | Default>::update(&Self, impl FnOnce(T) -> T) - people like that API. But, that's fiddly to do even with current unstable specialization.

Assuming we want to reserve update for the "nice" version, does that mean we can basically settle for these for now?

impl Cell<T: Copy> {
    // Unresolved: return T or not?
    // Unresolved: return new T or old T?
    pub fn get_update(&self, impl FnOnce(T) -> T) -> T { ... }
}

impl Cell<T: Default> {
    pub fn take_update(&self, impl FnOnce(T) -> T) { ... }
}

tgross35 added a commit to tgross35/rust that referenced this issue Apr 8, 2024
Allow updating variables outside of the scope, such as to extract the old
value. This will be more consistent with `Atomic*::fetch_update`.

Tracking issue: rust-lang#50186
@Ddystopia
Copy link
Contributor

What do you think about swap_update?

cell.swap_update(&cell, |c| c += 1)

Method will temporary swap 2 cells, execute callback and swap them again?

@Amanieu
Copy link
Member

Amanieu commented Jun 18, 2024

We discussed this in the @rust-lang/libs-api meeting today and decided to stabilize update as-is (only for T: Copy).

The main reason is that, fundamentally, update is a convenience method meant to help with common use cases. The behavior of take_update is subtle, especially when the closure panics, and it is better for users who desire this behavior to explicitly write it out with .set and .take.

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Jun 18, 2024

Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Jun 18, 2024
@rfcbot
Copy link

rfcbot commented Jun 20, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Jun 20, 2024
@jdahlstrom
Copy link

jdahlstrom commented Jun 27, 2024

Hmm. I'm somewhat perplexed that this issue is six years old, now in stabilization FCP, and despite a lot of bikeshedding about the name, AFAICS only one comment (which seems to have been missed/ignored) has raised the issue that "update" (and all the other proposed names in fact) is inconsistent with several other APIs, including Cell::replace and general Rust API guidelines which recommend foo paired with foo_with:

Type Takes value Takes function
Cell replace (update proposed)
RefCell replace replace_with
std::mem1 replace (replace_with crate)
Option replace (none)

In addition, all of these return the old value except update, which returns the new one. This is also the case with the Atomic*::fetch_update methods which, besides panic::update_hook, are the only methods in std using the "update" term.

If this inconsistency is desired or necessary in some way, I think it should be clearly justified before stabilization.

Footnotes

  1. As well as std::ptr::replace and <T*>::replace

@BurntSushi
Copy link
Member

@jdahlstrom I do think the naming situation here is a little odd. The fact that replace takes a value but update takes a closure is a little weird at first glance.

I think the case in favor of update is that a non-closure update method probably doesn't make sense to add. Like, the whole point of update is that you get the old value in a closure and the new value out. It saves you from the Cell::get and Cell::set dance. You don't really get saved from that dance without a closure. This is fundamentally different than replace where you get the old value back. For example, a Cell::replace without a closure makes sense in some cases, because you might just want to unconditionally set a new value without caring about what the old value is before setting, but get the old value back out after setting it.

So after thinking through this, while update and replace seem superficially similar, they have different usage patterns.

@jdahlstrom
Copy link

@BurntSushi To be clear, I find that the "update with a closure" pattern is entirely reasonable in itself; what I’m concerned about is

  1. the naming incongruence with RefCell::replace_with and the general convention that if there are two variants of a function, x taking a value and another taking a closure to compute a value, the latter should be named x_with, and
  2. the semantical incongruence with Cell::update returning the new value and RefCell::replace_with as well as other replace-type functions returning the old value.

If (2) is desired because it permits some Cell-specific pattern then (1) is defensible because then naming the method replace_with could be confusing and bug-prone. But I feel that in that case there should be some discussion in the docs about the differences and why Cell has an update rather than a replace_with.

@BurntSushi
Copy link
Member

I think we're saying the same thing.

Improving the docs sounds good to me. I'm not quite sure what the best phrasing would be, but I don't think this particular doc update should block stabilization.

@camsteffen
Copy link
Contributor

Just playing devil's advocate if I may. I don't find myself having a strong opinion on the name.

I think the case in favor of update is that a non-closure update method probably doesn't make sense to add.

That is assuming that there has to be some update method, and that having one is more important than consistency with other APIs.

So after thinking through this, while update and replace seem superficially similar, they have different usage patterns.

This feels like circular reasoning to me - "it's different from replace because it's different". To me it seems like Cell::update and RefCell::replace_with satisfy the same use case in principle - generate a new value from the old value. Replace functions typically return the old value out of necessity because otherwise you would be forcing a clone. Of course that doesn't apply here since we have a Copy type, but consistency is still worth considering.

If it seems viable to return either old or new value, but neither are really needed since it's Copy, perhaps it's better to not return anything.

@BurntSushi
Copy link
Member

BurntSushi commented Jun 28, 2024

"it's different from replace because it's different"

That's not what I said though. A better summary of what I said is "the name is different because the usage pattern of the routine is different." That's not circular reasoning. Like, we don't have a rule in std that every single method that takes a closure has to end with the suffix _with. It's just what we usually use to disambiguate it with routines that take a value directly instead of a closure. But if taking a value directly isn't a sensible operation on its own, I don't see a problem with dropping the _with suffix.

The main issue here, from what I can tell, is the similarity between update and replace_with. But my point is that while they are superficially similar, for Cell (and Copy types specifically), the usage pattern is quite different. Different to the point that a Cell::update that doesn't take a closure probably doesn't make sense.

That is assuming that there has to be some update method, and that having one is more important than consistency with other APIs.

Sure... If we don't add update then there's no point in debating its name.

@danielhenrymantilla
Copy link
Contributor

danielhenrymantilla commented Jun 28, 2024

If Cell::update() returns the old value

  • it is consistent with RefCell::replace_with();

  • but just to allow replacing:

    let old = cell.get();
    cell.set();
    if old … {

    with:

    let old = cell.update(|old| …);
    if old … {

    which isn't that much more useful than plain set-&-get.

If Cell::update() returns the new value:

  • It allows replacing:

    let old = cell.get(); // <- or inlined.
    let new = …;
    cell.set(new);
    if new … {

    with:

    let new = cell.update(|old| …);
    if new … {

    which, in the non-inlined cell.get() case, starts having maybe enough value.

  • But it is inconsistent with RefCell::update_with()!

    I kind of agree that, whilst not a deal breaker thanks to the different name, it's kind of unfortunate to have this inconsistency-ish thing, just for the sake of a tiny convenience around returning the new value. It might not be worth doing.

If Cell::update() does not return anything

Then we lose some convenience, but we retain the main counter.update(|n| n + 1) aspect of the API.


Conclusion

Since the "return-old" is barely more convenient than explicit .set() and .get(), and "return-new" is slightly inconsistent with other "similar" APIs, returning nothing may be the most sensible for the time being.


Aside: a possibility to punt on the decision

Click to see

would be to have the API of Cell::update() return (), but with the specific choice of () being opaque:

impl<T: Copy> Cell<T> {
	fn update<'t>(&self, f: impl FnOnce(T) -> T) -> impl 't + Sized
	where
	    T: 't,
	{
	    self.set(f(self.get()));
	}
}

That way a future release could always decide to return a copy of T rather than ().

@peterjoel
Copy link
Contributor

peterjoel commented Jun 28, 2024 via email

@camsteffen
Copy link
Contributor

camsteffen commented Jun 28, 2024

That is assuming that there has to be some update method, and that having one is more important than consistency with other APIs.

Sure... If we don't add update then there's no point in debating its name.

Sorry, what I mean to say is "...assuming that there needs to be a method named 'update'".

I think the baseline here is that we want a closure-taking method. The name and the return value are the "negotiables".

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Jun 30, 2024
@rfcbot
Copy link

rfcbot commented Jun 30, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Jul 4, 2024
@Pantamis
Copy link

Pantamis commented Aug 8, 2024

I also think having such method only for copy types makes sense.

For the T : Default case I wonder if it would be best just adding a method which could be named take_inspect and calls a function only on a shared reference to a temporary moved out value from the Cell:

impl<T: Default> Cell<T> {
    /// Takes the contained value to call a function 
    /// on a reference and then moves it back in the cell.
    /// 
    /// The cell will contain `Default::default()` 
    /// in the function scope or if it panics.
    /// 
    /// # Examples
    ///
    /// ```
    /// use std::cell::Cell;
    /// use std::sync::mpsc::channel;
    /// 
    /// let (tx, rx) = channel();
    ///
    /// let c = Cell::new(Some(tx));
    ///
    /// c.take_inspect(|t| t.as_ref().unwrap().send(true).unwrap());
    /// c.take_inspect(|t| t.as_ref().unwrap().send(c.take().is_some()).unwrap());
    /// 
    /// println!("Received from channel: {}", rx.recv().unwrap());
    /// assert_eq!(rx.recv().unwrap(), false);
    /// assert_eq!(c.take().is_some(), true);
    /// ```
    #[inline]
    pub fn take_inspect<F>(&self, f: F)
    where 
    F: FnOnce(&T),
    {
       let moved = self.take();
       f(&moved);
       self.set(moved);
   }
}

Although less useful I still find nice to have a one line method to do something on a reference to the value inside a cell (which Cell prevents in general). Nested calls can still be confusing but at least not UB as in case of acting on a exclusive reference. A shared ref is already very useful when it holds a handler (to a connection or thread pool) that can never be cloned, needs only a shared reference to send items or look at cell state while still having the ability to change the contained value in some rare occasion without having to use a RefCell.

Is there a RFC proposing this idea already ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Small Libs issues that are considered "small" or self-contained Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests