Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for binary heap retain #71503

Closed
arlopurcell opened this issue Apr 24, 2020 · 19 comments
Closed

Tracking Issue for binary heap retain #71503

arlopurcell opened this issue Apr 24, 2020 · 19 comments
Labels
A-collections Area: `std::collection` B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Small Libs issues that are considered "small" or self-contained Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@arlopurcell
Copy link

This is a tracking issue for BinaryHeap::retain.
The feature gate for the issue is #![feature(binary_heap_retain)].

Added in PR #71485

@arlopurcell arlopurcell added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Apr 24, 2020
@jonas-schievink jonas-schievink added A-collections Area: `std::collection` B-unstable Blocker: Implemented in the nightly compiler and unstable. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Apr 24, 2020
@KodrAus KodrAus added I-nominated Libs-Small Libs issues that are considered "small" or self-contained Libs-Tracked Libs issues that are tracked on the team's project board. labels Jul 29, 2020
@KodrAus
Copy link
Contributor

KodrAus commented Oct 30, 2020

We discussed this at the recent Libs meeting and had a question about taking &T rather than &mut T. In the end we thought keeping consistent with Vec and not having to think about resorting afterwards made the &T item type appropriate.

If anybody is keen to open a stabilization PR we can kick off a FCP!

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 22, 2021
Improve rebuilding behaviour of BinaryHeap::retain.

This changes `BinaryHeap::retain` such that it doesn't always fully rebuild the heap, but only rebuilds the parts for which that's necessary.

This makes use of the fact that retain gives out `&T`s and not `&mut T`s.

Retaining every element or removing only elements at the end results in no rebuilding at all. Retaining most elements results in only reordering the elements that got moved (those after the first removed element), using the same logic as was already used for `append`.

cc `@KodrAus` `@sfackler` - We briefly discussed this possibility in the meeting last week while we talked about stabilization of this function (rust-lang#71503).
bors added a commit to rust-lang-ci/rust that referenced this issue Apr 23, 2021
Improve rebuilding behaviour of BinaryHeap::retain.

This changes `BinaryHeap::retain` such that it doesn't always fully rebuild the heap, but only rebuilds the parts for which that's necessary.

This makes use of the fact that retain gives out `&T`s and not `&mut T`s.

Retaining every element or removing only elements at the end results in no rebuilding at all. Retaining most elements results in only reordering the elements that got moved (those after the first removed element), using the same logic as was already used for `append`.

cc `@KodrAus` `@sfackler` - We briefly discussed this possibility in the meeting last week while we talked about stabilization of this function (rust-lang#71503).
@JohnTitor
Copy link
Member

If anybody is keen to open a stabilization PR we can kick off a FCP!

@rust-lang/libs-api May I ask to kick off a FCP? Stabilizing it will unblock rust-lang/rust-clippy#9059.

@Amanieu
Copy link
Member

Amanieu commented Jul 21, 2022

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Jul 21, 2022

Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jul 21, 2022
@Amanieu
Copy link
Member

Amanieu commented Jul 21, 2022

I think we should change the API to either:

  • Use &mut for the closure argument (which is consistent with HashMap)
  • Or add retain_mut like Vec does.

I personally prefer the first option: retain_mut was only added to Vec because retain was already stabilized when we noticed the mistake.

@rfcbot concern &mut

@JohnTitor
Copy link
Member

I think we should change the API to either:

* Use `&mut` for the closure argument (which is consistent with `HashMap`)

* Or add `retain_mut` like `Vec` does.

I personally prefer the first option: retain_mut was only added to Vec because retain was already stabilized when we noticed the mistake.

Sounds good, should it be addressed in a stabilization PR or done before it?

@Amanieu
Copy link
Member

Amanieu commented Jul 21, 2022

This can be addressed in the stabilization PR but we do need to make a decision here.

@m-ou-se
Copy link
Member

m-ou-se commented Jul 21, 2022

I personally prefer the first option: retain_mut was only added to Vec because retain was already stabilized when we noticed the mistake.

Agreed.

@joshtriplett
Copy link
Member

Agreed: let's make it retain, and perhaps mention something in its documentation for people who might be looking for retain_mut.

@Amanieu
Copy link
Member

Amanieu commented Jul 21, 2022

It seems we have a consensus to make retain take a closure with &mut T.

@rfcbot resolve &mut

@ijackson
Copy link
Contributor

I don't think this is right.

As I understand it, a retain whose F takes &T can be implemented more efficiently since it can know that the ordering doesn't change. For example, if only a small number of elements are removed, a few log-time fixups would suffice.

A retain which gives out &mut T needs to assume the ordering might change, and would have to check it and be prepared to reinsert elements somewhere entirely else.

@ijackson
Copy link
Contributor

(This is unlike Vec which doesn't care about what the contents are like, so Vec::retain_mut is just the same as Vec::retain inside.)

zydou pushed a commit to zydou/arti that referenced this issue Oct 27, 2022
This is currently nightly-only and is blocked on an unresolved API
question:
  rust-lang/rust#71503
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 15, 2023
Leak amplification for peek_mut() to ensure BinaryHeap's invariant is always met

In the libs-api team's discussion around rust-lang#104210, some of the team had hesitations around exposing malformed BinaryHeaps of an element type whose Ord and Drop impls are trusted, and which does not contain interior mutability.

For example in the context of this kind of code:

```rust
use std::collections::BinaryHeap;
use std::ops::Range;
use std::slice;

fn main() {
    let slice = &mut ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'];
    let cut_points = BinaryHeap::from(vec![4, 2, 7]);
    println!("{:?}", chop(slice, cut_points));
}

// This is a souped up slice::split_at_mut to split in arbitrary many places.
//
// usize's Ord impl is trusted, so 1 single bounds check guarantees all those
// output slices are non-overlapping and in-bounds
fn chop<T>(slice: &mut [T], mut cut_points: BinaryHeap<usize>) -> Vec<&mut [T]> {
    let mut vec = Vec::with_capacity(cut_points.len() + 1);
    let max = match cut_points.pop() {
        Some(max) => max,
        None => {
            vec.push(slice);
            return vec;
        }
    };

    assert!(max <= slice.len());

    let len = slice.len();
    let ptr: *mut T = slice.as_mut_ptr();
    let get_unchecked_mut = unsafe {
        |range: Range<usize>| &mut *slice::from_raw_parts_mut(ptr.add(range.start), range.len())
    };

    vec.push(get_unchecked_mut(max..len));
    let mut end = max;
    while let Some(start) = cut_points.pop() {
        vec.push(get_unchecked_mut(start..end));
        end = start;
    }
    vec.push(get_unchecked_mut(0..end));
    vec
}
```

```console
[['7', '8', '9'], ['4', '5', '6'], ['2', '3'], ['0', '1']]
```

In the current BinaryHeap API, `peek_mut()` is the only thing that makes the above function unsound.

```rust
let slice = &mut ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'];
let mut cut_points = BinaryHeap::from(vec![4, 2, 7]);
{
    let mut max = cut_points.peek_mut().unwrap();
    *max = 0;
    std::mem::forget(max);
}
println!("{:?}", chop(slice, cut_points));
```

```console
[['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], [], ['2', '3'], ['0', '1']]
```

Or worse:

```rust
let slice = &mut ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'];
let mut cut_points = BinaryHeap::from(vec![100, 100]);
{
    let mut max = cut_points.peek_mut().unwrap();
    *max = 0;
    std::mem::forget(max);
}
println!("{:?}", chop(slice, cut_points));
```

```console
[['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], [], ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '\u{1}', '\0', '?', '翾', '?', '翾', '\0', '\0', '?', '翾', '?', '翾', '?', '啿', '?', '啿', '?', '啿', '?', '啿', '?', '啿', '?', '翾', '\0', '\0', '񤬐', '啿', '\u{5}', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\u{8}', '\0', '`@',` '\0', '\u{1}', '\0', '?', '翾', '?', '翾', '?', '翾', '
thread 'main' panicked at 'index out of bounds: the len is 33 but the index is 33', library/core/src/unicode/unicode_data.rs:319:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

---

This PR makes `peek_mut()` use leak amplification (https://doc.rust-lang.org/1.66.0/nomicon/leaking.html#drain) to preserve the heap's invariant even in the situation that `PeekMut` gets leaked.

I'll also follow up in the tracking issue of unstable `drain_sorted()` (rust-lang#59278) and `retain()` (rust-lang#71503).
@dtolnay
Copy link
Member

dtolnay commented Jan 15, 2023

I agree with @ijackson's rebuttal; retain and retain_mut are fundamentally different operations for BinaryHeap. #78681 is already taking advantage of the distinction. I don't think I would want to add the behavior of retain_mut under the name retain.

@dtolnay
Copy link
Member

dtolnay commented Jan 15, 2023

@rfcbot concern predicate panic can leave heap in invalid state

#105851 introduced an invariant that BinaryHeap methods are required to abide by. The current implementation of retain does not meet it.

#![feature(binary_heap_retain)]

use std::collections::BinaryHeap;
use std::panic::{self, AssertUnwindSafe};

fn main() {
    let mut heap = BinaryHeap::from_iter([3, 1, 2]);
    println!("heap={:?}", heap);
    panic::catch_unwind(AssertUnwindSafe(|| {
        heap.retain(|e| {
            if *e == 1 {
                panic!();
            }
            false
        });
    }));
    println!("heap={:?}", heap);
    println!("sorted={:?}", heap.into_sorted_vec());
}

This prints sorted=[2, 1] which is obviously not a sorted vector.

@dtolnay
Copy link
Member

dtolnay commented Jan 15, 2023

I'm marking yaahc's box because she has stepped down from T-libs-api after the point that this feature got proposed for FCP.

This can enter FCP as soon as the predicate panic concern is addressed.

Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Feb 23, 2023
Rebuild BinaryHeap on unwind from retain

This closes the hole identified in rust-lang#71503 (comment) which had made it possible for the caller to end up with a heap in invalid state. As of rust-lang#105851, heaps in invalid state are not supposed to exist.
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Feb 24, 2023
Rebuild BinaryHeap on unwind from retain

This closes the hole identified in rust-lang#71503 (comment) which had made it possible for the caller to end up with a heap in invalid state. As of rust-lang#105851, heaps in invalid state are not supposed to exist.
github-actions bot pushed a commit to eggyal/copse that referenced this issue Feb 25, 2023
Rebuild BinaryHeap on unwind from retain

This closes the hole identified in rust-lang/rust#71503 (comment) which had made it possible for the caller to end up with a heap in invalid state. As of #105851, heaps in invalid state are not supposed to exist.
@Aaron1011
Copy link
Member

It looks like #106918 fixed the predicate panic issue - can this now enter FCP?

@m-ou-se m-ou-se added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Mar 13, 2023
@dtolnay
Copy link
Member

dtolnay commented Mar 14, 2023

@rfcbot resolve predicate panic can leave heap in invalid state
I fixed this in #106918.

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Mar 14, 2023
@rfcbot
Copy link

rfcbot commented Mar 14, 2023

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Mar 24, 2023
@rfcbot
Copy link

rfcbot commented Mar 24, 2023

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

Amanieu added a commit to Amanieu/rust that referenced this issue Mar 28, 2023
FCP finished in tracking issue: rust-lang#71503
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Mar 30, 2023
bors added a commit to rust-lang-ci/rust that referenced this issue Apr 2, 2023
…nton

Stabilize `binary_heap_retain`

FCP finished in tracking issue: rust-lang#71503
@m-ou-se m-ou-se closed this as completed Apr 4, 2023
thomcc pushed a commit to tcdi/postgrestd that referenced this issue May 31, 2023
Rebuild BinaryHeap on unwind from retain

This closes the hole identified in rust-lang/rust#71503 (comment) which had made it possible for the caller to end up with a heap in invalid state. As of #105851, heaps in invalid state are not supposed to exist.
@dtolnay dtolnay removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Sep 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-collections Area: `std::collection` B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Small Libs issues that are considered "small" or self-contained Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests