Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vec::push_in_capacity ? #84649

Closed
leonardo-m opened this issue Apr 28, 2021 · 11 comments · Fixed by #89123
Closed

Vec::push_in_capacity ? #84649

leonardo-m opened this issue Apr 28, 2021 · 11 comments · Fixed by #89123
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR.

Comments

@leonardo-m
Copy link

leonardo-m commented Apr 28, 2021

This enhancement suggestion looks a bit silly to me, but I think it's worth asking for an opinion in this bug tracker.

I think this is a reasonable usage pattern of Vec::spare_capacity_mut and Vec::set_len. I've used it with a larger SomeData to avoid both an useless initialization and the use of Vec::push that in those cases is slightly slower, the disadvantage is that this code contains unsafe{} and it's a bit fiddly:

#![feature(vec_spare_capacity, maybe_uninit_extra)]
fn cond(i: usize) -> bool { i % 2 == 0 }
struct SomeData(usize); // Something larger.
fn main() {
    const N: usize = 1_000_000;
    let mut data = Vec::with_capacity(N);
    let mut data_len = 0;
    let data_uninit = data.spare_capacity_mut();

    for i in 0 .. N {
        if cond(i) {
            data_uninit[data_len].write(SomeData(i));
            data_len += 1;
        }
    }
    unsafe {
        data.set_len(data_len);
    }
}

An API that removes the unsafety could be:

#![feature(vec_push_in_capacity)]
fn cond(i: usize) -> bool { i % 2 == 0 }
struct SomeData(usize); // Something larger.
fn main() {
    const N: usize = 1_000_000;
    let mut data = Vec::with_capacity(N);

    for i in 0 .. N {
        if cond(i) {
            data.push_in_capacity(SomeData(i));
        }
    }
}

Vec::push_in_capacity is similar to Vec::push, but it doesn't include any growing strategy (and doesn't include such growing/realloc code in the binary), so it just uses the capacity. If the capacity isn't enough it panics (just like the first code panics if the capacity is finished at data_uninit[data_len]).

@fee1-dead

This comment has been minimized.

@rustbot rustbot added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Apr 28, 2021
@fee1-dead
Copy link
Member

@rustbot label -C-enhancement C-feature-request

@rustbot rustbot added C-feature-request Category: A feature request, i.e: not implemented / a PR. and removed C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Apr 28, 2021
@dalcde
Copy link
Contributor

dalcde commented Apr 28, 2021

It might be better to return a Result<(), T> instead to let the user handle the failure case.

@leonardo-m
Copy link
Author

leonardo-m commented Apr 28, 2021

An extra note. Currently on this code:

fn cond(i: usize) -> bool { i % 2 == 0 }
pub struct SomeData(usize);
pub fn foo() -> Vec<SomeData> {
    const N: usize = 1_000_000;
    let mut data = Vec::with_capacity(N);

    for i in 0 .. N {
        if cond(i) {
            data.push(SomeData(i));
        }
    }
    data
}

I think LLVM12 isn't able to see that code will never realloc and doesn't need a bound test:

alloc::raw_vec::finish_grow:
        push    r15
        push    r14
        push    rbx
        mov     r15, rsi
        mov     r14, rdi
        test    rdx, rdx
        je      .LBB0_10
        mov     rbx, rdx
        mov     rdi, qword ptr [rcx]
        test    rdi, rdi
        je      .LBB0_2
        mov     rsi, qword ptr [rcx + 8]
        test    rsi, rsi
        je      .LBB0_2
        mov     rdx, rbx
        mov     rcx, r15
        call    qword ptr [rip + __rust_realloc@GOTPCREL]
        mov     rcx, r15
        test    rax, rax
        jne     .LBB0_9
        jmp     .LBB0_8
.LBB0_2:
        test    r15, r15
        je      .LBB0_3
        mov     rdi, r15
        mov     rsi, rbx
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
        mov     rcx, r15
        test    rax, rax
        je      .LBB0_8
.LBB0_9:
        mov     qword ptr [r14 + 8], rax
        xor     eax, eax
        mov     rbx, rcx
        jmp     .LBB0_11
.LBB0_10:
        mov     qword ptr [r14 + 8], r15
        mov     eax, 1
        xor     ebx, ebx
        jmp     .LBB0_11
.LBB0_3:
        xor     ecx, ecx
        mov     rax, rbx
        test    rax, rax
        jne     .LBB0_9
.LBB0_8:
        mov     qword ptr [r14 + 8], r15
        mov     eax, 1
.LBB0_11:
        mov     qword ptr [r14 + 16], rbx
        mov     qword ptr [r14], rax
        pop     rbx
        pop     r14
        pop     r15
        ret

alloc::raw_vec::RawVec<T,A>::reserve::do_reserve_and_handle:
        push    r14
        push    rbx
        sub     rsp, 56
        inc     rsi
        je      .LBB1_8
        mov     r14, rdi
        mov     rcx, qword ptr [rdi + 8]
        lea     rax, [rcx + rcx]
        cmp     rax, rsi
        cmova   rsi, rax
        cmp     rsi, 4
        mov     eax, 4
        cmova   rax, rsi
        mov     edx, 8
        xor     ebx, ebx
        mul     rdx
        setno   bl
        shl     rbx, 3
        test    rcx, rcx
        je      .LBB1_2
        mov     rdx, qword ptr [r14]
        shl     rcx, 3
        mov     qword ptr [rsp + 8], rdx
        mov     qword ptr [rsp + 16], rcx
        mov     qword ptr [rsp + 24], 8
        jmp     .LBB1_4
.LBB1_2:
        mov     qword ptr [rsp + 8], 0
.LBB1_4:
        lea     rdi, [rsp + 32]
        lea     rcx, [rsp + 8]
        mov     rsi, rax
        mov     rdx, rbx
        call    alloc::raw_vec::finish_grow
        mov     rdi, qword ptr [rsp + 40]
        mov     rsi, qword ptr [rsp + 48]
        cmp     qword ptr [rsp + 32], 1
        je      .LBB1_5
        mov     qword ptr [r14], rdi
        shr     rsi, 3
        mov     qword ptr [r14 + 8], rsi
        add     rsp, 56
        pop     rbx
        pop     r14
        ret
.LBB1_5:
        test    rsi, rsi
        jne     .LBB1_6
.LBB1_8:
        call    qword ptr [rip + alloc::raw_vec::capacity_overflow@GOTPCREL]
        ud2
.LBB1_6:
        call    qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
        ud2

.LCPI2_0:
        .long   1000000
        .long   0
        .long   0
        .long   0
example::foo:
        push    r15
        push    r14
        push    rbx
        mov     r14, rdi
        mov     edi, 8000000
        mov     esi, 8
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
        test    rax, rax
        je      .LBB2_8
        mov     qword ptr [r14], rax
        vmovaps xmm0, xmmword ptr [rip + .LCPI2_0]
        vmovups xmmword ptr [r14 + 8], xmm0
        xor     esi, esi
        xor     ebx, ebx
        jmp     .LBB2_2
.LBB2_5:
        mov     qword ptr [rax + 8*rsi], rbx
        inc     rsi
        mov     qword ptr [r14 + 16], rsi
.LBB2_6:
        mov     rbx, r15
        cmp     r15, 1000000
        je      .LBB2_7
.LBB2_2:
        lea     r15, [rbx + 1]
        test    bl, 1
        jne     .LBB2_6
        cmp     rsi, qword ptr [r14 + 8]
        jne     .LBB2_5
        mov     rdi, r14
        call    alloc::raw_vec::RawVec<T,A>::reserve::do_reserve_and_handle
        mov     rax, qword ptr [r14]
        mov     rsi, qword ptr [r14 + 16]
        jmp     .LBB2_5
.LBB2_7:
        mov     rax, r14
        pop     rbx
        pop     r14
        pop     r15
        ret
.LBB2_8:
        mov     edi, 8000000
        mov     esi, 8
        call    qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
        ud2

@dalcde
Copy link
Contributor

dalcde commented Apr 29, 2021

The naive implementation of push_in_capacity

pub fn push_in_capacity<T>(v: &mut Vec<T>, x: T) -> Result<(), T> {
    if v.len() == v.capacity() {
        Err(x)
    } else {
        v.push(x);
        Ok(())
    }
}

already optimizes out the reallocation case with -O2. Since (parts of) the reallocation code is inlined into push, this would give smaller code size (godbolt).

Here is a use case that I encounter fairly frequently: I want to construct a Vec, so I preallocate one and push elements into it. However, if there is an off-by-one error in the preallocation, then it would reallocate every time. Having a push_in_capacity function (followed by unwrap) would help catch these errors.

@kornelski
Copy link
Contributor

kornelski commented Dec 8, 2021

I suggest creating a fixed-capacity Vec type wrapper instead, so that it can also be used with Extend and many more methods without realloc code bloat.

let vec = Vec::with_capacity(999);
let does_not_grow = vec.as_vec_that_does_not_grow();

does_not_grow.push(1);
does_not_grow.extend(1..100);
does_not_grow.extend_from_slice(&[1,2,3,4]);

@leonardo-m
Copy link
Author

I suggest creating a fixed-capacity Vec type wrapper instead, so that it can also be used with Extend and many more methods without realloc code bloat.

A question: such wrapper doesn't cause bloat itself?

@kornelski
Copy link
Contributor

kornelski commented Dec 8, 2021

No, I don't think it'd be much different from deref that gives you the Vec as a slice.

@dalcde
Copy link
Contributor

dalcde commented Dec 8, 2021 via email

@kornelski
Copy link
Contributor

kornelski commented Dec 8, 2021

Panic.

I also hope that the wrapper type with a fixed-size storage would be easier for optimizer to reason about (similarly how taking a slice of a vec before a loop helps remove bounds checks in the loop). If the optimizer is able to remove capacity checks, then failure handling code doesn't matter, because it will be optimized out too.

struct VecWrapper<'vec, T> {
   data: &'vec mut [MaybeUninit<T>], // spare capacity of the vec
   len: &'vec mut usize, // points to vec.len field
}

fn push(&mut self, val: T) {
   let (item, rest) = self.capacity.split_first_mut().expect("ran out of capacity");
   item.write(val);
   self.data = rest;
   self.len += 1;
}

Interestingly, it can't be as simply implemented outside of std, because the len field can't be directly borrowed, but if you need to use &mut Vec reference and set_len, then you can't have the data slice at the same time, because it'd violate mutable aliasing rules.

If necessary, the capacity error behavior could be configurable via a type parameter, e.g. VecWrapper<'a, T, OnOverflowPanic> or VecWrapper<'a, T, OnOverflowDrop> (that silently ignores excess elements). The extra argument would allow compile-time specialized code (so no extra runtime check), and inflexibility of a type argument wouldn't get in the way, because it's a temporary object that you'd create on demand, so you can create a different one for each use.

@lincot
Copy link

lincot commented Sep 23, 2022

why not add unchecked versions of functions with capacity check
(push_unchecked, extend_unchecked, extend_from_slice_unchecked,
extend_from_within_unchecked, push_str_unchecked, ...)?

pub trait PushUnchecked<T> {
    unsafe fn push_unchecked(&mut self, value: T);
}
impl<T> PushUnchecked<T> for Vec<T> {
    #[inline]
    unsafe fn push_unchecked(&mut self, value: T) {
        debug_assert!(self.len() < self.capacity());
        if self.len() == self.capacity() {
            core::hint::unreachable_unchecked();
        }
        self.push(value);
    }
}

fn cond(i: usize) -> bool {
    i % 2 == 0
}
struct SomeData(usize);
fn main() {
    const N: usize = 1_000_000;
    let mut data = Vec::with_capacity(N);

    for i in 0..N {
        if cond(i) {
            unsafe { data.push_unchecked(SomeData(i)) };
        }
    }
}

@bors bors closed this as completed in 1a7c203 Oct 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants