
Why does Rust's into_iter move the whole collection instead of emptying it?


Example scenario

struct Foo {
    bar: Vec<i32>,
}

let foo = Foo { bar: vec![1, 2, 3] };
let bar: HashMap<i32, i32> = foo.bar.into_iter().map(|x|(x,x)).collect();

Expected behaviour: all values from foo.bar are moved into the new hash map, foo.bar is left as an empty Vec that can still be used, and foo as a whole struct can still be used.

Real behaviour: foo.bar is moved as a whole collection, so the struct foo is now partially moved and can't be used as a whole.

Question: why does into_iter consume the whole collection instead of consuming only the elements and leaving an empty collection in the struct, and how can I work around this?


Solution

  • Why does calling into_iter on Vec<T> consume the vector?

    This is because the IntoIterator trait is defined to consume the value it turns into an iterator: into_iter takes self by value. By convention, "into" in Rust means converting one thing into another and consuming the original; if you only want to view or borrow something, you will usually see "as" instead, for example Vec::as_slice.

    pub trait IntoIterator {
        type Item;
        type IntoIter: Iterator<Item = Self::Item>;
    
        // Required method
        fn into_iter(self) -> Self::IntoIter;
    }
    

    Vec::into_iter is Vec's implementation of this trait, so it necessarily consumes the vector.
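To make "consumes" concrete, here is a minimal sketch. The function names pairs_by_value and pairs_by_ref are made up for illustration. It relies on the fact that the standard library also implements IntoIterator for &Vec<T>; in that case the consumed self is only the reference, so the vector itself is borrowed rather than moved.

```rust
use std::collections::HashMap;

// Takes the vector by value: `into_iter` moves it and yields owned `i32`s.
fn pairs_by_value(v: Vec<i32>) -> HashMap<i32, i32> {
    v.into_iter().map(|x| (x, x)).collect()
}

// Takes a reference: `IntoIterator` for `&Vec<i32>` yields `&i32`,
// so the caller keeps ownership of the vector.
fn pairs_by_ref(v: &Vec<i32>) -> HashMap<i32, i32> {
    v.into_iter().map(|&x| (x, x)).collect()
}

fn main() {
    let v = vec![1, 2, 3];
    let borrowed = pairs_by_ref(&v);
    // `v` is still usable here because only a reference was consumed.
    assert_eq!(v.len(), 3);
    let owned = pairs_by_value(v);
    // `v` has been moved; using it past this point would not compile.
    assert_eq!(borrowed, owned);
}
```

Borrowing like this only helps when copies or references to the elements are enough; when the elements themselves must be moved out, the approaches below apply.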

  • How do you "take" a value and leave an empty one in its place to avoid a partial move?

    You can use std::mem::take, which is defined as:

    pub fn take<T>(dest: &mut T) -> T
    where
        T: Default,
    

    It takes the value out of a &mut T and uses the Default trait to construct a default object (usually an empty one in the case of collections) in its place. Since Vec implements Default, you can use it as follows:

    struct Foo {
        bar: Vec<i32>,
    }
    
    fn main() {
        let mut foo = Foo { bar: vec![1, 2, 3] };
        // Take foo.bar so it can be consumed later, and leave an empty vector in its place.
        // Note that this requires mutating foo.
        let bar = std::mem::take(&mut foo.bar);
        let bar: std::collections::HashMap<i32, i32> = bar.into_iter().map(|x| (x, x)).collect();
        println!("{bar:?}");
    }
    

    More generally, if the value's type does not implement Default, or you want to leave something other than the default value behind, you can use std::mem::replace or std::mem::swap.
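As a rough sketch of those two functions: Token, Session, and rotate_token below are hypothetical names invented for this example, with Token deliberately not implementing Default so that std::mem::take is unavailable for it.

```rust
use std::mem;

// Hypothetical type that deliberately does not implement `Default`.
#[derive(Debug, PartialEq)]
struct Token(String);

struct Session {
    token: Token,
}

// Moves the current token out and leaves the supplied value in its place.
fn rotate_token(session: &mut Session, fresh: Token) -> Token {
    mem::replace(&mut session.token, fresh)
}

fn main() {
    let mut session = Session { token: Token("old".to_string()) };
    let previous = rotate_token(&mut session, Token("new".to_string()));
    assert_eq!(previous, Token("old".to_string()));
    assert_eq!(session.token, Token("new".to_string()));

    // `mem::swap` exchanges two values in place through mutable references.
    let mut a = vec![1, 2, 3];
    let mut b = Vec::new();
    mem::swap(&mut a, &mut b);
    assert!(a.is_empty());
    assert_eq!(b, vec![1, 2, 3]);
}
```

In both cases the struct is never left partially moved, because a valid value is written back in the same operation that moves the old one out.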

    EDIT. As Ivan C pointed out in the comments, you could also use Vec::drain. It is more specific to Vec, but it is useful to know about, since several collections in the standard library provide a "drain" API. Draining means removing some items from a collection (those in the given range) and returning them as an iterator. So you could also write:

    struct Foo {
        bar: Vec<i32>,
    }
    
    fn main() {
        let mut foo = Foo { bar: vec![1, 2, 3] };
        // Drain foo.bar, which moves all items out and leaves it empty.
        // Note that foo must still be mutable.
        let bar: std::collections::HashMap<i32, i32> = foo.bar.drain(..).map(|x| (x, x)).collect();
        println!("{bar:?}");
        assert!(foo.bar.is_empty());
    }
    

    Passing the full range .. to the drain method takes all elements out of the vector and leaves it empty.
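Since drain accepts any range, it can also remove just part of a vector. The helper take_front below is a hypothetical name used only for this sketch; it clamps the count because Vec::drain panics when the end of the range is past the vector's length.

```rust
// Removes the first `n` items (or all of them if fewer exist) and returns them.
fn take_front(v: &mut Vec<i32>, n: usize) -> Vec<i32> {
    // Clamp so the range never ends past the vector's length.
    let n = n.min(v.len());
    v.drain(..n).collect()
}

fn main() {
    let mut v = vec![1, 2, 3, 4, 5];
    let front = take_front(&mut v, 2);
    assert_eq!(front, vec![1, 2]);
    // The remaining items stay in the original vector, in order.
    assert_eq!(v, vec![3, 4, 5]);
}
```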

    Note, however, that the iterator returned by drain mutably borrows the original vector for as long as it is alive. This means you cannot use the original vector until you have processed all of the items or dropped the iterator. In turn, if your vector is held behind a Mutex, the critical section during which you hold the lock must extend until all items are processed, which can be arbitrarily long depending on what you do with them. There are also some intricacies around leaking or dropping the returned iterator that you should be aware of when you use drain.
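The difference in lock duration can be sketched as follows. The function names sum_with_take and sum_with_drain are invented for illustration, and summing stands in for whatever processing the items actually need.

```rust
use std::mem;
use std::sync::Mutex;

// Holds the lock only long enough to swap the vector out.
fn sum_with_take(queue: &Mutex<Vec<i32>>) -> i32 {
    let items = mem::take(&mut *queue.lock().unwrap());
    // The temporary guard was dropped at the end of the previous statement,
    // so the lock is already released while the items are processed.
    items.into_iter().sum()
}

// Holds the lock for the whole iteration, because the draining iterator
// borrows the vector behind the guard.
fn sum_with_drain(queue: &Mutex<Vec<i32>>) -> i32 {
    let mut guard = queue.lock().unwrap();
    let total: i32 = guard.drain(..).sum();
    total
}

fn main() {
    let queue = Mutex::new(vec![1, 2, 3]);
    assert_eq!(sum_with_take(&queue), 6);
    assert!(queue.lock().unwrap().is_empty());

    queue.lock().unwrap().extend(vec![4, 5]);
    assert_eq!(sum_with_drain(&queue), 9);
    assert!(queue.lock().unwrap().is_empty());
}
```

Both functions leave the shared vector empty; they differ only in how long other threads would be blocked while the items are being processed.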

    So which method should you choose? It depends on what you want to express. If you are implementing some kind of stealing queue, then swapping the memory out shows your intent more clearly (and can greatly reduce lock contention). If, however, you are not taking the vector from "somebody else" and just want to iterate over its values, then using drain should be perfectly fine (and even more idiomatic).