I know that vectors in Rust are allocated on the heap where the pointer, capacity, and length of the vector are stored on the stack.
Let's say I have the following vector:
let vec = vec![1, 2, 3];
If I make an iterator from this vector:
let vec_iter = vec.iter();
How does Rust model this iterator in terms of allocation on the heap vs. stack? Is it the same as the vector?
Most iterators are stack allocated.
In cases like Vec::iter()
, they create iterators that have two pointers, one to the end, one to the first element, like so
use std::marker::PhantomData;
pub struct Iter<'a, T: 'a> {
ptr: *const T,
end: *const T,
_marker: PhantomData<&'a T>,
}
Since pointer doesn't convey ownership or lifetime, PhantomData<&'a T>
tells the compiler this struct holds reference of lifetime 'a
to type T
Iter::next
looks somewhat like this
impl<'a, T> Iterator for Iter<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
unsafe {// pointer dereferencing is only allowed in unsafe
if self.ptr == self.end {
None
} else {
let old = self.ptr;
self.ptr = self.ptr.offset(1);
Some(&*old)
}
}
}
}
And a new Iter
is created like so
impl<'a, T: 'a> Iter<'a, T> {
pub fn new(slice: &'a [T]) -> Self {
assert_ne!(std::mem::size_of::<T>(), 0); // doesn't handle zero size type
let start = slice.as_ptr();
Iter {
ptr: start,
end: unsafe { start.add(slice.len()) },
_marker: PhantomData,
}
}
}
Now we can use it like any other iterators
let v = vec!['a', 'b', 'c', 'd', 'e'];
for c in Iter::new(&v) {
println!("{c}");
}
And thanks to PhantomData
, the compiler can guard us against use after free and other memory issues.
let iter = {
let v = vec!['a', 'b', 'c', 'd', 'e'];
Iter::new(&v) // error! borrowed value doesn't live long enough
};
for c in iter {
println!("{c}");
}