I'm handling byte input in a streaming fashion like so:
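use std::collections::VecDeque;
use std::io::Read;

pub struct CharQueue<R: Read> {
    reader: R,
    deque: VecDeque<u8>,
    buf: [u8; 1024],
}

// Later, inside a method, after reading `bytes_read` bytes into `self.buf`:
self.deque.reserve_exact(bytes_read);
for i in 0..bytes_read {
    self.deque.push_back(self.buf[i]);
}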
In profiling, it appears that VecDeque::cap() is a major contributor to runtime. This is surprising because it does almost nothing but return a variable (and branch, I guess):
fn cap(&self) -> usize {
    if mem::size_of::<T>() == 0 {
        // For zero sized types, we are always at maximum capacity
        MAXIMUM_ZST_CAPACITY
    } else {
        self.buf.capacity()
    }
}
So it must be getting called an enormous number of times (which it would be if it's called inside push_back(), which makes sense in hindsight).
I'm wondering if there's a way to copy this entire buffer into the queue in one go, so the capacity only has to be checked and incremented once. .reserve_exact() skips the intermediate allocations, but not the checks and increments. .append() is along these lines, but I'd have to consume the buffer to transform it into another VecDeque first, which I don't want to do because I want to reuse it. What I really want is something like push_back_slice() that just takes a slice, increments the queue length and does any required allocations once, and then copies each element of the slice directly into the available space without mutating or consuming it.
Is there a way to do this?
VecDeque implements the Extend trait, allowing you to add all items from an iterator to the collection without performing repeated allocations or checks. With your code, that should look like this:
self.deque.extend(self.buf[..bytes_read].iter().copied());
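For context, here is a minimal sketch of how that call might sit inside your struct. The `new`, `fill`, and `pop` methods are hypothetical names I made up for illustration, since the question doesn't show the surrounding method; only the `extend` line is the actual suggestion.

```rust
use std::collections::VecDeque;
use std::io::{self, Read};

pub struct CharQueue<R: Read> {
    reader: R,
    deque: VecDeque<u8>,
    buf: [u8; 1024],
}

impl<R: Read> CharQueue<R> {
    // Hypothetical constructor, not taken from the question.
    pub fn new(reader: R) -> Self {
        CharQueue {
            reader,
            deque: VecDeque::new(),
            buf: [0; 1024],
        }
    }

    // Hypothetical helper: read one chunk from the reader and append it
    // to the back of the queue. Returns the number of bytes read
    // (0 means the reader reported end of input).
    pub fn fill(&mut self) -> io::Result<usize> {
        let bytes_read = self.reader.read(&mut self.buf)?;
        // Only the first `bytes_read` bytes of `buf` are valid, so slice
        // before iterating. `buf` is borrowed, not consumed, and can be
        // reused for the next read.
        self.deque.extend(self.buf[..bytes_read].iter().copied());
        Ok(bytes_read)
    }

    // Hypothetical accessor: take the next byte from the front, if any.
    pub fn pop(&mut self) -> Option<u8> {
        self.deque.pop_front()
    }
}
```

Since `u8` is `Copy`, `self.deque.extend(&self.buf[..bytes_read])` should work as well, because VecDeque also implements Extend for iterators of references to `Copy` items.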