I'm buffering the last X lines of stdout, stderr, and stdin of a process. I want to keep the last X lines and be able to access a line by its id (line number). So if we store 100 lines and insert 200 of them, lines 101-200 remain accessible. (In reality we want to store ~2000 lines.)
The performance-critical case is insertion, so insertion itself should be fast. Retrieval happens occasionally but accounts for maybe 10% of the use case. (We won't look into the output most of the time.)
Old approach, fragmenting
I used a wrapping ArrayDeque and kept count of the lines, but this means we're effectively using a [Vec<u8>; 100] in the example above: an array of String, and each String is backed by a Vec<u8>.
New approach, with open questions
My* new idea is to store the data in one array of u8 and keep book of the start position and length of each entry in that array. The problem is that the bookkeeping itself would then also need to be some kind of ringbuffer, erasing old entries the moment the data array has to wrap. Maybe there are better ways to implement this? At least this takes full advantage of a ringbuffer and prevents memory fragmentation.
*thanks also to sebk from the rust community
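To make the bookkeeping concrete, here is a hypothetical std-only sketch of this idea (the names `SliceBuffer`, `insert`, and `get` are illustrative, not from any crate). It sidesteps the wraparound problem raised above by compacting instead of wrapping: when the buffer is full, whole old lines are drained from the front and the stored offsets are shifted down.

```rust
use std::collections::VecDeque;

// Illustrative sketch: one flat byte buffer plus an index of
// (start, length) entries, compacting instead of wrapping.
struct SliceBuffer {
    data: Vec<u8>,
    index: VecDeque<(usize, usize)>, // (start, length) per line
    max_bytes: usize,
    first_line: usize, // id of the oldest retained line
}

impl SliceBuffer {
    fn new(max_bytes: usize) -> Self {
        Self { data: Vec::new(), index: VecDeque::new(), max_bytes, first_line: 0 }
    }

    fn insert(&mut self, line: &[u8]) {
        // Evict whole old lines until the new one fits.
        while self.data.len() + line.len() > self.max_bytes {
            if let Some((_, len)) = self.index.pop_front() {
                self.data.drain(..len);
                for e in self.index.iter_mut() {
                    e.0 -= len; // shift remaining offsets down
                }
                self.first_line += 1;
            } else {
                break; // line larger than the whole buffer; stored anyway
            }
        }
        let start = self.data.len();
        self.data.extend_from_slice(line);
        self.index.push_back((start, line.len()));
    }

    fn get(&self, line_id: usize) -> Option<&[u8]> {
        let (start, len) = *self.index.get(line_id.checked_sub(self.first_line)?)?;
        self.data.get(start..start + len)
    }
}
```

The compaction (`drain` plus offset shifting) is O(n) per eviction, which a real wrapping implementation would avoid; the sketch is only meant to show the index-over-bytes idea.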
Current easy approach
use arraydeque::{ArrayDeque, Wrapping};

const MAX: usize = 5;

pub struct LineRingBuffer {
    counter: Option<usize>,
    data: ArrayDeque<[String; MAX], Wrapping>,
    min_line: usize,
}

impl LineRingBuffer {
    pub fn new() -> Self {
        Self {
            counter: None,
            data: ArrayDeque::new(),
            min_line: 0,
        }
    }

    pub fn get(&self, pos: usize) -> Option<&String> {
        if let Some(max) = self.counter {
            if pos >= self.min_line && pos <= max {
                return self.data.get(pos - self.min_line);
            }
        }
        None
    }

    pub fn insert(&mut self, line: String) {
        self.data.push_back(line);
        if let Some(ref mut v) = self.counter {
            *v += 1;
            if *v - self.min_line >= MAX {
                self.min_line += 1;
            }
        } else {
            self.counter = Some(0);
        }
    }
}
Draft of the new idea in question (index type adjusted to match the ArrayDeque API used above):

pub struct SliceRingbuffer {
    counter: Option<usize>,
    min_line: usize,
    data: Box<[u8; 250_000]>,
    index: ArrayDeque<[Entry; MAX], Wrapping>,
}

struct Entry {
    start: usize,
    length: usize,
}
For whatever reason the current approach is still pretty fast, even though I expected a lot of differently sized allocations (depending on the lines) and thus fragmentation.
Let's go back to basics.
A circular buffer typically guarantees no fragmentation because it is not typed by the content you store, but by size. You might define a 1MB circular buffer, for example. For fixed-length types, this gives you a fixed number of elements you can store.
You're evidently not doing this. By storing Vec<u8> as an element, even though the overarching array is fixed-length, the content is not. Each element stored in the array is a fat pointer to the heap-allocated Vec (its pointer, length, and capacity).
Naturally, when you insert, you will therefore have to allocate a new Vec on the heap (this is the fragmentation you're thinking of, but not really seeing, as the Rust allocator is pretty efficient at this kind of stuff).

Your second option is an actual circular buffer. You gain fixed size and zero allocations if you do it right; you lose out on the 100% guarantee of having an entire line at the start of your buffer.
Before we head into the wide lands of DIY, a quick pointer to VecDeque is in order. It is a much more optimized version of what you implemented, albeit with some (fully warranted) unsafe sections.
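As a concrete illustration, the question's line buffer can be expressed on top of std's VecDeque. This is a hedged sketch with illustrative names (`Lines`, `min_line`), not the asker's exact code; capacity is a runtime value instead of a const:

```rust
use std::collections::VecDeque;

// Sketch: the question's LineRingBuffer expressed with std's VecDeque.
struct Lines {
    data: VecDeque<String>,
    min_line: usize, // id of the oldest retained line
    max: usize,
}

impl Lines {
    fn new(max: usize) -> Self {
        Lines { data: VecDeque::with_capacity(max), min_line: 0, max }
    }

    fn insert(&mut self, line: String) {
        if self.data.len() == self.max {
            self.data.pop_front(); // drop the oldest line on wrap
            self.min_line += 1;
        }
        self.data.push_back(line);
    }

    fn get(&self, pos: usize) -> Option<&String> {
        // checked_sub returns None for already-evicted line ids
        self.data.get(pos.checked_sub(self.min_line)?)
    }
}
```

This keeps the per-line heap allocation (each String owns its bytes), so it shares the fragmentation profile of the question's ArrayDeque version; it only swaps the container.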
We're going to make a bunch of assumptions and set some requirements for this:

- elements are stored as raw u8 bytes
- the backing store is a Vec; in practice you would not reimplement this entire structure at all, the array is purely there for demonstration

The result of these choices is the following element structure:
| Element size | Data |
|--------------|----------|
| 4 bytes | N bytes |
We therefore lose 4 bytes ahead of every message in order to get a clear pointer/skip reference to the next element (allowing elements of up to u32::MAX bytes).
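The framing in the table above can be sketched with std alone (the implementation below uses the byteorder crate for the same job). The helper names `push_frame` and `read_frame` are illustrative, not part of any API:

```rust
use std::convert::TryInto;

// Sketch of the length-prefix framing from the table above, using
// std's u32::to_ne_bytes / from_ne_bytes instead of the byteorder crate.
fn push_frame(buf: &mut Vec<u8>, payload: &[u8]) {
    buf.extend_from_slice(&(payload.len() as u32).to_ne_bytes()); // 4-byte header
    buf.extend_from_slice(payload); // N payload bytes
}

// Read the frame starting at `at`; returns the payload slice and the
// offset of the next frame.
fn read_frame(buf: &[u8], at: usize) -> Option<(&[u8], usize)> {
    let header: [u8; 4] = buf.get(at..at + 4)?.try_into().ok()?;
    let len = u32::from_ne_bytes(header) as usize;
    let payload = buf.get(at + 4..at + 4 + len)?;
    Some((payload, at + 4 + len))
}
```

Walking frames with `read_frame` in a loop is exactly the skip-scan that the `get` method below performs.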
A naive implementation example is as follows (playground link):
use byteorder::{NativeEndian, ReadBytesExt, WriteBytesExt};

pub struct CircularBuffer {
    data: Vec<u8>,
    tail: usize,
    elements: usize,
}

impl CircularBuffer {
    pub fn new(max: usize) -> Self {
        CircularBuffer {
            data: Vec::with_capacity(max),
            elements: 0,
            tail: 0,
        }
    }

    /// Number of elements in the buffer
    pub fn elements(&self) -> usize {
        self.elements
    }

    /// Number of used bytes in the buffer, including metadata
    pub fn len(&self) -> usize {
        self.tail
    }

    /// Length of the first element in the ringbuffer
    pub fn next_element_len(&self) -> Option<usize> {
        self.data
            .get(0..4)
            .and_then(|mut v| v.read_u32::<NativeEndian>().ok().map(|r| r as usize))
    }

    /// Remove the first element in the ringbuffer (wrap)
    pub fn pop(&mut self) -> Option<Vec<u8>> {
        self.next_element_len().map(|chunk_size| {
            self.tail -= chunk_size + 4;
            self.elements -= 1;
            self.data
                .splice(..(chunk_size + 4), vec![])
                .skip(4)
                .collect()
        })
    }

    pub fn get(&self, idx: usize) -> Option<&[u8]> {
        if self.elements <= idx {
            return None;
        }
        let mut current_head = 0;
        let mut current_element = 0;
        while current_head + 4 <= self.len() {
            // Read the length prefix of the block at the current head
            let element_size = self
                .data
                .get(current_head..(current_head + 4))
                .and_then(|mut v| v.read_u32::<NativeEndian>().ok().map(|r| r as usize))?;
            if current_element == idx {
                return self
                    .data
                    .get((current_head + 4)..(current_head + element_size + 4));
            }
            current_element += 1;
            current_head += 4 + element_size;
        }
        None
    }

    pub fn insert(&mut self, mut element: Vec<u8>) {
        let e_len = element.len();
        let capacity = self.data.capacity();
        // Evict old elements until the new one fits; bail out if the
        // element is larger than the whole buffer
        while self.len() + e_len + 4 > capacity {
            if self.pop().is_none() {
                return;
            }
        }
        self.data.write_u32::<NativeEndian>(e_len as u32).unwrap();
        self.data.append(&mut element);
        self.tail += 4 + e_len;
        self.elements += 1;
    }
}
Do note again that this is a naive implementation aimed at showing you how you'd go about the problem of clipping strings in your buffer. The "real", optimal implementation would use unsafe to shift and remove elements.