A common pattern that happens to me in Rust goes something like this:
struct Foo { /*...*/ }
struct FooProcessor {
foos: Vec<&'??? mut Foo>, // this lifetime is the issue, see explanation below
// other data structures needed for processing Foos ....
}
impl FooProcessor {
// just to make it clear, `self` may outlive `input` and it's `Foo`s
pub fn process(&mut self, input: &mut [Foo]) {
// some example operation that requires random access to a subset
self.foos.extend(input.iter().filter(|f| f.some_property()));
self.foos.sort();
// work some more on the Foos...
// remove all references to the processed Foos
self.foos.clear();
}
}
The problem is the lifetime of FooProcessor::foos
, as highlighted by the questionmarks above. All references to Foo
s are cleared at the end of process()
, but of course the borrow checker doesn't know that.
FooProcessor
is usually a large, long lived object (specifically, it may outlive some Foo
s), and I don't want to reallocate the Vec<&Foo>
every time process()
is called (not even by using some SmallVec
thing).
Is there a good way to solve this that does not require the use of unsafe code (lifetime transmutation / pointers) each time I have this pattern? (If the solution involves using some crate, it's of course fine if that crate uses unsafe internally).
Transmuting Vec<T>
to Vec<U>
is in general unsound. However, we can use Vec::into_raw_parts
to decompose the vector into a data pointer, length, and capacity, transmute the data pointer, and then use Vec::from_raw_parts
to recreate the vector with the new type, as long as we can satisfy the invariants of Vec::from_raw_parts
.
Well, we could... but Vec::into_raw_parts
is unstable. However, it is implemented in terms of public Vec
methods, so it's trivial to copy and paste the implementation:
fn vec_into_raw_parts<T>(v: Vec<T>) -> (*mut T, usize, usize) {
let mut v = std::mem::ManuallyDrop::new(v);
(v.as_mut_ptr(), v.len(), v.capacity())
}
Ok, now let's walk down the list of Vec::from_raw_parts
' invariants:
ptr
must have been allocated using the global allocator, such as via thealloc::alloc
function.
We are getting the pointer from a Vec
that uses the global allocator.
T
needs to have the same alignment as whatptr
was allocated with. (T
having a less strict alignment is not sufficient, the alignment really needs to be equal to satisfy thedealloc
requirement that memory must be allocated and deallocated with the same layout.)
Vec
will ensure the allocation is aligned for T
, so we can just have our function assert that T
and U
have the same alignment. As the types are known at compile time, the compiler will elide this check if it would pass.
The size of
T
times thecapacity
(ie. the allocated size in bytes) needs to be the same size as the pointer was allocated with. (Because similar to alignment,dealloc
must be called with the same layoutsize
.)
We will also assert that the sizes of T
and U
are equal, which satisfies this invariant.
length
needs to be less than or equal to capacity.
capacity
needs to be the capacity that the pointer was allocated with.The allocated size in bytes must be no larger than isize::MAX. See the safety documentation of pointer::offset.
We will get the length, capacity, and pointer from an existing vector, which must satisfy these invariants unless we've already hit some UB somewhere.
The first
length
values must be properly initialized values of typeT
.
We will clear the vector before decomposing it, so length
will be zero, satisfying this invariant vacuously.
Now we can define a safe transmutation between Vec
element types in these terms:
fn safe_vec_transmute<T, U>(mut v: Vec<T>) -> Vec<U> {
assert_eq!(std::mem::size_of::<T>(), std::mem::size_of::<U>());
assert_eq!(std::mem::align_of::<T>(), std::mem::align_of::<U>());
v.clear();
let (ptr, len, cap) = vec_into_raw_parts(v);
// SAFETY:
//
// We assert T and U have the same size and alignment, and we clear the
// vector first. This satisfies several invariants of Vec:from_raw_parts.
// The remaining invariants are satisfied because we get ptr, len, and cap
// from an existing vector.
unsafe { Vec::from_raw_parts(ptr as *mut U, len, cap) }
}
To make use of this, you would store foos
as Vec<&'static mut T>
. In process()
you could std::mem::take(&mut self.foos)
to steal the allocation, leaving behind a zero-capacity vector, transmute the lifetime using safe_vec_transmute
, do your work with the vector, transmute the lifetime back to 'static
, and store it back in self.foos
.
It would look something like this (untested code):
let foos = safe_vec_transmute(std::mem::take(&mut self.foos));
foos.extend(...);
foos.sort();
// More work
self.foos = safe_vec_transmute(foos);
Note that safe_vec_transmute
compiles (with optimizations) to exactly the same instructions as a function that clears a vector and returns it, so there is zero runtime overhead!