Search code examples
rustlanguage-lawyerstrict-aliasing

Aliasing rules and making Rust function "generic over aliasing"


I have a routine which takes in an immutable slice of points and performs some transformation, writing to a mutable slice of points. (As a simple example, let's say we translate all the points by some constant.) Suppose also that the implementation cannot be reduced to a point-by-point processing. (In my case, I have vectorized the algorithm after confirming that the compiler can't do it automatically.)

While minimizing duplicate code, I want the routine to work with both 1. transforming one array to a different array, and 2. transforming a single, mutable array in-place.

Suppose the existing implementation has the form

fn transform(in_: &[Point], out: &mut [Point], translate: Point) {
   // ...
}

This solves case (1), but because we can't borrow the same array mutably and immutably at the same time, this will not work for case (2). Indeed, trying to use unsafe, and get simultaneous mutable and immutable references, is instant undefined behavior.

How can we get around this problem? Can we bypass the rules by doing something like

unsafe fn transform_impl(in_: &[UnsafeCell<Point>], out: &[UnsafeCell<Point>], translate: Point) {
  // ... promise we won't modify in_ here, nor have overlapping lifetimes
  // between values read from in_ and written to out ...
}

fn transform_in_place(inout: &mut [Point], translate: Point) {
  unsafe {
    let a: &[UnsafeCell<Point>] = std::mem::transmute(inout);
    transform_impl(a, a, translate);
  }
}

fn transform(in_: &[Point], out: &mut [FPoint], translate: Point) {
  unsafe {
    transform_impl(std::mem::transmute(in_), std::mem::transmute(out), translate);
  }
}

using interior mutability, and promise that we won't change values in out from "underneath" the values in in_? Or is casting an immutable slice to an UnsafeCell slice, like before, instant undefined behavior?

(In C, we can just use pointers, since non-restrict pointers of the same type may alias.)

I would prefer to have a single function to handle both cases, if only to minimize the amount of duplicate generated code. If this is ultimately impractical, what would be an elegant way to make this work without implementing the algorithm in source twice?


Solution

  • Transmuting &mut T to &UnsafeCell<T> (and therefore &mut [Point] to &[UnsafeCell<Point>] is definitely fine. This is documented. You do have to be careful to not hold a mutable and shared reference at the same time, though, which means restricting the access to either & from in_ or &mut from out, but not both at the same time for the same index. It may even be possible to do it safely, using Cell, with Cell::from_mut() and Cell::as_slice_of_cells() (depending on how you need to access the Points).

    The problem is the other cast, &T to &UnsafeCell<T>. If the reference is then written into, this is undoubtedly UB. The question is what happens when we only create a reference and not write into it.

    Stacked Borrows marks this UB. Tree Borrows (another possible aliasing model) accepts this. It is my understanding that people want this to be allowed, but it is unclear how to allow this with Stacked Borrows.

    Given the aliasing model is not decided yet, I'd recommend to avoid this pattern. It is possible to do what you want if you take raw points and not references; this does mean more unsafe code, though.