c rust ffi unsafe const-correctness

How do I pass a non-mutable reference from Rust to a C-API that doesn't use const (even though it should)?


I have a wrapper around a C-API:

#[repr(transparent)]
pub struct Request(http_request_t);

This wrapper provides several methods for interacting with the request:

impl Request {
    pub fn bytes_received(&self) -> usize {
        unsafe {
            http_request_bytes_received(&self.0 as *const http_request_t)
        }
    }
}

Unfortunately, the C-API is not strict about const-correctness: its signature is size_t http_request_bytes_received(http_request_t *), which bindgen dutifully converts to http_request_bytes_received(*mut http_request_t) -> usize.

Now I could just cast my way out of this, but casting from &T to *mut T can easily lead to undefined behaviour (and it's a nasty cast). But it might be OK as http_request_bytes_received doesn't mutate http_request_t.

A possible alternative would be to use UnsafeCell so http_request_t is interior mutable:

#[repr(transparent)]
pub struct Request(UnsafeCell<http_request_t>);

impl Request {
    pub fn bytes_received(&self) -> usize {
        unsafe {
            http_request_bytes_received(self.0.get())
        }
    }
}

Is this approach sound, and would there be any serious downsides?

(I imagine it might limit some Rust optimizations and will make Request !Sync)


Solution

  • Short answer: just cast it to *mut T and pass it to C.

    Long answer:

    It's best to first understand why casting *const T to *mut T is prone to undefined behaviour.

    Rust's memory model guarantees that a &mut T does not alias anything else, so the compiler is free to, say, clobber T entirely and then restore its contents, and the programmer could not observe that. If a &mut T and a &T coexist and point to the same location, the behaviour is undefined: what would happen if you read through the &T while the compiler has clobbered the memory behind the &mut T? Similarly, if you have a &T, the compiler assumes that no one will modify the value (excluding interior mutability through UnsafeCell), and undefined behaviour arises if the memory it points to is modified.

    With that background, it's easy to see why casting *const T to *mut T is dangerous: you must not write through the resulting pointer. If you ever turn the *mut T into a &mut T, or store through it, it'll be UB. However, the cast itself is safe, and you can safely cast the *mut T back to *const T and dereference it.
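To make the round trip described above concrete, here is a minimal sketch. The function name read_after_round_trip is hypothetical and exists only for illustration; the point is that both casts are safe code and only the final read needs unsafe.

```rust
/// Hypothetical helper: casts a shared reference to `*mut`, back to `*const`,
/// and then reads through the `*const` pointer.
pub fn read_after_round_trip(value: &u32) -> u32 {
    // Both casts are safe operations; no memory is touched here.
    let as_mut = value as *const u32 as *mut u32;
    let as_const = as_mut as *const u32;

    // SAFETY: the pointer comes from a live shared reference and we only
    // read through it, so Rust's aliasing assumptions about `&u32` hold.
    unsafe { *as_const }
}
```

Writing through as_mut instead (for example `*as_mut = 0`) would be the undefined behaviour the paragraph above warns about.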

    Those are Rust's semantics; on the C side, the guarantees about T* are very weak. If you hold a T*, the compiler cannot assume that there are no other sharers. In fact, the compiler cannot even assume that it points to a valid address (it could be null or a past-the-end pointer). The C compiler cannot generate store instructions to the memory location unless the code explicitly writes through the pointer.

    The weaker meaning of T* on the C side means that it won't violate Rust's assumptions about the semantics of &T. You can safely cast &T to *mut T and pass it to C, provided that the C side never modifies the memory the pointer refers to.
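As a rough sketch of the cast recommended above, the wrapper from the question could look like the following. The layout of http_request_t (a single bytes_received field) is an assumption, and http_request_bytes_received is written here as a Rust stand-in with the same signature bindgen generates, so that the snippet compiles without the C library; in real code both come from the generated bindings.

```rust
/// Stand-in for the bindgen-generated struct; the field is an assumption.
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct http_request_t {
    bytes_received: usize,
}

/// Stand-in for the C function: it takes `*mut` but only reads.
/// In real code this is declared by bindgen in an `extern "C"` block.
pub unsafe extern "C" fn http_request_bytes_received(req: *mut http_request_t) -> usize {
    unsafe { (*req).bytes_received }
}

#[repr(transparent)]
pub struct Request(http_request_t);

impl Request {
    pub fn bytes_received(&self) -> usize {
        // &T -> *const T -> *mut T; the casts themselves are safe.
        let ptr = &self.0 as *const http_request_t as *mut http_request_t;
        // SAFETY: relies on the C side never writing through `ptr`.
        unsafe { http_request_bytes_received(ptr) }
    }
}
```

The safety comment carries the whole argument: if a later version of the C library starts mutating the request inside this function, the wrapper becomes unsound and the UnsafeCell variant from the question would be the way to go.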

    Note that you can tell the C compiler that the pointer won't alias anything else with T * restrict, but as the C code you mentioned is not strict about const-correctness, it probably does not use restrict either.