How to pass a pointer through a [u8; 1] member in a struct when using FFI between Rust and C


There is a struct in C:

struct Foo {
    unsigned int    body_len;
    unsigned char   body[1];
};

The question is how to define the Rust counterpart of struct Foo, and how to create a Foo instance that can be passed to an extern C function taking a Foo * argument, so that the body data can be accessed on the C side. The C function converts the body member to an (unsigned char *).


Solution

  • The theoretically best Rust analogue of this struct declaration is:

    #[repr(C)]
    struct Foo {
        body_len: std::ffi::c_uint,
        body: [std::ffi::c_uchar],
    }
    

    This is a custom dynamically sized type; the compiler knows that its size varies. However, there is a catch: pointers to such types are “fat pointers” or “wide pointers”, which store the length in the pointer, not in the struct. So you'll find it difficult to pass around &Foos or even *const Foos, because the compiler will demand that you construct the pointer with a redundant length (and that is difficult to do with the currently stable operations). A wide pointer also does not have the same representation as a C Foo *, so it cannot be handed to the C function directly.
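
    As a rough illustration of the wide-pointer point, the sketch below compares the size of a pointer to the dynamically sized struct with the size of a pointer to a fixed-size struct. The names FooDst, FooThin and pointer_sizes are made up for this comparison and do not appear in the question.

    ```rust
    use std::ffi::{c_uchar, c_uint};
    use std::mem::size_of;

    /// Dynamically sized variant: pointers to it carry the body length.
    #[repr(C)]
    struct FooDst {
        body_len: c_uint,
        body: [c_uchar],
    }

    /// Fixed-size variant: pointers to it are ordinary thin pointers.
    #[repr(C)]
    struct FooThin {
        body_len: c_uint,
        body: [c_uchar; 0],
    }

    /// Returns (size of a pointer to `FooDst`, size of a pointer to `FooThin`).
    fn pointer_sizes() -> (usize, usize) {
        (size_of::<*const FooDst>(), size_of::<*const FooThin>())
    }
    ```

    Calling pointer_sizes() should show the first pointer to be twice as wide as the second, since it stores an address plus a length. Only the thin one has the shape of a C pointer.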

    Therefore, in order to deal with the internally specified length (that is, a length stored in a field of the same struct), you'll need to give up and take essentially the same approach as the C code: provide a dummy field that marks the start of data extending past the end of the Foo. However, unlike standard C, Rust doesn't demand that arrays have a length of at least 1, so we can arrange things more sensibly.

    #[repr(C)]
    struct Foo {
        body_len: std::ffi::c_uint,
        body: [std::ffi::c_uchar; 0],
    }
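
    To see what the zero-length array buys, the following sketch measures the struct. The helper header_layout is a hypothetical name used only here. It assumes nothing beyond the declaration above.

    ```rust
    use std::ffi::{c_uchar, c_uint};
    use std::mem::size_of;
    use std::ptr;

    #[repr(C)]
    struct Foo {
        body_len: c_uint,
        body: [c_uchar; 0],
    }

    /// Returns (size of the whole struct, byte offset of `body`).
    fn header_layout() -> (usize, usize) {
        let foo = Foo { body_len: 0, body: [] };
        // Compare addresses obtained through raw pointers to find the offset.
        let base = ptr::addr_of!(foo) as usize;
        let body = ptr::addr_of!(foo.body) as usize;
        (size_of::<Foo>(), body - base)
    }
    ```

    The two numbers should be equal: the struct is exactly as large as its header, and body sits at the first byte after it. That offset matches where a C compiler places body in the original declaration, although C's sizeof(struct Foo) is larger because of the one-element array and trailing padding.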
    

    You can then use pointer operations to access the body. (You must not use references, because while the exact semantics are not yet decided, it might be UB to access “past the end” of a reference's referent.) Note that getting this right is quite tricky; you should ask for a code review of whatever code you actually write, and also test your code under Miri (as far as possible, avoiding the actual FFI parts), to have sufficient confidence that it is sound and correct.
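
    A minimal sketch of what “pointer operations” can look like in isolation is given here. The functions read_body and demo are illustrative names, and the stack array used as backing storage is an assumption made to keep the example short; it is not how a real message would be allocated.

    ```rust
    use std::ffi::{c_uchar, c_uint};
    use std::ptr;

    #[repr(C)]
    struct Foo {
        body_len: c_uint,
        body: [c_uchar; 0],
    }

    /// Copies the body out of a `Foo` without ever creating a `&Foo`.
    ///
    /// # Safety
    /// `foo` must point to an initialized header followed by at least
    /// `body_len` initialized bytes, all inside one allocation that the
    /// pointer is allowed to access.
    unsafe fn read_body(foo: *const Foo) -> Vec<c_uchar> {
        unsafe {
            let len = ptr::addr_of!((*foo).body_len).read() as usize;
            let start = ptr::addr_of!((*foo).body).cast::<c_uchar>();
            std::slice::from_raw_parts(start, len).to_vec()
        }
    }

    /// Builds a small `Foo` inside an aligned stack buffer and reads it back.
    fn demo() -> Vec<c_uchar> {
        // An array of `c_uint` is aligned for the header and large enough
        // for the header plus the three payload bytes.
        let mut storage: [c_uint; 4] = [0; 4];
        let base: *mut Foo = storage.as_mut_ptr().cast::<Foo>();
        let payload = b"abc";
        unsafe {
            ptr::addr_of_mut!((*base).body_len).write(payload.len() as c_uint);
            let body = ptr::addr_of_mut!((*base).body).cast::<c_uchar>();
            ptr::copy_nonoverlapping(payload.as_ptr(), body, payload.len());
            read_body(base)
        }
    }
    ```

    The pointer passed to read_body is derived from the whole buffer, not from a reference to a Foo, which is what keeps the reads past the header within bounds of what the pointer may access.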

    Here is a complete example of how this might be done, providing a safe BoxFoo type that can allocate Foos and hand out pointers to them. I'm not myself an expert on this particular kind of pointer trickery, so please don't take it as the definitely best way to do this.

    use std::{alloc, ptr};
    use std::ffi::{c_uint, c_uchar};
    
    #[repr(C)]
    struct Foo {
        body_len: c_uint,
        body: [c_uchar; 0],
    }
    
    /// An owner for a heap allocated [`Foo`] that understands its length.
    /// (We can't use `Box` because it assumes that `size_of::<Foo>()` is the entire size.)
    struct BoxFoo {
        foo: *mut Foo,
    }
    
    impl BoxFoo {
        /// Computes the layout for allocating a `Foo` of a given size.
        fn layout(body_len: c_uint) -> alloc::Layout {
            let byte_len = usize::try_from(body_len).expect("body_len overflowed usize");
            let (layout, _) = alloc::Layout::new::<Foo>()
                .extend(alloc::Layout::array::<c_uchar>(byte_len).unwrap())
                .unwrap();
            layout
        }
    
        pub fn new(bytes_to_copy: &[c_uchar]) -> Self {
            let body_len: c_uint = bytes_to_copy.len().try_into().expect("body too long");
            let layout = Self::layout(body_len);
    
            // SAFETY:
            // * The preconditions of `alloc::alloc()` are met because we know `Foo` has nonzero size
            //   even if its body is empty.
            // * The pointer write operations write into memory we just allocated to be big enough.
            let foo = unsafe {
                // Allocate enough memory for the data.
                let ptr: *mut Foo = alloc::alloc(layout).cast::<Foo>();
                if ptr.is_null() {
                    alloc::handle_alloc_error(layout);
                }
    
                // Initialize the fixed part of the struct
                ptr::write(ptr, Foo { body_len, body: [] });
                // Initialize the variable length part
                let body_array_ptr: *mut [c_uchar; 0] = ptr::addr_of_mut!((*ptr).body);
                std::ptr::copy_nonoverlapping(
                    bytes_to_copy.as_ptr(),
                    body_array_ptr.cast::<c_uchar>(),
                    bytes_to_copy.len(),
                );
                ptr
            };
    
            Self { foo }
        }
    
        pub fn as_ptr(&self) -> *const Foo {
            self.foo
        }
    
        pub fn body(&self) -> &[c_uchar] {
            // SAFETY: `BoxFoo` maintains the invariant that `body_len` is no longer than what is
            // actually allocated and initialized after the `Foo` structure.
            unsafe {
                std::slice::from_raw_parts(
                    ptr::addr_of!((*self.foo).body).cast::<c_uchar>(),
                    usize::try_from((*self.foo).body_len).expect("body_len overflowed usize"),
                )
            }
        }
    }
    
    impl std::ops::Deref for BoxFoo {
        type Target = Foo;
    
        fn deref(&self) -> &Self::Target {
            // SAFETY: It is an invariant of this type that the pointer points to a valid Foo.
            unsafe { &*self.foo }
        }
    }
    
    impl Drop for BoxFoo {
        fn drop(&mut self) {
            // SAFETY: It is an invariant of this type that the pointer points to a valid `Foo`.
            let layout = Self::layout(unsafe { (*self.foo).body_len });
            // SAFETY: It is an invariant of this type that the pointer is a current allocation of
            // this layout.
            unsafe {
                alloc::dealloc(self.foo.cast::<u8>(), layout);
            }
        }
    }
    
    #[test]
    fn alloc_use_dealloc() {
        let data = b"hello world";
        let foo = BoxFoo::new(data);
        assert_eq!(foo.body(), data);
        drop(foo);
    }
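
    The question also asks how the pointer reaches the C function. The sketch below shows the shape of such a call under loud assumptions: sum_body is a hypothetical stand-in, written in Rust, for the real C function (which would instead be declared in an extern "C" block and linked from the C library), and send_to_c is an illustrative helper that allocates, calls, and frees in one place rather than using BoxFoo.

    ```rust
    use std::alloc::{self, Layout};
    use std::ffi::{c_uchar, c_uint};
    use std::ptr;

    #[repr(C)]
    struct Foo {
        body_len: c_uint,
        body: [c_uchar; 0],
    }

    /// Hypothetical stand-in for the C side: treats `body` as an
    /// `unsigned char *` and sums `body_len` bytes.
    unsafe extern "C" fn sum_body(foo: *const Foo) -> c_uint {
        unsafe {
            let len = (*foo).body_len as usize;
            let body = ptr::addr_of!((*foo).body).cast::<c_uchar>();
            let mut total: c_uint = 0;
            for i in 0..len {
                total += c_uint::from(*body.add(i));
            }
            total
        }
    }

    /// Allocates header plus body, passes a thin pointer across the
    /// (simulated) FFI boundary, then frees the allocation.
    fn send_to_c(bytes: &[c_uchar]) -> c_uint {
        let body_len = c_uint::try_from(bytes.len()).expect("body too long");
        let (layout, body_offset) = Layout::new::<Foo>()
            .extend(Layout::array::<c_uchar>(bytes.len()).expect("layout overflow"))
            .expect("layout overflow");
        // SAFETY: the layout has nonzero size, every write stays inside the
        // fresh allocation, and the same layout is used to deallocate.
        unsafe {
            let foo = alloc::alloc(layout).cast::<Foo>();
            if foo.is_null() {
                alloc::handle_alloc_error(layout);
            }
            ptr::addr_of_mut!((*foo).body_len).write(body_len);
            ptr::copy_nonoverlapping(
                bytes.as_ptr(),
                foo.cast::<c_uchar>().add(body_offset),
                bytes.len(),
            );
            let result = sum_body(foo);
            alloc::dealloc(foo.cast::<u8>(), layout);
            result
        }
    }
    ```

    For example, send_to_c(b"abc") returns 294, the sum of the three byte values. With the BoxFoo type from the answer, the equivalent call would pass foo.as_ptr() to the real extern function while the BoxFoo is still alive.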