c rust ffi

Borrow checking for C struct field


Imagine that a C API exposes an opaque pointer to some data and two accessors for one of its fields, void set_string(struct foo*, const char*) and const char* get_string(struct foo*), and that the documentation states something along the lines of:

The string returned by get_string is valid as long as the opaque pointer to foo is valid and no subsequent set_string call is made. Otherwise behavior is undefined.

This is a simple example of a C header that exemplifies the issue:

//foo.h
struct foo;
const char* get_string(struct foo*);
void set_string(struct foo*, const char*);

I'm wondering whether, and how, it is possible to make the Rust borrow checker keep an eye on that external reference and avoid potential UB after generating bindings with bindgen. I have been experimenting with

use foo::foo;
use std::ffi::CStr;
use std::marker::PhantomData;

struct Foo<'a> {
    pointer: *const foo,
    _phantom: PhantomData<&'a CStr>,
}

but it seems to be a dead end.


Solution

  • So the documentation essentially tells you that the value returned by get_string is borrowed from struct foo, and that set_string mutates it and therefore needs a mutable borrow. You can express that without any lifetime parameter on struct Foo on the Rust side: just wrap the calls to set_string and get_string in safe methods that abstract over them:

    // creating a module to contain the code that needs to be checked for soundness
    mod foo_mod {
        use std::ffi::{c_char, CString, CStr};
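        // opaque stand-in for the C `struct foo`; with bindgen you would use the generated type instead
        #[allow(non_camel_case_types)]
        #[repr(C)]
        struct foo {
            _private: [u8; 0],
        }
        pub struct Foo {
            // mustn't be `pub`
            pointer: *mut foo,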
        }
        impl Foo {
            pub fn set_string(&mut self, s: CString) {
                extern "C" {
                    fn set_string(this: *mut foo, s: *const c_char);
                }
                // SAFETY:
                // this is safe assuming:
                // - `self.pointer` is always a valid pointer to a `struct foo`
                // - `set_string` in C does not deallocate `s` (unless by leveraging the appropriate Rust code to do so)
                unsafe { set_string(self.pointer, s.into_raw()) }
            }
            pub fn get_string(&self) -> &CStr {
                extern "C" {
                    fn get_string(this: *mut foo) -> *const c_char;
                }
                // SAFETY:
                // this is safe assuming:
                // - `self.pointer` is always a valid pointer to a `struct foo`
                // - `char* get_string()` always returns a valid C string given a valid `struct foo` pointer
                unsafe {
                    let ptr = get_string(self.pointer);
                    CStr::from_ptr(ptr)
                }
            }
            pub fn new() -> Self {
                // this or any constructors must make sure to only create `Foo`s that contain a valid `pointer`
                todo!()
            }
        }
    }
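    To see the borrow checker enforce the documented rule, here is a minimal sketch that runs without linking any C code. The "C side" is a hypothetical stand-in written in Rust (the `foo` struct with a `CString` field and the two `extern "C"` functions are assumptions for illustration, not the real library). Because this mock copies the string it receives, the wrapper's `set_string` takes a `&CStr` here instead of a `CString`; `demo` is likewise just an illustrative helper.

```rust
use std::ffi::{c_char, CStr, CString};

// Hypothetical stand-in for the C library, written in Rust so the sketch runs
// on its own. It copies the string handed to `set_string`.
#[allow(non_camel_case_types)]
pub struct foo {
    s: CString,
}

unsafe extern "C" fn set_string(this: *mut foo, s: *const c_char) {
    // SAFETY (mock): callers pass a valid `foo` and a valid NUL-terminated string.
    let (this, s) = unsafe { (&mut *this, CStr::from_ptr(s)) };
    this.s = s.to_owned();
}

unsafe extern "C" fn get_string(this: *mut foo) -> *const c_char {
    // SAFETY (mock): callers pass a valid `foo`.
    let this = unsafe { &*this };
    this.s.as_ptr()
}

pub struct Foo {
    // private, so only constructors in this module can set it
    pointer: *mut foo,
}

impl Foo {
    pub fn new(initial: &CStr) -> Self {
        let boxed = Box::new(foo { s: initial.to_owned() });
        Foo { pointer: Box::into_raw(boxed) }
    }

    // `&mut self`: calling this invalidates every string handed out earlier.
    pub fn set_string(&mut self, s: &CStr) {
        // SAFETY: `self.pointer` is valid, and the mock copies `s` during the call.
        unsafe { set_string(self.pointer, s.as_ptr()) }
    }

    // The returned `&CStr` borrows from `self`, so it cannot outlive the next
    // `set_string` call or the `Foo` itself.
    pub fn get_string(&self) -> &CStr {
        // SAFETY: `self.pointer` is valid and the mock returns a valid C string.
        unsafe { CStr::from_ptr(get_string(self.pointer)) }
    }
}

impl Drop for Foo {
    fn drop(&mut self) {
        // SAFETY: `self.pointer` came from `Box::into_raw` in `new` and is freed once.
        unsafe { drop(Box::from_raw(self.pointer)) }
    }
}

pub fn demo() -> String {
    let mut foo = Foo::new(&CString::new("first").unwrap());
    let s = foo.get_string();
    // Calling `foo.set_string(..)` at this point is rejected with error[E0502]
    // (cannot borrow `foo` as mutable because it is also borrowed as immutable),
    // because `s` is still used on the next line.
    let seen = s.to_string_lossy().into_owned();
    foo.set_string(&CString::new("second").unwrap()); // fine: `s` is no longer used
    format!("{seen} -> {}", foo.get_string().to_string_lossy())
}
```

    The `&self` and `&mut self` receivers are all it takes: the lifetime elided on `get_string` ties the returned `&CStr` to the borrow of `Foo`, which is exactly the validity rule from the C documentation.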
    



    Note: Foo::set_string in Rust is safe under the stated assumptions, but it leaks the passed-in CString every time you use it. In a production environment that is unlikely to be what you want, but dealing with it properly depends on how C's set_string expects the string to be passed, as kmdreko notes.
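    As one possible way to avoid the leak, here is a sketch for the case where the C side stores the pointer it is given rather than copying the bytes. That behaviour is an assumption; check what the real library does. The wrapper remembers the pointer produced by `CString::into_raw` and reclaims the previous one with `CString::from_raw` once C no longer refers to it. The `foo` struct, the two `extern "C"` functions, the `current` field and `demo` are all hypothetical stand-ins written in Rust so the example runs on its own.

```rust
use std::ffi::{c_char, CStr, CString};

// Hypothetical stand-in for a C library that keeps the pointer it is given
// instead of copying the string.
#[allow(non_camel_case_types)]
pub struct foo {
    s: *const c_char,
}

unsafe extern "C" fn set_string(this: *mut foo, s: *const c_char) {
    // SAFETY (mock): callers pass a valid `foo`.
    let this = unsafe { &mut *this };
    this.s = s;
}

unsafe extern "C" fn get_string(this: *mut foo) -> *const c_char {
    // SAFETY (mock): callers pass a valid `foo`.
    let this = unsafe { &*this };
    this.s
}

pub struct Foo {
    pointer: *mut foo,
    // Pointer from `CString::into_raw` that the C side currently holds.
    current: *mut c_char,
}

impl Foo {
    pub fn new(initial: CString) -> Self {
        let current = initial.into_raw();
        let pointer = Box::into_raw(Box::new(foo { s: current }));
        Foo { pointer, current }
    }

    pub fn set_string(&mut self, s: CString) {
        let new = s.into_raw();
        // SAFETY: `self.pointer` is valid; `new` stays allocated until reclaimed below
        // by a later `set_string` call or by `Drop`.
        unsafe { set_string(self.pointer, new) };
        let old = std::mem::replace(&mut self.current, new);
        // SAFETY: `old` came from `CString::into_raw`, C now points at `new`, and
        // `&mut self` guarantees no `&CStr` from `get_string` is still alive.
        drop(unsafe { CString::from_raw(old) });
    }

    pub fn get_string(&self) -> &CStr {
        // SAFETY: the C side holds `self.current`, which is a valid C string.
        unsafe { CStr::from_ptr(get_string(self.pointer)) }
    }
}

impl Drop for Foo {
    fn drop(&mut self) {
        // SAFETY: both pointers were created in this module and are freed exactly once.
        unsafe {
            drop(Box::from_raw(self.pointer));
            drop(CString::from_raw(self.current));
        }
    }
}

pub fn demo() -> (String, String) {
    let mut foo = Foo::new(CString::new("first").unwrap());
    let before = foo.get_string().to_string_lossy().into_owned();
    foo.set_string(CString::new("second").unwrap()); // frees "first" instead of leaking it
    let after = foo.get_string().to_string_lossy().into_owned();
    (before, after)
}
```

    If the C function copies the string instead, passing `s.as_ptr()` from a borrowed `&CStr` is enough and nothing needs to be reclaimed; if it takes ownership and frees the string itself, the buffer has to come from the allocator that C expects, not from `CString`.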