Why is passing heap data through Box not reliable in Rust?


I want to pass struct data to my own DLL. To make sure the data is not dropped, I use Box to move the struct to the heap and get a pointer with Box::into_raw. Unfortunately, I sometimes get an error when printing the data from my DLL. This is very strange because the error occurs randomly.

My DLL code is as follows:

#[repr(C)]
#[derive(Debug, Clone)]
pub enum MyValue {
    Bool(bool),
    String(*const i8),
}

#[repr(C)]
#[derive(Debug, Clone)]
pub struct StoreData {
    values: *const MyValue,
}

#[no_mangle]
extern "C" fn read_data(data: *const StoreData) {
    println!(" ");
    println!("==== in dll ====");
    println!("01 {:?}", data);
    unsafe {
        println!("02 {:?}", *(*data).values);
        // let MyValue::String(value) = *(*data).values.offset(0);
        let MyValue::String(value) = *(*data).values.offset(0) else {
            panic!("Dll Error")
        };
        println!("03 {:?}", value);
    };
    // do something...
}

My main code is as follows; I use libloading = "0.8.4" to load my DLL:

use libloading::{Library, Symbol};
use std::ffi::CString;

#[repr(C)]
#[derive(Debug, Clone)]
pub enum MyValue {
    Bool(bool),
    String(*const i8),
}

#[repr(C)]
#[derive(Debug, Clone)]
pub struct StoreData {
    values: *const MyValue,
}

#[repr(C)]
#[derive(Debug, Clone)]
pub struct EntryPoint {}

impl EntryPoint {
    pub fn init(values: Vec<String>) -> StoreData {
        let mut container = Vec::new();

        for var in values {
            let c_var = CString::new(var.as_str()).unwrap();
            container.push(MyValue::String(c_var.into_raw() as *const i8));
        }

        StoreData {
            values: container.as_ptr(),
        }
    }

    pub fn call_dll(values: *mut StoreData) {
        type DLLFUNC = extern "C" fn(data: *const StoreData);
        unsafe {
            let lib = Library::new("something_dll.dll").unwrap();
            let func: Symbol<DLLFUNC> = lib.get(b"read_data").unwrap();

            func(values);

            lib.close().unwrap()
        }
    }
}

fn main() {
    let values = vec![String::from("hello"), String::from("world")];
    let sd = EntryPoint::init(values);

    let sd_addr = Box::into_raw(Box::new(sd));

    println!("==== main func get data ====");
    println!("01 {:?}", sd_addr);
    unsafe {
        println!("02 {:?}", *(*sd_addr).values);
        // let MyValue::String(value) = *(*sd_addr).values.offset(0);
        let MyValue::String(value) = *(*sd_addr).values.offset(0) else {
            panic!("main ERROR")
        };
        println!("03 {:?}", value);
    };

    EntryPoint::call_dll(sd_addr);
    unsafe {
        let _ = Box::from_raw(sd_addr);
    }
}

The error output is as follows:

note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
fatal runtime error: Rust cannot catch foreign exceptions
error: process didn't exit successfully: `target\debug\entry_point.exe` (exit code: 0xc0000409, STATUS_STACK_BUFFER_OVERRUN)

I found that when this error occurs, the address stored in the value has indeed changed by the time the DLL reads it, like this:

==== main func get data ====
01 0x18f89dfe7f0
02 String(0x18f89dfe810)   <- value real address.
03 0x18f89dfe810

==== in dll ====
01 0x18f89dfe7f0
02 String(0x18f89dfc2a0)   <- value address has been changed.

I tried keeping only one variant in MyValue, like this:

#[repr(C)]
#[derive(Debug, Clone)]
pub enum MyValue {
    // Bool(bool),          <- comment out Bool
    String(*const i8),
}

With that change, the values printed from my DLL are always correct. But I have to keep MyValue::Bool because I need to pass some bool values to my DLL.

I don't understand why this problem occurs. As I understand it, Box is a very reliable tool for managing the lifetime and ownership of data in Rust. Does anyone know what's going on? And how should I improve my code?


Solution

  • The reason for this error is that you're hitting UB due to your use of raw pointers in EntryPoint::init and afterwards.

    Actually, when I try to reproduce the issue, I get a different output:

    ==== main func get data ====
    01 0x55686e047a30
    02 Bool(true)
    thread 'main' panicked at app/src/main.rs:59:13:
    main ERROR
    

    ...that is, I don't even get back the same enum variant I've just written. That heavily hints at some kind of undefined behavior, and there is one indeed.

    When you create the StoreData at the end of EntryPoint::init, you use container.as_ptr(), which takes &self - that is, it borrows the Vec rather than consuming it. So container is still owned by the function when it ends and is not moved out as part of the return value. It is therefore dropped and its buffer is freed, and sd.values becomes dangling; dereferencing it is now a use-after-free and can result in anything.
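The drop described above can be observed without ever dereferencing the dangling pointer. The following is a minimal sketch, not the asker's code: Tracked, DROPS, dangling_init and drops_during_init are hypothetical names introduced only for illustration. It counts destructor calls to show that the Vec's elements are already gone by the time the pointer is returned.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter: how many `Tracked` values have been dropped so far.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for `MyValue`; it only exists to make drops visible.
struct Tracked {
    _id: u8,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Same shape as `EntryPoint::init`: `as_ptr` only borrows the Vec.
fn dangling_init() -> *const Tracked {
    let container = vec![Tracked { _id: 1 }, Tracked { _id: 2 }];
    container.as_ptr()
    // `container` is dropped here, so the returned pointer is already dangling.
}

// Returns how many elements were dropped while `dangling_init` ran.
fn drops_during_init() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let _dangling = dangling_init(); // never dereferenced: that would be UB
    DROPS.load(Ordering::SeqCst) - before
}
```

Calling drops_during_init() reports that both elements were dropped inside dangling_init, which is the situation sd.values is in after EntryPoint::init returns.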

    You can try this approach instead:

    StoreData {
        values: Box::into_raw(container.into_boxed_slice()) as *const _,
    }
    

    (the trailing as *const _ is necessary since Box::into_raw returns *mut _). In this case, the Box is effectively leaked (though you can reconstruct it later), and the corresponding pointer can be used freely (as long as you ensure that the access is synchronized and reference aliasing rules are followed, of course).
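To make the "reconstruct it later" part concrete, here is a minimal sketch of the approach above together with a matching cleanup step. It rests on a few assumptions that are not in the original code: StoreData gains a hypothetical len field (the element count is needed to rebuild the Box<[MyValue]>), the helpers init, first_string and release are free functions invented for illustration, and c_char is used in place of i8 so the sketch also compiles on targets where C's char is unsigned.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

#[repr(C)]
#[derive(Debug, Clone)]
pub enum MyValue {
    Bool(bool),
    String(*const c_char),
}

#[repr(C)]
#[derive(Debug)]
pub struct StoreData {
    values: *const MyValue,
    // Assumption: not in the original struct. Needed to rebuild the boxed slice.
    len: usize,
}

// Builds a StoreData whose buffer stays alive until `release` is called.
pub fn init(values: Vec<String>) -> StoreData {
    let container: Vec<MyValue> = values
        .into_iter()
        // `unwrap` panics if a string contains an interior NUL byte.
        .map(|s| MyValue::String(CString::new(s).unwrap().into_raw() as *const c_char))
        .collect();
    let boxed: Box<[MyValue]> = container.into_boxed_slice();
    let len = boxed.len();
    // Ownership moves into the raw pointer, so nothing is freed on return.
    StoreData { values: Box::into_raw(boxed) as *const MyValue, len }
}

// Reads the first element back as an owned String, if it is a String variant.
pub fn first_string(data: &StoreData) -> Option<String> {
    if data.len == 0 {
        return None;
    }
    // Sound only because `data.values` came from `init` and is still live.
    unsafe {
        match &*data.values {
            MyValue::String(p) => Some(CStr::from_ptr(*p).to_string_lossy().into_owned()),
            MyValue::Bool(_) => None,
        }
    }
}

/// Safety: `data` must come from `init`, and no pointer into it may be used afterwards.
pub unsafe fn release(data: StoreData) {
    unsafe {
        let slice = ptr::slice_from_raw_parts_mut(data.values as *mut MyValue, data.len);
        let boxed: Box<[MyValue]> = Box::from_raw(slice);
        for value in boxed.iter() {
            if let MyValue::String(p) = value {
                // Reclaim each CString that `init` leaked with `into_raw`.
                drop(CString::from_raw(*p as *mut c_char));
            }
        }
        // `boxed` is dropped here, freeing the slice itself.
    }
}
```

With this shape, the caller would run init, pass a pointer to the resulting StoreData across the FFI boundary, and call release only after the DLL has finished reading. If the len field is added, the DLL's copy of StoreData has to declare it too so both sides agree on the layout.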