Tags: linux, rust, system-calls, glibc, unsafe-pointers

'Invalid utf-8 sequence of 1 bytes from index 1' and 'munmap_chunk(): invalid pointer' errors in my Rust unit tests


There is only one file, lib.rs:

use libc::{self, O_RDONLY, strerror, open, close, __errno_location, read};
use std::ffi::{CString, CStr};
use std::{fmt, process};

// File descriptor
struct Fd {
    fd: i32,
}

impl Fd {
    fn new(fd: i32) -> Fd {
        Fd { fd }
    }

    fn get(&self) -> i32 {
        self.fd
    }
}

impl Drop for Fd {
    fn drop(&mut self) {
        unsafe {
            println!("closing fd - {}", self.fd);
            match close(self.fd) {
                -1 => Err( LacError::new().unwrap_or_else(|e| {
                        eprintln!("{}", e);
                        process::exit(1); 
                    }) ).unwrap_or_else(|e| {
                        eprintln!("{}", e);
                        process::exit(1);}),
                _ => (),
            }
        }
    }
}

// Linux API call Error
#[derive(Debug)]
struct LacError(String);

impl LacError {
    fn new() -> Result<LacError, Box<dyn std::error::Error>> {
        unsafe {
            Ok( LacError( CString::from_raw(strerror(*__errno_location()))
                .into_string()?) )
        }
    }
}

impl fmt::Display for LacError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for LacError {}

// lac (Linux API call) functions
fn lac_open(path: &str) -> Result<Fd, Box<dyn std::error::Error>> {
    unsafe {
        let path_holder = CString::new(path)?;
        let path_ptr = path_holder.as_ptr();

        match open(path_ptr, O_RDONLY) {
            -1 => Err(Box::new(LacError::new()?)),
            fd => Ok(Fd::new(fd)) 
        }
    }
}

fn lac_read(fd: &Fd,
            buf: &mut String,
            count: usize) -> Result<isize, Box<dyn std::error::Error>>  {
    let buf_holder = CString::new("")?;
    let buf_ptr = buf_holder.as_ptr();

    unsafe {
        match read(fd.get(), buf_ptr as *mut libc::c_void, count) {
            0 => Ok(0),
            -1 => Err(Box::new(LacError::new()?)),
            num_of_red_bytes => {
                buf.push_str(CStr::from_ptr(buf_ptr).to_str()?);
                Ok(num_of_red_bytes)
            },
        }
    }
}    
 
// TESTS
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn whether_correctly_open() {
        let path = "test_file.txt";
        assert_ne!(match lac_open(path) {Ok(fd) => fd.get(), Err(_) => -1},
        -1);
    }

    #[test]
    #[should_panic]
    fn whether_correctly_panic() {
        let path = "testfile.txt";// For first issue "testfile.txt", for second "test_file.txt"
        match lac_open(path) {
            Ok(_) => (),
            Err(e) => panic!("{}", e),
        }
    }

    #[test]
    fn whether_correctly_read() {
        let path = "test_file.txt"; 
        let mut buf = String::from("");

        let fd = lac_open(path)
            .unwrap_or_else(|e| {panic!("{}", e);});
        let count = lac_read(&fd, &mut buf, 1)
            .unwrap_or_else(|e| {panic!("{}", e);});
        println!("{} {}", buf, count);
        assert_eq!(buf, "s"); 
    }
}

First, when I run 'cargo test -- --show-output', the first test passes, but the second one does not simply fail (I am leaving the third test aside for now). Instead, the whole test process is aborted:

running 3 tests
test tests::whether_correctly_open ... ok
munmap_chunk(): invalid pointer
error: test failed, to rerun pass '--lib'

Caused by:
process didn't exit successfully: `/home/Michail/rPrgs/usl/target/debug/deps/usl-1be62c27ff5543fb --show-output` (signal: 6, SIGABRT: process abort signal)

The process receives SIGABRT, probably because of what the LacError::new() method does:

impl LacError {
    fn new() -> Result<LacError, Box<dyn std::error::Error>> {
        unsafe {
            Ok( LacError( CString::from_raw(strerror(*__errno_location()))
                .into_string()?) )
        }
    }
}

I don't know exactly where I made the mistake.

Second, when I replace "testfile.txt" with "test_file.txt" so that the second test genuinely fails, and run 'cargo test -- --show-output', I get the output below. I believe the varying results are because 'cargo test' runs tests in several threads by default.

running 3 tests
test tests::whether_correctly_open ... ok
test tests::whether_correctly_panic ... FAILED
test tests::whether_correctly_read ... ok

successes:

---- tests::whether_correctly_open stdout ----
closing fd - 3

---- tests::whether_correctly_read stdout ----
s 1
closing fd - 3


successes:
    tests::whether_correctly_open
    tests::whether_correctly_read

failures:

---- tests::whether_correctly_panic stdout ----
closing fd - 3
note: test did not panic as expected

failures:
    tests::whether_correctly_panic

test result: FAILED. 2 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

error: test failed, to rerun pass '--lib'

and, after running it a few more times:

test tests::whether_correctly_open ... ok
test tests::whether_correctly_panic ... FAILED
test tests::whether_correctly_read ... FAILED

successes:

---- tests::whether_correctly_open stdout ----
closing fd - 3


successes:
    tests::whether_correctly_open

failures:

---- tests::whether_correctly_panic stdout ----
closing fd - 3
note: test did not panic as expected
---- tests::whether_correctly_read stdout ----
thread 'tests::whether_correctly_read' panicked at 'invalid utf-8 sequence of 1 bytes from index 2', src/lib.rs:119:34
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
closing fd - 3


failures:
    tests::whether_correctly_panic
    tests::whether_correctly_read

test result: FAILED. 1 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

error: test failed, to rerun pass '--lib'

That is, repeated runs of the same command give different results. I think the UTF-8 issue is related to this code:

fn lac_read(fd: &Fd,
            buf: &mut String,
            count: usize) -> Result<isize, Box<dyn std::error::Error>>  {
    let buf_holder = CString::new("")?;
    let buf_ptr = buf_holder.as_ptr();

    unsafe {
        match read(fd.get(), buf_ptr as *mut libc::c_void, count) {
            0 => Ok(0),
            -1 => Err(Box::new(LacError::new()?)),
            num_of_red_bytes => {
                buf.push_str(CStr::from_ptr(buf_ptr).to_str()?);//<-----------------------
                Ok(num_of_red_bytes)
            },
        }
    }
}    

In addition, when I repeatedly run 'cargo test -- --show-output --test-threads=1', the third test always fails.

Update: before running the tests, I wrote plain Latin characters to "test_file.txt" with

echo sometext > test_file.txt

Final update: I solved the first issue simply by creating a new owned String from the borrowed strerror string:

impl LacError {
    fn new() -> Result<LacError, Box<dyn std::error::Error>> {
        unsafe {
            Ok( LacError( String::from( CStr::from_ptr(strerror(*__errno_location())).to_str()? ) ) )
        }
    }
}
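
The fix above relies on the difference between borrowing and owning a C string. Here is a minimal sketch of that difference, using only the standard library. FOREIGN_MSG, copy_borrowed and round_trip_owned are hypothetical names introduced for illustration, and the static byte string merely stands in for the buffer that strerror returns.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Stand-in for memory owned by the C library, such as strerror's result.
// (Assumption for illustration: the real message comes from libc.)
static FOREIGN_MSG: &[u8] = b"No such file or directory\0";

// Copies a NUL-terminated string that we do not own into an owned String.
// Safety: `ptr` must point to a valid NUL-terminated string.
unsafe fn copy_borrowed(ptr: *const c_char) -> String {
    // CStr only borrows; dropping it never frees the underlying memory.
    unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned()
}

// CString::from_raw is only valid for pointers produced by CString::into_raw.
fn round_trip_owned(text: &str) -> String {
    let raw = CString::new(text)
        .expect("text must not contain interior NUL bytes")
        .into_raw();
    // Reclaim ownership so that Rust frees what Rust allocated.
    let owned = unsafe { CString::from_raw(raw) };
    owned.into_string().expect("text was valid UTF-8 to begin with")
}
```

Calling copy_borrowed twice on the same pointer is fine because nothing is freed. Passing that pointer to CString::from_raw instead would hand memory that Rust never allocated to Rust's deallocator, which matches the munmap_chunk() abort shown earlier.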

The second issue is the more interesting one: a CString is not a growable string, and the read syscall only overwrites bytes that already exist in the buffer. I decided to use a for loop to push 'count' spaces into a temporary 'container', which is a growable owned String, and to build the CString from it:

fn lac_read(fd: &Fd,
            buf: &mut String,
            count: usize) -> Result<isize, Box<dyn std::error::Error>>  {
    let mut container = String::new();
    for _ in 0..count {
        container.push(' ');
    }
    let buf_holder = CString::new(container.as_str())?;
    let buf_ptr = buf_holder.into_raw();

    unsafe {
        match read(fd.get(), buf_ptr as *mut libc::c_void, count) {
            0 => Ok(0),
            -1 => Err(Box::new(LacError::new()?)),
            num_of_red_bytes => {
                buf.push_str(CString::from_raw(buf_ptr).to_str()?);
                Ok(num_of_red_bytes)
            },
        }
    }
}    
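
Why the padding matters can be checked without any syscall. The sketch below shows how many bytes a CString owns, and that a byte sequence without a trailing 0 is not a valid C string. The helper names c_buffer_len and is_valid_c_string are made up for this illustration.

```rust
use std::ffi::{CStr, CString};

// Number of bytes a CString built from `text` owns, including the trailing 0.
fn c_buffer_len(text: &str) -> usize {
    CString::new(text)
        .expect("text must not contain interior NUL bytes")
        .as_bytes_with_nul()
        .len()
}

// Checked conversion: succeeds only if the single 0 byte is the last byte.
fn is_valid_c_string(bytes: &[u8]) -> bool {
    CStr::from_bytes_with_nul(bytes).is_ok()
}
```

c_buffer_len("") is 1, so the original CString::new("") left room for the terminator only, while c_buffer_len(&" ".repeat(count)) is count + 1.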

Solution

  • The issue is that a CString owns its buffer and will try to deallocate it when it goes out of scope. Use CStr instead, which only borrows the string and never frees it:

    impl LacError {
        fn new() -> Result<LacError, Box<dyn std::error::Error>> {
            unsafe {
                Ok(LacError(
                    CStr::from_ptr(strerror(*__errno_location())).to_string_lossy().to_string()
                ))
            }
        }
    }
    

    Usually whoever creates or allocates a resource is responsible for destroying or deallocating it, unless the docs explicitly state otherwise. The string returned by strerror belongs to the C library, not to your Rust code, so it is an error to try to deallocate it from the application.

    The next question is why your third test fails.

    The read() call reads up to count number of bytes into the provided buffer. But it DOES NOT resize the buffer. Your application creates a buffer of size 1 (it's 1, because C strings are 0 terminated):

    let buf_holder = CString::new("")?;
    

    The problem is that C strings must be 0 terminated. So if you read something that is not 0, CStr::from_ptr() will try to read from uninitialized memory outside of the buffer, which is undefined behaviour. You can make it 100% reproducible by changing the count:

    let count = lac_read(&fd, &mut buf, 100).unwrap();
    

    Now your app will try to read 100 bytes into that 1 byte buffer, corrupting everything after it. With that I always get:

    malloc(): corrupted top size
    

    To fix the issue, make sure that your buffer is large enough to hold the data (and do not forget the trailing 0!).
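
One way to follow this advice is sketched below, under a few assumptions. The buffer is a Vec<u8> allocated to count bytes up front, and std::io::Read stands in for the raw read call so that the example runs without the libc crate. read_into_string is a hypothetical name, not part of the original code. Tracking the returned length explicitly also removes the need for a trailing 0.

```rust
use std::error::Error;
use std::io::Read;

// Hypothetical counterpart to lac_read: reads at most `count` bytes from `src`
// and appends them to `buf`. Returns the number of bytes actually read.
fn read_into_string<R: Read>(
    src: &mut R,
    buf: &mut String,
    count: usize,
) -> Result<usize, Box<dyn Error>> {
    // Allocate the whole buffer up front; a read call never grows it.
    let mut bytes = vec![0u8; count];

    // A single read may return fewer bytes than requested (0 means end of input).
    let n = src.read(&mut bytes)?;

    // Only the first `n` bytes were written. The length is tracked explicitly,
    // so nothing ever scans past the buffer looking for a terminator.
    // Note: this fails if the read stops in the middle of a multi-byte character.
    let text = std::str::from_utf8(&bytes[..n])?;
    buf.push_str(text);
    Ok(n)
}
```

With libc::read, the same Vec could be passed as bytes.as_mut_ptr() as *mut libc::c_void together with bytes.len(), and the returned byte count used in the same way.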