Cast bytes::BytesMut to bytes::Bytes using unsafe code without using freeze


// create buf to hold each udp packet to avoid creating allocations
let mut buf = BytesMut::with_capacity(2000);

loop {

    // Receive data from the socket:
    let (len, src_ep_sip) = udp_sip.recv_from(&mut buf).await.unwrap();

    // In real code I check for errors instead of using unwrap, but that is not the question.
    
    // I now need a Bytes object because a lot of the functions that I must call depend on it.

    // I cannot do this because it will not compile. 
    //  let request = buf.clone().freeze();

    // So instead I do this. Is this safe?
    let unsafe_bytes: Bytes = unsafe { Bytes::from_static(std::slice::from_raw_parts(buf.as_ptr(), len)) };

    // FROM NOW ON IT IS VERY IMPORTANT TO NOT MODIFY buf. It will not be modified until next iteration

    // call the functions that depend on this
    func_1(unsafe_bytes.clone());
    func_2(unsafe_bytes.clone());
    // etc

    if something {
        
        // in here I will have to allocate memory since I will be creating a new async task..
        let request_clone = Bytes::copy_from_slice(&buf);

        tokio::spawn(async move {
            // can safely use request_clone...
        });
    }
}

This code compiles and works. ChatGPT says it is risky because the slice passed to from_static does not actually reference static memory. I want to use it anyway because:

  1. I will not have to refactor my code and tests. A lot of my code depends on Bytes.
  2. I know I can clone buf and avoid the unsafe code, but it would be nice to avoid one allocation per packet.
  3. Using freeze to convert BytesMut to Bytes makes my code fail to compile. It compiles only if there is no loop.

But the downside is that I do not want my program to crash in the middle of the night. I know the proper solution is to use a &[u8] or something similar, since I never modify the data. But since all my methods already depend on Bytes, it would be nice if I could keep this unsafe code. Is it really dangerous even though I never write to the buffer and treat it as read-only? Is it dangerous to use from_static even though the data is not static?


Test to show that this works even though ChatGPT and sources on the internet claim it is unsafe:

#[cfg(test)]
mod tests {
    use bytes::{Bytes, BytesMut};

    #[test]
    fn test_bytes_unsafe() {
        // Create buffer outside the loop
        let mut buf = BytesMut::with_capacity(2000);

        for i in 0..1000 {
            println!("Iteration {}", i);

            // Write some data into the buffer
            buf.clear();
            buf.extend_from_slice(format!("Test data iteration {}", i).as_bytes());

            // unsafe cast. 
            let unsafe_bytes: Bytes =  unsafe { 
                Bytes::from_static(std::slice::from_raw_parts(buf.as_ptr(), buf.len())) 
            };

            // Try to use the bytes
            println!("Length: {}", unsafe_bytes.len());
            println!("Content: {:?}", std::str::from_utf8(&unsafe_bytes));

            // Add some delay to make potential issues more visible
            std::thread::sleep(std::time::Duration::from_millis(1));
            
            println!("Bytes len: {}", unsafe_bytes.len());
            println!("Bytes content: {:?}", std::str::from_utf8(&unsafe_bytes));
            println!("-------------------");
        }
    }
}

Solution

  • You can avoid copying the data by using freeze and try_into_mut to convert back and forth between Bytes and BytesMut, with an Option to hold the buffer between loop iterations:

    let mut outer_buf = Some(BytesMut::with_capacity(2000));
    
    loop {
       // Take the buffer out of the Option for this iteration
       let mut buf = outer_buf.take().unwrap();
    
       let (len, src_ep_sip) = udp_sip.recv_from(&mut buf).await.unwrap();
    
       let buf = buf.freeze();
    
       func_1 (buf.clone()); // no allocations
       func_2 (buf.clone()); // no allocations
    
       // Put the buffer back into `outer_buf` so that it will be available
       // for the next loop iteration
       outer_buf = Some (buf.try_into_mut().unwrap());
    }
    

    Note: I think that the Option makes the intent clearer, but since BytesMut implements Default in a way that does not allocate, you can also do it without the Option by using std::mem::take if you prefer:

    let mut outer_buf = BytesMut::with_capacity(2000);
    
    loop {
       // Take the buffer, leaving an empty (non-allocating) BytesMut behind
       let mut buf = std::mem::take(&mut outer_buf);
    
       let (len, src_ep_sip) = udp_sip.recv_from(&mut buf).await.unwrap();
    
       let buf = buf.freeze();
    
       func_1 (buf.clone()); // no allocations
       func_2 (buf.clone()); // no allocations
    
       // Put the buffer back into `outer_buf` so that it will be available
       // for the next loop iteration
       outer_buf = buf.try_into_mut().unwrap();
    }