
How to get pointer offset in bytes?


While raw pointers in Rust have the offset method, it only advances in units of the pointee type's size. How can I offset a pointer by a number of bytes instead?

Something like this in C:

var_offset = (typeof(var))((char *)(var) + offset);

Solution

  • TL;DR: This answer invokes Undefined Behavior, according to RFC-2582.

    In particular, references must be aligned and dereferenceable, even when they are created and never used.

    There are also discussions that field accesses themselves impose extra requirements not solved by the proposed &raw, due to usage of getelementptr inbounds, see offsetof woes at the bottom of the RFC.


    From the answer I linked to your previous question:

    macro_rules! offset_of {
        ($ty:ty, $field:ident) => {
            //  Undefined Behavior: dereferences a null pointer.
            //  Undefined Behavior: accesses field outside of valid memory area.
            unsafe { &(*(0 as *const $ty)).$field as *const _ as usize }
        }
    }
    
    fn main() {
        let p: *const Baz = 0x1248 as *const _;
        let p2: *const Foo = ((p as usize) - offset_of!(Foo, memberB)) as *const _;
        println!("{:p}", p2);
    }
    

    We can see in the computation of p2 that a pointer can be painlessly converted to an integer (usize here), arithmetic is performed on that integer, and the result is then cast back to a pointer.

    isize and usize are the pointer-sized integer types, so arithmetic on them counts in bytes :)
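
    As a minimal sketch of the byte-wise offset asked about in the question, one option is to go through a `*const u8` instead of an integer, which is the closest analogue of the C cast to `char *`. The helper name `byte_offset` and the sample array are assumptions made for illustration; they do not come from the original answer.

    ```rust
    /// Hypothetical helper: moves `ptr` by `bytes` bytes, whatever the size of `T`.
    fn byte_offset<T>(ptr: *const T, bytes: isize) -> *const T {
        // On a `*const u8`, `wrapping_offset` counts in single bytes.
        // Computing the pointer is safe; dereferencing it is still unsafe.
        (ptr as *const u8).wrapping_offset(bytes) as *const T
    }

    fn main() {
        let values: [u32; 4] = [10, 20, 30, 40];
        let first: *const u32 = values.as_ptr();

        // Each u32 occupies 4 bytes, so 8 bytes past `first` is `values[2]`.
        let third = byte_offset(first, 8);
        assert_eq!(unsafe { *third }, 30);
        println!("{:p} -> {:p}", first, third);
    }
    ```

    The caller remains responsible for keeping the result inside the original allocation and correctly aligned before dereferencing it.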


    Were RFC-2582 to be accepted, this implementation of offset_of! is my best shot:

    macro_rules! offset_of {
        ($ty:ty, $field:ident) => {
            unsafe {
                //  Create correctly sized storage.
                //
                //  Note: `let zeroed: $ty = ::std::mem::zeroed();` is incorrect,
                //        a zero pattern is not always a valid value.
                let buffer = ::std::mem::MaybeUninit::<$ty>::uninit();
    
                //  Create a Raw reference to the storage:
                //  - Alignment does not matter, though is correct here.
                //  - It safely refers to uninitialized storage.
                //
                //  Note: using `&raw const *(&buffer as *const _ as *const $ty)`
                //        is incorrect, it creates a temporary non-raw reference.
                let uninit: &raw const $ty = ::std::mem::transmute(&buffer);
    
                //  Create a Raw reference to the field:
                //  - Alignment does not matter, though is correct here.
                //  - It points within the memory area.
                //  - It safely refers to uninitialized storage.
                let field = &raw const uninit.$field;
    
                //  Compute the difference between pointers.
                (field as *const _ as usize) - (uninit as *const _ as usize)
            }
        }
    }
    

    I have commented each step with the reasons I believe it is sound, and why some alternatives are not -- something I encourage heavily in unsafe code -- and I hope I have not missed anything.
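
    For comparison, here is a sketch of the same idea that compiles on stable Rust, assuming a toolchain that provides `std::ptr::addr_of!` (Rust 1.51 or later). The macro name `raw_offset_of` and the `#[repr(C)]` struct `Foo` with its two fields are hypothetical, chosen only so the example is self-contained.

    ```rust
    use std::mem::MaybeUninit;
    use std::ptr::addr_of;

    // Hypothetical struct; `repr(C)` fixes the field order for the example.
    #[repr(C)]
    #[allow(dead_code)]
    struct Foo {
        member_a: u8,
        member_b: u32,
    }

    macro_rules! raw_offset_of {
        ($ty:ty, $field:ident) => {{
            // Correctly sized and aligned storage that is never initialized or read.
            let buffer = MaybeUninit::<$ty>::uninit();
            let base: *const $ty = buffer.as_ptr();

            // `addr_of!` yields a raw pointer to the field without creating
            // an intermediate reference to uninitialized memory.
            #[allow(unused_unsafe)]
            let field = unsafe { addr_of!((*base).$field) };

            // Difference between the two addresses, in bytes.
            (field as usize) - (base as usize)
        }};
    }

    fn main() {
        println!("member_a at {}", raw_offset_of!(Foo, member_a));
        println!("member_b at {}", raw_offset_of!(Foo, member_b));
    }
    ```

    Newer toolchains (Rust 1.77 and later) also ship `std::mem::offset_of!`, which is likely the better choice when it is available.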