c++ undefined-behavior

What is undefined behaviour in these pointer conversions?


EDIT: Wow, that is a lot of to-the-point information. It answers the "why" part and solved my immediate problem. I am still struggling with the "how" part, however. Your answers prompted some additional explanation and the bonus questions at the bottom.


I am designing a very minimal offset pointer class.

Its ultimate purpose is to support linked data structures in shared memory, which need to cope with the mapping being placed at different virtual addresses in different processes. The idea is inspired by boost::interprocess::offset_ptr.

My first test produced output that depends on the optimization level, which suggests undefined behaviour. How could it be fixed? For example:

#include <cstddef>
#include <iostream>

template<typename T>
class NullableOffsetPtr
{
public:
    NullableOffsetPtr(T const* ptr) :
        offset_{
            ptr == nullptr ? std::ptrdiff_t{1} : reinterpret_cast<std::byte const*>(ptr) - reinterpret_cast<std::byte const*>(this)}
    {
    }

    T* get()
    {
        return offset_ == 1 ? nullptr : reinterpret_cast<T*>(reinterpret_cast<std::byte*>(this) + offset_);
    }

private:
    std::ptrdiff_t offset_; // Using units of bytes, as alignments of *this and *ptr may not match.
};

class Sample
{
};

int main(int argc, char* argv[])
{
    Sample sample;
    NullableOffsetPtr<Sample> ptr{&sample};

    std::cout << "(ptr.get() == &sample): " << (ptr.get() == &sample) << std::endl;
    std::cout << "ptr.get(): " << ptr.get() << std::endl;
    std::cout << "&sample  : " << &sample << std::endl;

    return 0;
}

At optimization level -O1 and above, executables compiled with gcc print

(ptr.get() == &sample): 0
ptr.get(): 0x7ffe503cd42f
&sample  : 0x7ffe503cd42f

The pointer comparison evaluates to false, even though the individual values (which vary with each run) are printed as identical!

Of course, reinterpret_cast immediately rings the UB bell, but as far as I understand, this usage should be within the limits described under Type aliasing in reinterpret_cast.

See also godbolt example

Bonus question 1: I understand the reasoning behind expr.add#4.2, restricting + and - to pointers targeting elements of the same array, which was explained in several comments. However, the very same could be accomplished by both pointer and pointer plus offset targets being within the same (nested) composition, not necessarily an array. Otherwise the offsetof macro would be pointless. Example:

#include <iostream>

struct Inner
{
    int i1;
    int i2;
};

struct Outer
{
    int i1;
    int i2;
    Inner inner;
};

int main(int argc, char* argv[])
{
    Outer outer;
    std::cout << "&outer.i2 - &outer.i1: " << &outer.i2 - &outer.i1 << std::endl;
    std::cout << "&outer.inner.i2 - &outer.i1: " << &outer.inner.i2 - &outer.i1 << std::endl;

    return 0;
}

returns

&outer.i2 - &outer.i1: 1
&outer.inner.i2 - &outer.i1: 3

All involved pointer targets are within the object outer, but no array is involved.

expr.add#4.2 talks about "(possibly-hypothetical) array element". Would that be met within the mentioned composition, or what else does "possibly-hypothetical" mean?
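For comparison, the same distances can be obtained from offsetof alone, which involves no pointer subtraction at all. This is only a sketch: distanceInInts, outerI1ToI2, and outerI1ToInnerI2 are hypothetical helper names, and the results 1 and 3 assume that the compiler inserts no padding between the int members, which is typical but not guaranteed.

```cpp
#include <cassert>
#include <cstddef>

struct Inner
{
    int i1;
    int i2;
};

struct Outer
{
    int i1;
    int i2;
    Inner inner;
};

// Hypothetical helper: distance between two byte offsets, in units of int.
inline std::ptrdiff_t distanceInInts(std::size_t from, std::size_t to)
{
    auto const bytes = static_cast<std::ptrdiff_t>(to) - static_cast<std::ptrdiff_t>(from);
    // Offsets of int members are expected to be multiples of sizeof(int).
    assert(bytes % static_cast<std::ptrdiff_t>(sizeof(int)) == 0);
    return bytes / static_cast<std::ptrdiff_t>(sizeof(int));
}

// Counterpart of &outer.i2 - &outer.i1, computed from offsets only.
inline std::ptrdiff_t outerI1ToI2()
{
    return distanceInInts(offsetof(Outer, i1), offsetof(Outer, i2));
}

// Counterpart of &outer.inner.i2 - &outer.i1; the nested offset is the sum
// of the offset of inner within Outer and the offset of i2 within Inner.
inline std::ptrdiff_t outerI1ToInnerI2()
{
    return distanceInInts(offsetof(Outer, i1), offsetof(Outer, inner) + offsetof(Inner, i2));
}
```

Both structs are standard-layout, so offsetof is well defined for them. Whether the pointer subtraction itself is permitted is a separate matter, which this sketch does not settle.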

Bonus question 2: To what extent can boost::interprocess::offset_ptr prevent UB? I found the term "illegal but correct" in the context of inlining. However, I understand neither how inlining relates to this, nor whether there is a widely accepted notion of "illegal but correct" that could be verified. Any suggestions?


Solution

  • Sample is an empty class and since this has no observable effect in your code, it can share an address with NullableOffsetPtr. As a result, &sample and ptr.get() could give you the same output when printed.

    However, the pointer subtraction in the constructor of NullableOffsetPtr is undefined behavior:

    When two pointer expressions P and Q are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header ([support.types.layout]).

    • If P and Q both evaluate to null pointer values, the result is 0.
    • Otherwise, if P and Q point to, respectively, array elements i and j of the same array object x, the expression P - Q has the value i−j.
    • Otherwise, the behavior is undefined.
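One possible way around the pointer subtraction is to do the arithmetic on integers instead. The following is a minimal sketch under stated assumptions, not a vetted implementation: IntegerOffsetPtr, kNull, and toInt are hypothetical names, std::uintptr_t is assumed to exist (the standard makes it optional), and the mapping between pointers and integers is implementation-defined. Within a single process, a pointer converted to a sufficiently large integer and back to the same pointer type compares equal to the original. Whether that holds across processes that map the shared memory at different addresses depends on the platform.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical variant of the question's class: the offset is computed on
// std::uintptr_t values, so no unrelated pointers are ever subtracted.
template <typename T>
class IntegerOffsetPtr
{
public:
    explicit IntegerOffsetPtr(T* ptr)
        : offset_{ptr == nullptr ? kNull : toInt(ptr) - toInt(this)}
    {
        // A target exactly one byte past the start of *this would collide
        // with the null sentinel.
        assert(ptr == nullptr || offset_ != kNull);
    }

    // Copies live at a different address, so the offset must be re-based.
    IntegerOffsetPtr(IntegerOffsetPtr const& other) : IntegerOffsetPtr{other.get()} {}

    IntegerOffsetPtr& operator=(IntegerOffsetPtr const& other)
    {
        T* const target = other.get();
        offset_ = target == nullptr ? kNull : toInt(target) - toInt(this);
        return *this;
    }

    T* get() const
    {
        return offset_ == kNull ? nullptr : reinterpret_cast<T*>(toInt(this) + offset_);
    }

private:
    // Offset 1 designates a byte inside *this, so it serves as the null
    // sentinel, as in the question.
    static constexpr std::uintptr_t kNull = 1;

    static std::uintptr_t toInt(void const* p)
    {
        return reinterpret_cast<std::uintptr_t>(p);
    }

    // Unsigned arithmetic wraps around, so negative distances are fine.
    std::uintptr_t offset_;
};
```

With this variant, `IntegerOffsetPtr<Sample> ptr{&sample};` followed by `ptr.get() == &sample` is expected to evaluate to true at any optimization level, because the compiler can no longer reason about the subtraction of unrelated pointers.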