c++ language-lawyer undefined-behavior pointer-arithmetic

Pointer arithmetic using cast to "wrong" type


I have an array of structs, and I have a pointer to a member of one of those structs. I would like to know which element of the array contains the member. Here are two approaches:

#include <array>
#include <cstddef>   // std::ptrdiff_t
#include <string>

struct xyz
{
    float x, y;
    std::string name;
};

typedef std::array<xyz, 3> triangle;

// return which vertex the given coordinate is part of
int vertex_a(const triangle& tri, const float* coord)
{
    return reinterpret_cast<const xyz*>(coord) - tri.data();
}

int vertex_b(const triangle& tri, const float* coord)
{
    std::ptrdiff_t offset = reinterpret_cast<const char*>(coord) - reinterpret_cast<const char*>(tri.data());
    return offset / sizeof(xyz);
}

Here's a test driver:

#include <iostream>

int main()
{
    triangle tri{{{12.3, 45.6}, {7.89, 0.12}, {34.5, 6.78}}};
    for (const xyz& coord : tri) {
        std::cout
            << vertex_a(tri, &coord.x) << ' '
            << vertex_b(tri, &coord.x) << ' '
            << vertex_a(tri, &coord.y) << ' '
            << vertex_b(tri, &coord.y) << '\n';
    }
}

Both approaches produce the expected results:

0 0 0 0
1 1 1 1
2 2 2 2

But are they valid code?

In particular, I wonder whether vertex_a() invokes undefined behavior by casting a float* that points to y to an xyz*, since the result does not actually point to a struct xyz. That concern led me to write vertex_b(), which I think is safe (is it?).

Here's the code generated by GCC 6.3 with -O3:

vertex_a(std::array<xyz, 3ul> const&, float const*):
    movq    %rsi, %rax
    movabsq $-3689348814741910323, %rsi ; 0xCCC...CD
    subq    %rdi, %rax
    sarq    $3, %rax
    imulq   %rsi, %rax

vertex_b(std::array<xyz, 3ul> const&, float const*):
    subq    %rdi, %rsi
    movabsq $-3689348814741910323, %rdx ; 0xCCC...CD
    movq    %rsi, %rax
    mulq    %rdx
    movq    %rdx, %rax
    shrq    $5, %rax

Solution

  • Neither is valid per the standard.


    In vertex_a, you're allowed to convert a pointer to xyz::x to a pointer to xyz because the two objects are pointer-interconvertible ([basic.compound]), provided xyz is a standard-layout class:

    Two objects a and b are pointer-interconvertible if [...] one is a standard-layout class object and the other is the first non-static data member of that object [...]

    If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast.

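    But the same cast applied to a pointer to xyz::y does not yield a pointer to an xyz: y is not the first member, so the two objects are not pointer-interconvertible. Using the result in the pointer subtraction is undefined.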
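
The first-member rule can be checked at compile time. The following is a minimal sketch, not part of the original answer: owner_of_x is a hypothetical helper name, and the first static_assert encodes the assumption that the library's std::string is standard-layout, which the standard does not guarantee.

```cpp
#include <cstddef>      // offsetof
#include <string>
#include <type_traits>  // std::is_standard_layout

struct xyz
{
    float x, y;
    std::string name;
};

// Assumption: std::string is standard-layout in this library.
// If it is not, the first-member rule does not apply to xyz at all.
static_assert(std::is_standard_layout<xyz>::value,
              "xyz must be standard-layout for the first-member rule");

// x is the first non-static data member, so it shares its address with the xyz.
static_assert(offsetof(xyz, x) == 0, "x is expected at offset 0");

// y is not the first member, so a pointer to y is never a pointer to the xyz.
static_assert(offsetof(xyz, y) != 0, "y is expected at a non-zero offset");

// Hypothetical helper: valid only when px really points to the x member
// of some xyz object.
const xyz* owner_of_x(const float* px)
{
    return reinterpret_cast<const xyz*>(px);
}
```

Passing &v.y to owner_of_x would compile just as well, which is why the restriction has to be kept in the caller's head or enforced some other way.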


    In vertex_b, you're subtracting two pointers to const char. That operation is defined in [expr.add] as:

    If the expressions P and Q point to, respectively, elements x[i] and x[j] of the same array object x, the expression P - Q has the value i − j; otherwise, the behavior is undefined

    Your expressions don't point to elements of an array of char, so the behavior is undefined.
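
One way to sidestep both problems, at the cost of a linear scan, is to avoid pointer subtraction and compare the given pointer for equality against the address of each member. Equality comparison between pointers of the same type is well-defined even when they do not point into the same array. This is a minimal sketch rather than something the original answer proposes: the name vertex_c and the -1 "not found" result are assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>  // std::size_t
#include <string>

struct xyz
{
    float x, y;
    std::string name;
};

typedef std::array<xyz, 3> triangle;

// Hypothetical alternative: return the index of the vertex whose x or y
// member is the object coord points to, or -1 if coord points elsewhere.
int vertex_c(const triangle& tri, const float* coord)
{
    for (std::size_t i = 0; i < tri.size(); ++i) {
        // Only == is used; no pointer arithmetic across member types.
        if (coord == &tri[i].x || coord == &tri[i].y)
            return static_cast<int>(i);
    }
    return -1;
}
```

With the triangle from the test driver, vertex_c(tri, &tri[1].y) returns 1. The loop is O(n) instead of a single subtraction, which may matter for large arrays.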