c++ c arm vectorization neon

ARM Neon: How to convert from uint8x16_t to uint8x8x2_t?


I recently discovered the vreinterpret{q}_dsttype_srctype casting operator. However, this doesn't seem to support conversion to the data type described at this link (bottom of the page):

Some intrinsics use an array of vector types of the form:

<type><size>x<number of lanes>x<length of array>_t

These types are treated as ordinary C structures containing a single element named val.

An example structure definition is:

struct int16x4x2_t    
{
    int16x4_t val[2];     
};

Do you know how to convert from uint8x16_t to uint8x8x2_t?

Note that the problem cannot be reliably addressed using a union (reading from an inactive member leads to undefined behaviour. Edit: that's only the case for C++, while it turns out that C allows type punning), nor by casting pointers (which breaks the strict aliasing rule).


Solution

  • Based on your comments, it seems you want to perform a bona fide conversion -- that is, to produce a distinct, new, separate value of a different type. This is a very different thing than a reinterpretation, such as the lead-in to your question suggests you wanted. In particular, you posit variables declared like this:

    uint8x16_t  a;
    uint8x8x2_t b;
    
    // code to set the value of a ...
    

    and you want to know how to set the value of b so that it is in some sense equivalent to the value of a.

    Speaking to the C language:

    The strict aliasing rule (C2011 6.5/7) says,

    An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

    • a type compatible with the effective type of the object, [...]
    • an aggregate or union type that includes one of the aforementioned types among its members [...], or
    • a character type.

    (Emphasis added. Other enumerated options involve differently-qualified and differently-signed versions of the effective type of the object or compatible types; these are not relevant here.)

    Note that these provisions never interfere with accessing a's value, including the member value, via variable a, and similarly for b. But don't overlook the usage of the term "effective type" -- this is where things can get bollixed up under slightly different circumstances. More on that later.

    Using a union

    C certainly permits you to perform a conversion via an intermediate union, or you could rely on b being a union member in the first place so as to remove the "intermediate" part:

    union {
        uint8x16_t  x1;
        uint8x8x2_t x2;
    } temp;
    temp.x1 = a;
    b = temp.x2;
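
The union approach can be wrapped in a small function. This is only a sketch under stated assumptions: `split_via_union` and `split_via_union_preserves_bytes` are hypothetical names, and the `#else` branch defines stand-in struct types (not the real NEON types) purely so the code also compiles where `<arm_neon.h>` is unavailable. It relies on C's union type punning, so it should not be carried over to C++. It only shows that the 16 bytes are preserved; which lanes end up in val[0] and val[1] depends on the target's in-memory vector layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#else
/* ASSUMPTION: stand-ins so the sketch compiles off ARM; not the real NEON types. */
typedef struct { uint8_t lane[16]; } uint8x16_t;
typedef struct { uint8_t lane[8]; }  uint8x8_t;
typedef struct { uint8x8_t val[2]; } uint8x8x2_t;
#endif

/* The pun only makes sense if both types occupy the same number of bytes. */
static_assert(sizeof(uint8x8x2_t) == sizeof(uint8x16_t),
              "uint8x16_t and uint8x8x2_t must have the same size");

/* Hypothetical helper: reinterpret the bytes of a as a uint8x8x2_t (C only). */
static uint8x8x2_t split_via_union(uint8x16_t a)
{
    union {
        uint8x16_t  x1;
        uint8x8x2_t x2;
    } temp;

    temp.x1 = a;      /* store through the uint8x16_t member */
    return temp.x2;   /* read back through the uint8x8x2_t member */
}

/* Usage example: returns 1 if all 16 bytes survive the conversion unchanged. */
static int split_via_union_preserves_bytes(void)
{
    uint8_t in[16], out[16];
    uint8x16_t a;
    uint8x8x2_t b;

    for (int i = 0; i < 16; i++)
        in[i] = (uint8_t)(i * 3);

    memcpy(&a, in, sizeof a);   /* fill a from a plain byte array */
    b = split_via_union(a);
    memcpy(out, &b, sizeof b);  /* inspect b as plain bytes */

    return memcmp(in, out, sizeof in) == 0;
}
```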
    

    Using a typecast pointer (to produce UB)

    However, although it's not so uncommon to see it, C does not permit you to type-pun via a pointer:

    // UNDEFINED BEHAVIOR - strict-aliasing violation
    b = *(uint8x8x2_t *)&a;
    // DON'T DO THAT
    

    There, you are accessing the value of a, whose effective type is uint8x16_t, via an lvalue of type uint8x8x2_t. Note that it is not the cast that is forbidden, nor even, I'd argue, the dereferencing -- it is reading the dereferenced value so as to apply the side effect of the = operator.

    Using memcpy()

    Now, what about memcpy()? This is where it gets interesting. C permits the stored values of a and b to be accessed via lvalues of character type, and although its arguments are declared to have type void *, this is the only plausible interpretation of how memcpy() works. Certainly its description characterizes it as copying characters. There is therefore nothing wrong with performing a

    memcpy(&b, &a, sizeof a);
    

    Having done so, you may freely access the value of b via variable b, as already mentioned. There are aspects of doing so that could be problematic in a more general context, but there's no UB here.
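
As a rough illustration of the memcpy() route when b is an ordinary declared variable, something like the following should work. The function names `split_via_memcpy` and `split_via_memcpy_preserves_bytes` are hypothetical, and the `#else` branch supplies stand-in struct types (an assumption made only so the sketch builds without `<arm_neon.h>`). As with any byte copy, it preserves the bytes and says nothing about which lanes land in which half.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#else
/* ASSUMPTION: stand-ins so the sketch compiles off ARM; not the real NEON types. */
typedef struct { uint8_t lane[16]; } uint8x16_t;
typedef struct { uint8_t lane[8]; }  uint8x8_t;
typedef struct { uint8x8_t val[2]; } uint8x8x2_t;
#endif

/* Copying sizeof a bytes into b is only safe if b is at least that large. */
static_assert(sizeof(uint8x8x2_t) == sizeof(uint8x16_t),
              "uint8x16_t and uint8x8x2_t must have the same size");

/* Hypothetical helper: b has a declared type, so reading it afterwards is fine. */
static uint8x8x2_t split_via_memcpy(uint8x16_t a)
{
    uint8x8x2_t b;

    memcpy(&b, &a, sizeof a);   /* character-wise copy, no aliasing violation */
    return b;
}

/* Usage example: returns 1 if all 16 bytes survive the conversion unchanged. */
static int split_via_memcpy_preserves_bytes(void)
{
    uint8_t in[16], out[16];
    uint8x16_t a;
    uint8x8x2_t b;

    for (int i = 0; i < 16; i++)
        in[i] = (uint8_t)(255 - i);

    memcpy(&a, in, sizeof a);
    b = split_via_memcpy(a);
    memcpy(out, &b, sizeof b);

    return memcmp(in, out, sizeof in) == 0;
}
```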

    However, contrast this with the superficially similar situation in which you want to put the converted value into dynamically-allocated space:

    uint8x8x2_t *c = malloc(sizeof(*c));
    memcpy(c, &a, sizeof a);
    

    What could be wrong with that? Nothing is wrong with it, as far as it goes, but here you have UB if you afterward try to access the value of *c. Why? Because the memory to which c points does not have a declared type, therefore its effective type is the effective type of whatever was last stored in it (if that has an effective type), including if that value was copied into it via memcpy() (C2011 6.5/6). As a result, the object to which c points has effective type uint8x16_t after the copy, whereas the expression *c has type uint8x8x2_t; the strict aliasing rule says that accessing that object via that lvalue produces UB.
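
One way to sidestep that problem, going by the same paragraph of the standard (C2011 6.5/6), is to memcpy() into a declared temporary first and then store into the allocated space through an lvalue of type uint8x8x2_t, since a store through a non-character lvalue gives the allocated object that lvalue's type as its effective type. The sketch below is an illustration under that reading, not a vetted recipe; `split_to_heap` and `split_to_heap_preserves_bytes` are hypothetical names, and the `#else` branch uses stand-in struct types so the code builds without `<arm_neon.h>`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#else
/* ASSUMPTION: stand-ins so the sketch compiles off ARM; not the real NEON types. */
typedef struct { uint8_t lane[16]; } uint8x16_t;
typedef struct { uint8_t lane[8]; }  uint8x8_t;
typedef struct { uint8x8_t val[2]; } uint8x8x2_t;
#endif

static_assert(sizeof(uint8x8x2_t) == sizeof(uint8x16_t),
              "uint8x16_t and uint8x8x2_t must have the same size");

/* Hypothetical helper: returns heap storage holding the bytes of a, or NULL.
 * The caller owns the result and must free() it. */
static uint8x8x2_t *split_to_heap(uint8x16_t a)
{
    uint8x8x2_t tmp;                      /* declared type: uint8x8x2_t */
    uint8x8x2_t *c = malloc(sizeof *c);

    if (c == NULL)
        return NULL;

    memcpy(&tmp, &a, sizeof a);  /* fine: tmp keeps its declared type */
    *c = tmp;                    /* store via a uint8x8x2_t lvalue, so *c now
                                    has effective type uint8x8x2_t */
    return c;
}

/* Usage example: returns 1 if all 16 bytes survive, 0 otherwise. */
static int split_to_heap_preserves_bytes(void)
{
    uint8_t in[16], out[16];
    uint8x16_t a;
    uint8x8x2_t b;
    uint8x8x2_t *c;

    for (int i = 0; i < 16; i++)
        in[i] = (uint8_t)(i + 100);

    memcpy(&a, in, sizeof a);
    c = split_to_heap(a);
    if (c == NULL)
        return 0;

    b = *c;                      /* read through the matching lvalue type */
    free(c);
    memcpy(out, &b, sizeof b);

    return memcmp(in, out, sizeof in) == 0;
}
```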