c, built-in, powerpc, altivec

How to obtain a VSX value of zero?


We permute a vector in a few places, and we need the distinguished 0 value to use with the vec_perm built-in. We have not been able to locate a vec_zero() or similar, so we would like to know how we should handle things.

The code currently uses two strategies. The first strategy is a vector load:

__attribute__((aligned(16)))
static const uint8_t z[16] =
    { 0,0,0,0,  0,0,0,0,  0,0,0,0,  0,0,0,0 };

const uint8x16_p8 zero = vec_ld(0, z);

The second strategy is an xor using the mask we intend to use:

__attribute__((aligned(16)))
static const uint8_t m[16] =
    { 15,14,13,12,  11,10,9,8,  7,6,5,4, 3,2,1,0 };

const uint8x16_p8 mask = vec_ld(0, m);
const uint8x16_p8 zero = vec_xor(mask, mask);

We have not benchmarked yet, so we don't know whether one is better than the other. The first strategy uses a VMX load, which could be expensive. The second strategy avoids the load but introduces a data dependency.

How do we obtain a VSX value of zero?


Solution

  • I'd suggest letting the compiler handle it for you. Just initialise to zero:

    const uint8x16_p8 zero = {0};
    

    This will likely compile to an xor.

    For example, a simple test:

    vector char foo(void)
    {
        const vector char zero = {0};
        return zero;
    }
    

    On my machine, this compiles to:

    0000000000000000 <foo>:
       0:   d7 14 42 f0     xxlxor  vs34,vs34,vs34
       4:   20 00 80 4e     blr
        ...
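
The same idea can be wrapped in a small helper and checked at run time. The following is only a sketch: the typedef is an assumption about how `uint8x16_p8` is defined in the question, the names `vector_zero`, `vector_zero_splat` and `is_all_zero` are made up for illustration, and the non-POWER branch exists only so the code also builds on other hosts with GCC or Clang. The `vec_splat_u8(0)` variant is an alternative for anyone who prefers an explicit built-in; which instruction either form produces depends on the compiler and flags.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ALTIVEC__) || defined(__VSX__)
# include <altivec.h>
/* Assumed to match the question's uint8x16_p8 typedef. */
typedef __vector unsigned char uint8x16_p8;
#else
/* Fallback for non-POWER hosts: GCC/Clang generic vector extension. */
typedef uint8_t uint8x16_p8 __attribute__((vector_size(16)));
#endif

/* Returns an all-zero vector; the compiler decides how to materialise it. */
uint8x16_p8 vector_zero(void)
{
    const uint8x16_p8 zero = {0};
    return zero;
}

#if defined(__ALTIVEC__) || defined(__VSX__)
/* Alternative using an explicit built-in: splat the immediate 0 into every byte. */
uint8x16_p8 vector_zero_splat(void)
{
    return vec_splat_u8(0);
}
#endif

/* Hypothetical helper: returns 1 if every byte of v is zero, else 0. */
int is_all_zero(uint8x16_p8 v)
{
    uint8_t bytes[16];
    size_t i;

    memcpy(bytes, &v, sizeof bytes);   /* copy out to inspect the lanes portably */
    for (i = 0; i < sizeof bytes; ++i)
        if (bytes[i] != 0)
            return 0;
    return 1;
}
```

The result of `vector_zero()` can be passed wherever `vec_perm` needs the zero operand. Because the value is a compile-time constant, the compiler is free to rematerialise it with a register xor at each use instead of keeping it live or loading it from memory.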