assembly neon arm64 micro-optimization

NEON on AArch64: compare a vector to zero


I want to test all 16 byte elements of a vector in AArch64 NEON and branch if every one of them is zero.

Right now, I have:

uaddlv h1, v0.16b
umov w0, v1.s[0]
cmp w0, #0
beq .exit

I also tried:

uaddlv h1, v0.16b
fcmp s1, #0.0
beq .exit

Is this correct? Is there a better way, ideally with a single instruction?


Solution

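  • This should work:

    umaxv b1, v0.16b // Max across all 16 byte lanes; the result is a byte, so the destination is b1
    umov w0, v1.s[0] // Move to a general-purpose register (the upper bits of v1 were zeroed by umaxv)
    cbz w0, .exit // Branch if the max, and therefore every lane, is zero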
    

    Using intrinsics in C, with vector being a uint8x16_t:

    if (vmaxvq_u8(vector) == 0) { // Is the max value across the 16 bytes zero?
        goto exit; // Jump to a label in the C code
    }
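
A minimal sketch of the same max-reduction check wrapped in a C function, assuming the 16 elements are unsigned bytes (uint8x16_t). The helper name all_zero_u8x16 is made up for illustration, and the scalar fallback is only there so the snippet also builds on non-AArch64 hosts; it is not part of the answer above.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns true when all 16 bytes at p are zero. */
static bool all_zero_u8x16(const uint8_t p[16])
{
#if defined(__aarch64__)
    uint8x16_t v = vld1q_u8(p);   /* load 16 unsigned bytes into a q register */
    return vmaxvq_u8(v) == 0;     /* horizontal max is 0 only if every lane is 0 */
#else
    /* Scalar fallback (an assumption for portability, not NEON code). */
    uint8_t acc = 0;
    for (int i = 0; i < 16; i++)
        acc |= p[i];              /* any set bit in any byte survives the OR */
    return acc == 0;
#endif
}
```

Calling it as `if (all_zero_u8x16(buf)) goto exit;` mirrors the branch in the snippet above. On AArch64 a compiler would typically lower the intrinsic path to a umaxv followed by a move and a compare-and-branch, though the exact output depends on the compiler and flags.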