I want to test all 16 byte elements of a vector in 64-bit NEON (AArch64) and branch if they are all equal to zero.
Right now, I have:
uaddlv h1, v0.16b
umov w0, v1.s[0]
cmp w0, #0
beq .exit
I also tried:
uaddlv h1, v0.16b
fcmp s1, #0.0
beq .exit
Is this correct? Is there a way to do better? With one single instruction?
This should work:
umaxv b1, v0.16b // Get max value across vector (a .16b source needs a byte destination)
umov w0, v1.b[0] // Move to general-purpose register
cbz w0, .exit // Branch if equal to zero
Using intrinsics in C (assuming vector is a uint8x16_t):
if (vmaxvq_u8(vector) == 0) { // Is max value zero?
goto exit; // Goto label in C code
}
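For completeness, here is a minimal sketch of the intrinsics version as a self-contained function. The helper name all_bytes_zero and the scalar fallback branch are my own additions for illustration; the fallback exists only so the snippet also builds on non-AArch64 hosts. On AArch64 the check is the single vmaxvq_u8 call, which I would expect a compiler to lower to umaxv.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from the original answer):
 * returns 1 if all 16 bytes at p are zero, 0 otherwise. */
static int all_bytes_zero(const uint8_t p[16])
{
#if defined(__aarch64__)
    uint8x16_t v = vld1q_u8(p);   /* Load 16 bytes into a vector */
    return vmaxvq_u8(v) == 0;     /* Horizontal unsigned max across all lanes */
#else
    /* Scalar fallback so the sketch compiles off AArch64:
     * OR all bytes together and test the result. */
    uint8_t acc = 0;
    for (int i = 0; i < 16; ++i) {
        acc |= p[i];
    }
    return acc == 0;
#endif
}
```

Calling it on a buffer of 16 zero bytes returns 1; setting any single byte to a non-zero value makes it return 0.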