How do I perform a bitwise NOT in SSE/AVX?

Is it my imagination, or is a PNOT instruction missing from SSE and AVX? That is, an instruction which flips every bit in the vector.

If so, is there a better way to emulate it than PXOR with a vector of all 1s? That approach is annoying because I first need to set up a vector of all 1s.

Solution

For cases such as this it can be instructive to see what a compiler would generate.

E.g. for the following function:

#include <immintrin.h>

__m256i test(const __m256i v)
{
  return ~v;  // operator ~ on __m256i relies on the gcc/clang vector extensions
}

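both gcc and clang seem to generate much the same code. Rather than loading an all-1s constant from memory, they compare a register with itself (vpcmpeqd), which sets every bit, and then XOR the input with the result: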

test(long long __vector(4)):
        vpcmpeqd        ymm1, ymm1, ymm1
        vpxor   ymm0, ymm0, ymm1
        ret
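
If you would rather not depend on the compiler's vector `~` operator, the same two-instruction idiom can be written with intrinsics. The following is only a minimal sketch for the 128-bit SSE2 case; the helper name `not_si128` is made up for illustration and is not part of any library. The 256-bit version should look the same with `_mm256_cmpeq_epi32` and `_mm256_xor_si256`, assuming AVX2 is available.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper (name chosen for this example): flips every bit of v. */
static inline __m128i not_si128(__m128i v)
{
    /* Each 32-bit lane compares equal to itself, so every bit of the
       result is set. No constant has to be loaded from memory. */
    const __m128i all_ones = _mm_cmpeq_epi32(v, v);

    /* x XOR 1 == NOT x, applied to all 128 bits at once. */
    return _mm_xor_si128(v, all_ones);
}
```

The all-1s vector here costs one register-only instruction, so setting it up is cheaper than it might first appear, and in a loop the compiler can usually hoist it out.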