
Fill constant floats in AVX intrinsics vec


I am vectorizing code with AVX intrinsics and want to fill a __m256 vector with a constant float such as 1.0, so that one register holds {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0}. Does anyone know how to do this?

This is similar to the question "constant float with SIMD", but I am using AVX, not SSE.


Solution

  • See the AVX intrinsics for load and store operations. You simply need to declare a float array and an AVX vector (__m256), then use the appropriate intrinsic to load the float array into the vector.

    In this case, the intrinsic _mm256_load_ps is what you want.

    Update: As mentioned in the comments, the data must be 32-byte aligned. See Intel's data alignment documentation for a detailed explanation. I've made the solution code cleaner, as per Peter's comments. With optimisation enabled (-O3), this produces the same code as Paul's answer (also with optimisation enabled). Without optimisation, the number of instructions is the same, but all 8 floating-point values are stored, rather than a single floating-point value as in Paul's answer.

    Here is the modified example:

    #include <immintrin.h> // For AVX intrinsics
    
    // 32-byte alignment macro for GCC/Clang and MSVC
    #ifdef __GNUC__
      #define ALIGN(x) x __attribute__((aligned(32)))
    #elif defined(_MSC_VER)
      #define ALIGN(x) __declspec(align(32)) x
    #endif
    
    static constexpr ALIGN(float a[8]) = {1.0f,1.0f,1.0f,1.0f,1.0f,1.0f,1.0f,1.0f};
    
    int main() {
      // Load the 32-byte aligned float array into an AVX vector
      __m256 vect = _mm256_load_ps(a);
      (void)vect; // silence unused-variable warnings
    }
    

    You can easily compare the assembly output of several compilers with the Godbolt Compiler Explorer.
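
For comparison, here is a minimal sketch of three ways to get eight copies of 1.0f into a __m256: an aligned load as in the answer above, an unaligned load with _mm256_loadu_ps for data that is not guaranteed to be 32-byte aligned, and a direct broadcast with _mm256_set1_ps. The helper names, the kOnes array, and the AVX_FN macro are made up for illustration. The macro assumes GCC or Clang, where the target attribute lets a function use AVX even when the file is not built with -mavx; with MSVC, building with /arch:AVX would be the likely equivalent.

```cpp
#include <immintrin.h> // AVX intrinsics

// Assumption: GCC/Clang. Enables AVX per function without requiring -mavx.
#if defined(__GNUC__)
  #define AVX_FN __attribute__((target("avx")))
#else
  #define AVX_FN
#endif

// Hypothetical constant table; alignas(32) satisfies _mm256_load_ps.
alignas(32) static const float kOnes[8] = {1.0f, 1.0f, 1.0f, 1.0f,
                                           1.0f, 1.0f, 1.0f, 1.0f};

// Broadcast a single scalar into all 8 lanes, then write the lanes to out.
AVX_FN void fill_via_broadcast(float value, float* out) {
  __m256 v = _mm256_set1_ps(value);
  _mm256_storeu_ps(out, v); // unaligned store: out needs no special alignment
}

// Aligned load: the source address must be a multiple of 32 bytes.
AVX_FN void fill_via_aligned_load(float* out) {
  __m256 v = _mm256_load_ps(kOnes);
  _mm256_storeu_ps(out, v);
}

// Unaligned load: works for any float pointer with 8 readable elements.
AVX_FN void fill_via_unaligned_load(const float* src, float* out) {
  __m256 v = _mm256_loadu_ps(src);
  _mm256_storeu_ps(out, v);
}
```

Calling fill_via_broadcast(1.0f, out) with an 8-element float buffer should leave every element equal to 1.0f, the same result as the two load-based helpers. The broadcast form avoids declaring an array and has no alignment requirement, which is why it is often the simpler choice for a compile-time constant.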