c++ vectorization intrinsics avx vector-class-library

Compile multi-architecture code using Agner's Vector Class Library


How can I create a library that will dynamically switch between SSE, AVX, and AVX2 code paths depending on the host processor/OS? I am using Agner Fog's VCL (Vector Class Library) and compiling with GCC for Linux.


Solution

  • See the section "Instruction sets and CPU dispatching" in the Vector Class Library manual. In that section Agner writes:

    The file dispatch_example.cpp shows an example of how to make a CPU dispatcher that selects the appropriate code version.

    Read the source code of dispatch_example.cpp. At the start of the file you should see this comment:

    # Compile dispatch_example.cpp five times for different instruction sets:
    | g++ -O3 -msse2    -c dispatch_example.cpp -od2.o
    | g++ -O3 -msse4.1  -c dispatch_example.cpp -od5.o
    | g++ -O3 -mavx     -c dispatch_example.cpp -od7.o
    | g++ -O3 -mavx2    -c dispatch_example.cpp -od8.o
    | g++ -O3 -mavx512f -c dispatch_example.cpp -od9.o
    | g++ -O3 -msse2 -otest instrset_detect.cpp d2.o d5.o d7.o d8.o d9.o
    | ./test
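
To show the pattern those commands build toward, here is a minimal sketch of function-pointer dispatching. It is not the contents of dispatch_example.cpp. The names Version, pick_version, sum_dispatched and the Ns_* namespaces are hypothetical, and the three "versions" share one scalar body so that the sketch compiles as a single file. The level numbers 2, 7 and 8 follow the object-file names above (d2, d7, d8), which match the values VCL's instrset_detect() returns for SSE2, AVX and AVX2.

```cpp
#include <cassert>
#include <cstddef>

// Signature shared by every build of the (hypothetical) kernel.
using SumFn = float (*)(const float*, std::size_t);

// Shared scalar body. In a real build each namespace below would instead hold
// the same VCL source, compiled once per -m flag.
static float sum_scalar(const float* p, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

namespace Ns_SSE2 { float sum(const float* p, std::size_t n) { return sum_scalar(p, n); } }
namespace Ns_AVX  { float sum(const float* p, std::size_t n) { return sum_scalar(p, n); } }
namespace Ns_AVX2 { float sum(const float* p, std::size_t n) { return sum_scalar(p, n); } }

struct Version {
    int         min_level;  // lowest instruction-set level this build requires
    const char* name;       // label, e.g. for logging
    SumFn       fn;         // entry point of this build
};

// Highest requirement first, so the first match is the best usable build.
static const Version kVersions[] = {
    {8, "AVX2", Ns_AVX2::sum},
    {7, "AVX",  Ns_AVX::sum},
    {2, "SSE2", Ns_SSE2::sum},
};

// Returns the best build for the detected level, or nullptr if none fits.
const Version* pick_version(int level) {
    for (const Version& v : kVersions) {
        if (level >= v.min_level) return &v;
    }
    return nullptr;
}

// Entry point the rest of the program would call.
float sum_dispatched(const float* p, std::size_t n, int level) {
    const Version* v = pick_version(level);
    assert(v != nullptr && "CPU is below the lowest compiled level");
    return v->fn(p, n);
}
```

In a real library you would probably resolve the pointer once, on the first call, and cache it instead of searching the table every time.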
    

    The final g++ command also compiles instrset_detect.cpp. You should read the source code of this file as well; it is the one that calls CPUID.
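
For a feel of what the detection step produces, here is a rough sketch that swaps in the GCC/Clang builtins __builtin_cpu_init and __builtin_cpu_supports instead of reproducing VCL's own CPUID code. The function names detect_level and level_name are hypothetical. The return values use the same numbering as VCL's instrset_detect() (2 = SSE2, 5 = SSE4.1, 7 = AVX, 8 = AVX2, 9 = AVX512F), and the sketch returns 0 on non-x86 targets or other compilers.

```cpp
#include <cassert>

// Returns the highest instruction-set level found, or 0 if none is detected.
int detect_level() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // harmless if the runtime has already initialised it
    if (__builtin_cpu_supports("avx512f")) return 9;
    if (__builtin_cpu_supports("avx2"))    return 8;
    if (__builtin_cpu_supports("avx"))     return 7;
    if (__builtin_cpu_supports("sse4.1"))  return 5;
    if (__builtin_cpu_supports("sse2"))    return 2;
#endif
    return 0;  // unknown CPU, other compiler, or non-x86 target
}

// Human-readable label for a level returned by detect_level().
const char* level_name(int level) {
    assert(level >= 0);  // detect_level() never returns a negative value
    switch (level) {
        case 9:  return "AVX512F";
        case 8:  return "AVX2";
        case 7:  return "AVX";
        case 5:  return "SSE4.1";
        case 2:  return "SSE2";
        default: return "baseline";
    }
}
```

Call detect_level() once at startup and keep the result; the answer cannot change while the program runs.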

    Here is a summary of some, but not all, of my questions and answers on CPU dispatchers.