
GCC Auto Vectorization


In the GCC compiler, is there a way to enable only auto vectorization? I know that the -ftree-vectorize flag enables auto vectorization, but it requires at least the -O2 optimization level. Is there a way to enable auto vectorization without using the -O2 optimization flag?

Thanks in advance.


Solution

  • You can actually get decent auto vectorization by combining -ftree-vectorize with -O1; see, for example, this snippet on Godbolt.

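    With -O0, however, vectorized code won't be generated, even for very simple examples. The GCC manual notes that most optimizations are completely disabled at -O0, or when no -O level is set, even if individual optimization flags are specified. I suspect this means the tree vectorizer is either never called at -O0 or is called and bails out, but which of the two happens would have to be verified in the GCC source code.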

    Generally, -O0 and auto vectorization don't mix very well. In compilers, optimizations happen in phases, where each phase prepares the ground for the next one. For auto vectorization to occur, at least on non-trivial examples, the compiler has to perform some optimizations beforehand. For example, loops that contain jumps usually cannot be vectorized unless the branches are first eliminated and replaced with predicated instructions by an optimization called if-conversion. The result is a flat block of code, which can be vectorized more conveniently.

    Footnote - I came across this nice presentation about GCC auto vectorization, which you may find interesting - it gives a good introduction to auto vectorization with gcc, compiler flags and basic concepts.
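To try the points above yourself, here is a minimal sketch with two loops: one with a straight-line body, and one with a branch that has to be if-converted before it can be vectorized. The file name vec_demo.c and the function names scale_add and clamp_negative are made up for illustration. Whether each loop actually gets vectorized depends on your GCC version and target, so treat this as something to experiment with rather than a guaranteed result.

```c
/* vec_demo.c (hypothetical file name): two loops for inspecting what
   GCC's vectorizer does at different optimization levels. */
#include <stddef.h>

/* Straight-line loop body: the easy case for the vectorizer.
   restrict tells the compiler that the arrays do not overlap. */
void scale_add(float *restrict dst, const float *restrict a,
               const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = 2.0f * a[i] + b[i];
}

/* Loop body with a branch: as written it contains a jump, so it is a
   candidate for vectorization only after if-conversion has turned the
   branch into a select on v. */
void clamp_negative(float *restrict dst, const float *restrict src, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        float v = src[i];
        if (v < 0.0f)
            v = 0.0f;
        dst[i] = v;
    }
}
```

Compiling with gcc -O1 -ftree-vectorize -fopt-info-vec -c vec_demo.c asks GCC to report which loops it vectorized, and -fopt-info-vec-missed reports the ones it gave up on. Swapping -O1 for -O0 in the same command is a quick way to compare the two levels.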