
vectorization fails with GCC


I am trying to understand vectorization, but to my surprise this very simple code is not being vectorized:

#define n 1024
int main () {
  int i, a[n], b[n], c[n];

  for(i=0; i<n; i++) { a[i] = i; b[i] = i*i; }
  for(i=0; i<n; i++) c[i] = a[i]+b[i];
}

For some reason, the Intel compiler vectorizes only the initialization loop (line 5):

> icc -vec-report a.c
a.c(5): (col. 3) remark: LOOP WAS VECTORIZED

With GCC, I seem to get no vectorization report at all:

> gcc -ftree-vectorize -ftree-vectorizer-verbose=2 a.c

Am I doing something wrong? Shouldn't this be a very simple vectorizable loop? The operations are all the same, the memory is contiguous, and so on. My CPU supports SSE1/2/3/4.

--- update ---

Following the answer below, this example works for me.

#include <stdio.h>
#define n 1024

int main () {
  int i, a[n], b[n], c[n];

  for(i=0; i<n; i++) { a[i] = i; b[i] = i*i; }
  for(i=0; i<n; i++) c[i] = a[i]+b[i];

  printf("%d\n", c[1023]);  
}

With icc

> icc -vec-report a.c
a.c(7): (col. 3) remark: LOOP WAS VECTORIZED
a.c(8): (col. 3) remark: LOOP WAS VECTORIZED

And gcc

> gcc -ftree-vectorize -fopt-info-vec -O a.c
a.c:8:3: note: loop vectorized
a.c:7:3: note: loop vectorized

Solution

  • I've slightly modified your source code to be sure that GCC couldn't remove the loops:

    #include <stdio.h>
    #define n 1024
    
    int main () {
      int i, a[n], b[n], c[n];
    
      for(i=0; i<n; i++) { a[i] = i; b[i] = i*i; }
      for(i=0; i<n; i++) c[i] = a[i]+b[i];
    
      printf("%d\n", c[1023]);  
    }
    

    GCC (v4.8.2) can vectorize both loops, but only when optimization is enabled with an -O flag (without one, -ftree-vectorize on its own has no effect):

    gcc -ftree-vectorize -ftree-vectorizer-verbose=1 -O2 a.c
    

    and I get:

    Analyzing loop at a.c:8

    Vectorizing loop at a.c:8

    a.c:8 note: LOOP VECTORIZED.

    Analyzing loop at a.c:7

    Vectorizing loop at a.c:7

    a.c:7 note: LOOP VECTORIZED.

    a.c: note: vectorized 2 loops in function.

    Using the -fdump-tree-vect switch GCC will dump more information in the a.c.##t.vect file (it's quite useful to get an idea of what is happening "inside").
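
    As a variation on keeping the loops from being removed, here is a minimal sketch that moves them into functions with external linkage. The names fill_arrays and add_arrays are hypothetical and not part of the original code. Because another translation unit could call them, the compiler has to emit the loops even without a printf consuming the result. Whether each loop is then vectorized, and how it is reported, depends on the GCC version and flags.

    ```c
    /* Hypothetical helpers (not from the original post). Both functions have
       external linkage, so the compiler must emit them, loops included,
       even if nothing in this file calls them. */

    /* Same initialization as the first loop in the question. */
    void fill_arrays(int *a, int *b, int len) {
        for (int i = 0; i < len; i++) {
            a[i] = i;
            b[i] = i * i;
        }
    }

    /* Same element-wise addition as the second loop in the question. */
    void add_arrays(const int *a, const int *b, int *c, int len) {
        for (int i = 0; i < len; i++)
            c[i] = a[i] + b[i];
    }
    ```

    Since the file has no main, it can be compiled to an object file by adding -c to the command above, for example gcc -ftree-vectorize -ftree-vectorizer-verbose=1 -O2 -c a.c. Unlike the local arrays in the question, the pointer arguments here might overlap, so the compiler may have to add a runtime check or decline to vectorize.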

    Also consider that: