I am trying to understand vectorization, but to my surprise this very simple code is not being vectorized:
#define n 1024
int main () {
int i, a[n], b[n], c[n];
for(i=0; i<n; i++) { a[i] = i; b[i] = i*i; }
for(i=0; i<n; i++) c[i] = a[i]+b[i];
}
The Intel compiler, for some reason, vectorizes only the initialization loop (line 5):
> icc -vec-report a.c
a.c(5): (col. 3) remark: LOOP WAS VECTORIZED
With GCC, I seem to get no output at all:
> gcc -ftree-vectorize -ftree-vectorizer-verbose=2 a.c
Am I doing something wrong? Shouldn't this be a very simple vectorizable loop? All the same operations, contiguous memory, etc. My CPU supports SSE1/2/3/4.
--- update ---
Following the answer below, this example works for me.
#include <stdio.h>
#define n 1024
int main () {
int i, a[n], b[n], c[n];
for(i=0; i<n; i++) { a[i] = i; b[i] = i*i; }
for(i=0; i<n; i++) c[i] = a[i]+b[i];
printf("%d\n", c[1023]);
}
With icc:
> icc -vec-report a.c
a.c(7): (col. 3) remark: LOOP WAS VECTORIZED
a.c(8): (col. 3) remark: LOOP WAS VECTORIZED
And with gcc:
> gcc -ftree-vectorize -fopt-info-vec -O a.c
a.c:8:3: note: loop vectorized
a.c:7:3: note: loop vectorized
I've slightly modified your source code (it now prints c[1023]) to be sure that GCC cannot remove the loops as dead code:
#include <stdio.h>
#define n 1024
int main () {
int i, a[n], b[n], c[n];
for(i=0; i<n; i++) { a[i] = i; b[i] = i*i; }
for(i=0; i<n; i++) c[i] = a[i]+b[i];
printf("%d\n", c[1023]);
}
GCC (v4.8.2) can vectorize both loops, but it needs an optimization level to be enabled (-O or higher):
gcc -ftree-vectorize -ftree-vectorizer-verbose=1 -O2 a.c
and I get:
Analyzing loop at a.c:8
Vectorizing loop at a.c:8
a.c:8 note: LOOP VECTORIZED.
Analyzing loop at a.c:7
Vectorizing loop at a.c:7
a.c:7 note: LOOP VECTORIZED.
a.c: note: vectorized 2 loops in function.
Using the -fdump-tree-vect switch, GCC will dump more information in the a.c.##t.vect file (it's quite useful to get an idea of what is happening "inside").
Also consider that:
- the -march= switch could be essential to perform vectorization
- -ftree-vectorizer-verbose=n is now being deprecated in favor of -fopt-info-vec and -fopt-info-vec-missed (see http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html)
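As a rough sketch of how -fopt-info-vec and -fopt-info-vec-missed can be tried out, the loops can be moved into functions with external linkage, so the compiler has to keep them even without a printf. The file name kernels.c and the function names below are hypothetical, and running_sum is an extra loop (not in the question) added only as a contrast: it has a loop-carried dependency, so I would expect it to show up in the "missed" report rather than as vectorized, though the exact messages depend on the GCC version and target.

```c
/* kernels.c (hypothetical file name): no main() needed, compile with -c.
 *
 * Example command (add -std=c99 on older GCC, which defaults to gnu89):
 *   gcc -std=c99 -O2 -ftree-vectorize -fopt-info-vec -fopt-info-vec-missed -c kernels.c
 */

/* Initialization loop from the question, moved into a function.
 * `restrict` is an assumption of this sketch: it promises the compiler
 * that the arrays do not overlap. */
void fill_arrays(int *restrict a, int *restrict b, int len) {
    int i;
    for (i = 0; i < len; i++) {
        a[i] = i;
        b[i] = i * i;
    }
}

/* Element-wise addition from the question: every iteration is
 * independent, which is the easy case for the vectorizer. */
void add_arrays(const int *restrict a, const int *restrict b,
                int *restrict c, int len) {
    int i;
    for (i = 0; i < len; i++)
        c[i] = a[i] + b[i];
}

/* Contrast case (not in the question): each iteration reads the value
 * written by the previous one, so the iterations cannot simply be
 * executed in parallel SIMD lanes. */
void running_sum(int *a, int len) {
    int i;
    for (i = 1; i < len; i++)
        a[i] += a[i - 1];
}
```

Because the functions are externally visible, GCC cannot discard the loops as dead code the way it can when a local array such as c is never read.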