When I compile a given file with the -opt-report or -vec-report options in ICC I get, among others, these messages:
foo.c(226:7-226:7):VEC:function_foo: loop was not vectorized: subscript too complex
foo.c(226): (col. 7) warning #13379: loop was not vectorized with "simd"
vectorization support: call to function absorbing_apply cannot be vectorized
loop was not vectorized: not inner loop
loop was not vectorized: unsupported loop structure
loop was not vectorized: subscript too complex
I know the meaning of these messages. What concerns me is that there isn't any loop at all at foo.c:226. In fact, what is there is a call to another function. That function does contain some loops that iterate over a volume, and icc reports that they vectorize properly. However, every call to that function produces the same messages as the ones I pasted.
Is icc getting confused, given that it shows vectorization messages for places with no loops at all? Or have I misunderstood something?
EDIT: I have partially reproduced the issue. This time, the compiler says that it vectorized a line of code that contains only a call to another function (in the original case it is the other way around: it says that it cannot). Here is the code:
 1  #include <stdlib.h>  /* malloc, free */
 2
 3  void foo(float *a, float *b, float *c, int n1, int n2, int n3, int ini3, int end3 ) {
 4      int i, j, k;
 5
 6      for( i = ini3; i < end3; i++ ) {
 7          for( j = 0; j < n2; j++ ) {
 8              #pragma simd
 9              #pragma ivdep
10              for( k = 0; k < 4; k ++ ) {
11                  int index = k + j*n1 + i*n1*n2;
12                  a[index] = b[index] + 2* c[index];
13              }
14          }
15      }
16
17      for( i = ini3; i < end3; i++ ) {
18          for( j = 0; j < n2; j++ ) {
19              #pragma simd
20              #pragma ivdep
21              for( k = n1-4; k < n1; k ++ ) {
22                  int index = k + j*n1 + i*n1*n2;
23                  a[index] = b[index] + 2* c[index];
24              }
25          }
26      }
27
28      return;
29  }
30  int main(void){
31      int n1, n2, n3;
32      int ini3 = 20;
33      int end3 = 30;
34      n1 = n2 = n3 = 200;
35
36      float *a = malloc( n1 * n2 * n3 * sizeof(float ));
37      float *b = malloc( n1 * n2 * n3 * sizeof(float ));
38      float *c = malloc( n1 * n2 * n3 * sizeof(float ));
39
40      foo( a,b,c, n1, n2, n3, ini3, end3 );
41
42      ini3 += 50;
43      end3 += 50;
44
45      foo( a,b,c, n1, n2, n3, ini3, end3 );
46
47      free(a); free(b); free(c);
48
49      return 0;
50  }
51
And here is the part of the optimization report where ICC says it vectorized lines 40 and 45:
foo.c(40:4-40:4):VEC:main: LOOP WAS VECTORIZED
loop was not vectorized: not inner loop
loop was not vectorized: not inner loop
LOOP WAS VECTORIZED
loop was not vectorized: not inner loop
loop was not vectorized: not inner loop
foo.c(45:4-45:4):VEC:main: LOOP WAS VECTORIZED
loop was not vectorized: not inner loop
loop was not vectorized: not inner loop
Is this normal?
In the example you have posted, the call to foo() is being inlined. The loops within foo() are vectorized after it is inlined.
The result is that all of that code is "collapsed" into lines 40 and 45. By the time the vectorizer touches the code, it has no idea that it originally came from a different function.
In the original example where you say that it is not vectorized, the same situation applies. The function call is inlined, but it contains non-vectorizable loops.
Perhaps ICC could have preserved the line information through the function call. But then you would get a duplicate vectorization report each time the function is inlined. Furthermore, they would all point to the same line. That would arguably be even more confusing.
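If you want to check this explanation on your own code, one option is to keep the callee out of line, so that its loops should be reported against its own source lines rather than against each call site. The following is only a minimal sketch: it assumes a compiler that accepts the GCC-style noinline attribute (GCC, Clang, and ICC on Linux do), and the names NOINLINE, scaled_add and run_twice are hypothetical stand-ins, not taken from your program.

```c
#include <stddef.h>

/* Assumption: GCC-compatible compilers define __GNUC__ and accept
   __attribute__((noinline)). Elsewhere the macro expands to nothing,
   so the code still compiles but the call may be inlined. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Hypothetical stand-in for foo(): a[i] = b[i] + 2*c[i].
   Because the function is not inlined, the loop exists only here. */
NOINLINE void scaled_add(float *a, const float *b, const float *c, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        a[i] = b[i] + 2.0f * c[i];
    }
}

/* Hypothetical caller: these two lines contain plain calls and no
   loop body, since scaled_add is kept out of line. */
void run_twice(float *a, const float *b, const float *c, size_t n)
{
    scaled_add(a, b, c, n);
    scaled_add(a, b, c, n);
}
```

Compiling this with the same report options as before and comparing against a build without NOINLINE should show whether the messages move from the call sites to the loop inside the function. Keep in mind that blocking inlining can cost performance, so this is better suited to diagnosing a report than to production builds.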