I'm using Intel's C++ compiler (intel-cc) to compile some C++ code, and with the -Wall option its remarks show that it is vectorizing a lot of my loops. For now I'm working under the assumption that this is good for performance.
Now my question is this: if, instead of a for loop, I have unrolled it by hand so that we have, for example,
a[0] = b[0] + 1;
a[1] = b[1] + 1;
a[2] = b[2] + 1;
instead of
for(int i=0;i<3;++i) a[i] = b[i] + 1;
can the compiler still vectorize this code?
Further, if I instead access the elements through references, does the compiler have any hope of recognising that the two are equivalent? E.g.
int &x = a[0], &y = a[1], &z = a[2]; // references must be bound when declared
and then replacing a[0], a[1] and a[2] with x, y and z.
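To make the comparison concrete, here is a minimal sketch of the three variants as separate functions, so that their assembly can be compared side by side. The function names are my own invention, and I'm assuming plain int arrays of at least three elements.

```cpp
// Three hypothetical free functions (names are illustrative only) that do
// the same work, so the generated assembly can be compared side by side.

// Variant A: the original loop.
void add_one_loop(int* a, const int* b) {
    for (int i = 0; i < 3; ++i) a[i] = b[i] + 1;
}

// Variant B: the same loop unrolled by hand.
void add_one_unrolled(int* a, const int* b) {
    a[0] = b[0] + 1;
    a[1] = b[1] + 1;
    a[2] = b[2] + 1;
}

// Variant C: element access through references bound at declaration.
void add_one_refs(int* a, const int* b) {
    int &x = a[0], &y = a[1], &z = a[2];
    x = b[0] + 1;
    y = b[1] + 1;
    z = b[2] + 1;
}
```

Compiling a file like this with -S (emit assembly instead of an object file) should give one labelled block per function, which makes any difference easier to spot.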
Any answers greatly appreciated! Thanks in advance.
So I had a delve into the assembly generated by the three simple cases below:
for(int i=0;i<3;++i) a[i] = 1.0; // case 1
a[0] = a[1] = a[2] = 1.0; // case 2
a.x = a.y = a.z = 1.0; // case 3
The assembly generated for cases 2 and 3 was identical. This is good, since in case 2 the compiler gave a "remark" about copying a reference to a temporary (operator[] is overloaded for my class). I take this to imply (correct me if I'm wrong) that the compiler is correctly applying Return Value Optimisation (RVO).
However, in case 1 the compiler output a remark that it had vectorised the loop. The assembly was also slightly different; specifically, it contained this extra code:
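For anyone wanting to reproduce this, a minimal stand-in for the kind of class I mean might look like the sketch below. Vec3 and the three fill_* functions are hypothetical names, and my real class may differ in how operator[] is written.

```cpp
// Hypothetical stand-in for the class: three named members plus operator[].
struct Vec3 {
    double x, y, z;
    // Returns a reference to the i-th component (no bounds checking;
    // any index other than 0 or 1 maps to z).
    double& operator[](int i) { return i == 0 ? x : (i == 1 ? y : z); }
};

// Case 1: loop over the overloaded operator[].
void fill_loop(Vec3& a) { for (int i = 0; i < 3; ++i) a[i] = 1.0; }

// Case 2: chained assignment through operator[].
void fill_indexed(Vec3& a) { a[0] = a[1] = a[2] = 1.0; }

// Case 3: chained assignment to the named members.
void fill_members(Vec3& a) { a.x = a.y = a.z = 1.0; }
```

All three functions leave every component equal to 1.0; only the way the elements are addressed differs.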
.section .rodata, "a"
.align 16
.align 16
_2il0floatpacket.1:
.long 0x00000000,0x3ff00000,0x00000000,0x3ff00000
.type _2il0floatpacket.1,@object
.size _2il0floatpacket.1,16
_2il0floatpacket.2:
.long 0x00000000,0x3ff00000
.type _2il0floatpacket.2,@object
.size _2il0floatpacket.2,8
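As far as I can tell, the .long values above are just data, and each pair of 32-bit words forms one 64-bit double. Here is a minimal sketch for decoding such a pair, assuming a little-endian x86 target and IEEE-754 doubles; double_from_words is a name I made up.

```cpp
#include <cstdint>
#include <cstring>

// Rebuild a double from two 32-bit words given low word first, which is the
// order they appear in the .long directives on a little-endian target.
// Assumes double is a 64-bit IEEE-754 value.
double double_from_words(std::uint32_t low, std::uint32_t high) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "double must be 64 bits");
    std::uint64_t bits = (static_cast<std::uint64_t>(high) << 32) | low;
    double value;
    std::memcpy(&value, &bits, sizeof value);  // reinterpret the bit pattern
    return value;
}
```

Calling double_from_words(0x00000000, 0x3ff00000) gives 1.0, so the 16-byte packet would hold two copies of 1.0 and the 8-byte packet a single 1.0. This only decodes the data; it does not show how the surrounding code uses it.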
Now, I have never worked with assembly, so I am not entirely sure what this extra data means for the generated code, but it would seem to imply that the compiler cannot vectorize the unrolled version or the version that accesses elements through references. The lack of a vectorization remark at compile time for those cases hints at the same thing.
If anyone could confirm this it would be great.