The optimizer at levels -o2 and -o3 transforms the loop in Example 2-30(a) to something like the code in Example 2-30(b).
I don't understand why the compiler performs an optimization like that. Is there any difference between counting down and counting up?
Yes.
Most CPUs have internal arithmetic flags that record properties of the result of the last arithmetic operation.
One of these flags is the "zero flag", which is set when the result is zero.
So when the loop counter is decremented from N to 0, a single check of the zero flag after the decrement is enough to tell whether the loop exit condition has been reached.
When the loop counter is incremented from 0 to N, each iteration has to compare the current loop counter with N (which is basically a subtract operation in most cases) and then check the zero flag (to catch the case when the loop counter equals N).
Thus, in case (b) you perform one operation fewer per iteration than in case (a).
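To make the difference concrete, here is a minimal sketch of the two loop shapes in C. The function names `sum_up` and `sum_down` and the array-summing body are my own illustration, not the code from Example 2-30, and the instruction sequences described in the comments are only what a typical target might emit; the actual output depends on the CPU and compiler.

```c
#include <stddef.h>

/* Up-counting loop (hypothetical example).
 * On a typical target each iteration needs: increment i,
 * compare i with n (a subtract that sets the flags), then branch. */
long sum_up(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}

/* Down-counting loop (hypothetical example).
 * The decrement itself sets the zero flag when i reaches 0,
 * so the branch can test the flag without a separate compare. */
long sum_down(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = n; i != 0; i--) {
        sum += a[i - 1];   /* visits a[n-1] down to a[0] */
    }
    return sum;
}
```

Both functions return the same result for integer data; only the direction of the counter differs. With optimization enabled the compiler may well turn the first form into the second on its own, which is what the quoted documentation describes.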