#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void sort();
int main() {
int i;
for (i = 0; i < 100000; i++) {
sort();
}
}
void sort() {
int i, j, k, array[100], l = 99, m;
for (i = 0; i < 100; i++) {
array[i] = rand() % 1000 + 1;
}
for (k = 0; k < 99; k++) {
for (j = 0; j < l; j++) {
if (array[j + 1] > array[j]) {
int temp = array[j];
array[j] = array[j + 1];
array[j + 1] = temp;
}
}
l--;
}
for (m = 0; m < 100; m++) {
printf("%d ", array[m]);
}
}
On the linux shell, gcc sort -o sort.c
and then time ./sort >> out
.
Here if I do gcc -o2 sort -o sort.c
and similarly o3
and o4
then the time keeps on decreasing. How does the optimization options work? Please explain in terms of all real time, user time and system time.
PS: The code might be a little inefficient. Kindly ignore that.
Optimization options work between the reading of the source code and the writing of the binary instructions to the CPU.
GCC is a multi-phase compiler, where the phases roughly consist of:
Optimizations can impact a number of locations, typically they become active in the above mentioned steps 3 through 5. There are many optimizations, including:
Note that these are not all the optimizations that might be performed, but it starts to give you an idea.
For example, if the compiler sees
int s = 3;
while (s < 6) {
printf("%d\n", s);
s++;
}
and the flags are set to unroll loops, then it might write CPU instructions equivalent to
printf("%d\n", 3);
printf("%d\n", 4);
printf("%d\n", 5);
Those instructions might seem more wordy to us humans, but the CPU commands might be smaller, because there is no need to "lookup" the now-erased value of s
, nor is there the need to add one to it, or store the new updated value back into RAM.
GCC arranges the optimizations into categories, ranging from "safe" to "risky". -O2 is a good compromise between speed and safety. Higher -O numbers are riskier.