I got one problem at my exam for subject Principal of Programming Language. I thought for long time but i still did not understand the problem
Problem: Below is a program C, that is executed in MSVC++ 6.0 environment on a PC with configuration ~ CPU Intel 1.8GHz, Ram 512MB
#define M 10000
#define N 5000
int a[M][N];
void main() {
int i, j;
time_t start, stop;
// Part A
start = time(0);
for (i = 0; i < M; i++)
for (j = 0; j < N; j++)
a[i][j] = 0;
stop = time(0);
printf("%d\n", stop - start);
// Part B
start = time(0);
for (j = 0; j < N; j++)
for (i = 0; i < M; i++)
a[i][j] = 0;
stop = time(0);
printf("%d\n", stop - start);
}
Explain why does part A only execute in 1s, but it took part B 8s to finish?
Row-major order versus column-major order.
Recall first that all multi-dimensional arrays are represented in memory as a continguous block of memory. Thus the multidimensional array A(m,n) might be represented in memory as
a00 a01 a02 ... a0n a10 a11 a12 ... a1n a20 ... amn
In the first loop, you run through this block of memory sequentially. Thus, you run through the array traversing the elements in the following order
a00 a01 a02 ... a0n a10 a11 a12 ... a1n a20 ... amn
1 2 3 n n+1 n+2 n+3 ... 2n 2n+1 mn
In the second loop, you skip around in memory and run through the array traversing the elements in the following order
a00 a10 a20 ... am0 a01 a11 a21 ... am1 a02 ... amn
or, perhaps more clearly,
a00 a01 a02 ... a10 a11 a12 ... a20 ... amn
1 m+1 2m+1 2 m+2 2m+2 3 mn
All that skipping around really hurts you because you don't gain advantages from caching. When you run through the array sequentially, neighboring elements are loaded into the cache. When you skip around through the array, you don't get these benefits and instead keep getting cache misses harming performance.