Search code examples
cperformancememory-access

Which for loop is better?


The below question was asked in Microsoft placement test. I cannot figure out which one will be better. Can somebody help me?

code 1:

int MAX=1000;
int a[MAX][MAX];
for(i=0;i<MAX;i++)
    for(j=0;j<MAX;j++)
        a[j][i]=i*j;

code 2:

int MAX=1000;
int a[MAX][MAX];
for(i=0;i<MAX;i++)
    for(j=0;j<MAX;j++)
        a[i][j]=i*j;

Which is correct?

  1. code 1 is faster
  2. code 2 is faster
  3. both are same in RISC architecture
  4. both are about same

Solution

  • The difference is in how they access memory. Your array is laid out like this:

    row 0 - 1000 integers
    row 1 - 1000 integers
    etc.
    

    Now, your first loop accesses a[0][0], then a[1][0], etc. So it's going to locate row 0, then find column 0, and update it. Then it has to locate row 1, find column 0 in that row, and access it. So you end up accessing memory all over the place--essentially randomly. That's bad for the CPU cache because it has to reload with every memory access.

    Your second loop accesses a[0][0], then a[0][1], then a[0][2], etc. So it locates row 0, then it accesses the columns in sequence. This is good for the processor cache, and will execute faster because it doesn't have to reload as often.