
How do C/C++ compilers distinguish a regular two-dimensional array from an array of pointers to arrays?


A regular statically allocated array looks like this and may be accessed using the following formulas:

const int N = 3;
const int M = 3;

int a1[N][M] = { {0,1,2}, {3,4,5}, {6,7,8} };

int x = a1[1][2]; // x = 5 
int y = *(a1+2+N*1); // y = 5, this is what [] operator is doing in the background

An array is a contiguous region of memory. Dynamic allocation looks different: there is an array of pointers to arrays instead:

int** a2 = new int*[N];
for (int i = 0; i < N; i++) 
   a2[i] = new int[M];

//Assignment of values as in previous example

int x = a2[1][2];
int y = *(*(a2+1)+2); // This is what [] operator is doing in the background, it needs to dereference pointers twice

As we can see, the operations performed by the [] operator are completely different for a typical contiguous array and a dynamically allocated array. My questions are:

  1. Is my understanding of the [] operations correct?
  2. How can a C/C++ compiler tell which [] operation it should perform, and where is that implemented? I can imagine implementing it myself in C++ by overloading the [] operator, but how do C and C++ handle this?
  3. Will it work correctly in C using malloc instead of new? I don't see any reason why not.

Solution

  • For this declaration of an array

    int a1[N][M] = { {0,1,2}, {3,4,5}, {6,7,8} };
    

    these statements

    int x = a1[1][2];
    int y = *(a1+2+N*1); 
    

    are not equivalent.

    The second one is incorrect. The expression *(a1+2+N*1) has the type int[3], which is implicitly converted to the type int * when used as an initializer. So the code tries to initialize the integer variable y with a pointer, which C++ does not allow without a cast.

    The expression a1[1] is evaluated as *( a1 + 1 ). The result is a one-dimensional array of the type int[3].

    So applying the second subscript operator gives *( *( a1 + 1 ) + 2 ).

    The difference between the two-dimensional array and the dynamically allocated array is this: in the expression a1 + 1, the array designator a1 is implicitly converted to a pointer to its first element, of the type int ( * )[3], whereas the pointer a2 to the dynamically allocated array of pointers already has the pointer type int ** and needs no conversion.

    In the first case, dereferencing gives *( a1 + 1 ), an lvalue of the type int[3], which in the expression *( a1 + 1 ) + 2 is again implicitly converted to a pointer of the type int *.

    In the second case the expression *( a2 + 1 ) yields an object of the type int *.

    In both cases pointer arithmetic is used. The difference is that arrays used with the subscript operator are implicitly converted to pointers to their first elements.

    When you allocate arrays dynamically, you are already dealing with pointers to their first elements.

    For example instead of these allocations

    int** a2 = new int*[N];
    for (int i = 0; i < N; i++) 
       a2[i] = new int[M];
    

    you could just write

    int ( *a2 )[M] = new int[N][M];