Search code examples
carraysmultidimensional-arrayvariable-length-array

VLA prototype and multidimensional array argument


I created a C99 VLA function as such :

void create_polygon(int n, int faces[][n]);

I want to call this function in another function where I would allocate my two-dimensional array :

void parse_faces()
{
    int faces[3][6];

    create_polygon(6, faces);
}

When I pass a two-dimensional array as an argument, it passes a pointer to a 6 integer array, referencing the stack memory in the calling function.

The VLA argument here only acts as a type declaration (not allocating any actual memory), telling the compiler to access the data in row-major order with ((int*)faces)[i * 6 + j] instead of faces[i][j].

What is the difference between declaring functions with a VLA argument or with a fixed size ?


Solution

  • faces[i][j] always is equivalent to *(*(faces + i) + j), no matter if VLA or not.

    Now let's compare two variants (not considering that you actually need the outer dimension as well to prevent exceeding array bounds on iterating):

    void create_polygon1(int faces[][6]);
    void create_polygon2(int n, int faces[][n]);
    

    It doesn't matter if array passed to originally were created as classic array or as VLA, first function accepts arrays of length of exactly 6, second can accept arbitrary length array (assuming this being clear so far...).

    faces[i][j] will now be translated to:

    *((int*)faces + (i * 6 + j)) // (1)
    *((int*)faces + (i * n + j)) // (2)
    

    Difference yet looks marginal, but might get more obvious on assembler level (assuming all variables are yet stored on stack; assuming sizeof(int) == 4):

    LD     R1, i;
    LD     R2, j;
    MUL    R1, R1, 24; // using a constant! 24: 6 * sizeof(int)!
    MUL    R2, R2, 4;  // sizeof(int)
    ADD    R1, R2, R2; // index stored in R1 register
    
    LD     R1, i;
    LD     R2, j;
    LD     R3, m;      // need to load from stack
    MUL    R3, R3, 4;  // need to multiply with sizeof(int) yet     
    MUL    R1, R1, R3; // can now use m from register R3
    MUL    R2, R2, 4;  // ...
    ADD    R1, R2, R2; // ...
    

    True assembler code might vary, of course, especially if you use a calling convention that allows passing some parameters in registers (then loading n into into R3 might be unnecessary).


    For completeness (added due to comments, unrelated to original question):
    There's yet the int* array[] case: Representation by array of pointers to arrays.

    *((int*)faces + (i * ??? + j))
    

    doesn't work any more, as faces in this case is no contiguous memory (well, the pointers themselves are in contiguous memory, of course, but not all the faces[i][j]). We are forced to do:

    *(*(faces + i) + j)
    

    as we need to dereference the true pointer in the array before we can apply the next index. Assembler code for (for comparison, need a more complete variant of the pointer to 2D-array case first):

    LD     R1, faces;
    LD     R2, i;
    LD     R3, j;
    LD     R4, m;      // or skip, if no VLA
    MUL    R4, R4, 4;  // or skip, if no VLA
    MUL    R2, R2, R3; // constant instead of R3, if no VLA
    MUL    R3, R3, 4;
    ADD    R2, R2, R3; // index stored in R1 register
    ADD    R1, R1, R2; // offset from base pointer
    LD     R1, [R1];   // loading value of faces[i][j] into register
    
    LD     R1, faces;
    LD     R2, i;
    LD     R3, j;
    MUL    R2, R2, 8;  // sizeof(void*) (any pointer)
    MUL    R3, R3, 4;  // sizeof(int)
    ADD    R1, R1, R2; // address of faces[i]
    LD     R1, [R1];   // now need to load address - i. e. de-referencing faces[i]
    ADD    R1, R1, R3; // offset within array
    LD     R1, [R1];   // loading value of faces[i][j] into register