Search code examples
cpointersmemorymemory-management

C: pointer to array of pointers to structures (allocation/deallocation issues)


I've been getting back into C for something, but I'm having trouble remembering much of how this memory management works. I'd like to have a pointer to an array of pointers to structures.

Say I have:

struct Test {
   int data;
};

Then the array:

struct Test **array1;

Is this correct? My issue is working with this thing. So each pointer in the array points to something that is allocated separately. But I think I need to do this first:

array1 = malloc(MAX * sizeof(struct Test *));

I am having trouble understanding the above. Do I need to do this, and why do I need to do this? In particular, what does it mean to allocate memory for pointers if I am going to be allocating memory for each thing that the pointer points to?

Say now I have pointer to an array of pointers to structures. I now want it to point to the same array that I've created earlier.

struct Test **array2;

Do I need to allocate room for pointers like I did above, or can I just do:

array2 = array1

Solution

  • Allocated Array

    With an allocated array it's straightforward enough to follow.

    Declare your array of pointers. Each element in this array points to a struct Test:

    struct Test *array[50];
    

    Then allocate and assign the pointers to the structures however you want. Using a loop would be simple:

    array[n] = malloc(sizeof(struct Test));
    

    Then declare a pointer to this array:

                                   // an explicit pointer to an array 
    struct Test *(*p)[] = &array;  // of pointers to structs
    

    This allows you to use (*p)[n]->data; to reference the nth member.

    Don't worry if this stuff is confusing. It's probably the most difficult aspect of C.


    Dynamic Linear Array

    If you just want to allocate a block of structs (effectively an array of structs, not pointers to structs), and have a pointer to the block, you can do it more easily:

    struct Test *p = malloc(100 * sizeof(struct Test));  // allocates 100 linear
                                                         // structs
    

    You can then point to this pointer:

    struct Test **pp = &p
    

    You don't have an array of pointers to structs any more, but it simplifies the whole thing considerably.


    Dynamic Array of Dynamically Allocated Structs

    The most flexible, but not often needed. It's very similar to the first example, but requires an extra allocation. I've written a complete program to demonstrate this that should compile fine.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    struct Test {
        int data;
    };
    
    int main(int argc, char **argv)
    {
        srand(time(NULL));
    
        // allocate 100 pointers, effectively an array
        struct Test **t_array = malloc(100 * sizeof(struct Test *));
    
        // allocate 100 structs and have the array point to them
        for (int i = 0; i < 100; i++) {
            t_array[i] = malloc(sizeof(struct Test));
        }
    
        // lets fill each Test.data with a random number!
        for (int i = 0; i < 100; i++) {
            t_array[i]->data = rand() % 100;
        }
    
        // now define a pointer to the array
        struct Test ***p = &t_array;
        printf("p points to an array of pointers.\n"
           "The third element of the array points to a structure,\n"
           "and the data member of that structure is: %d\n", (*p)[2]->data);
    
        return 0;
    }
    

    Output:

    > p points to an array of pointers.
    > The third element of the array points to a structure,
    > and the data member of that structure is: 49
    

    Or the whole set:

    for (int i = 0; i < 100; i++) {
        if (i % 10 == 0)
            printf("\n");
        printf("%3d ", (*p)[i]->data);
    }
    
     35  66  40  24  32  27  39  64  65  26 
     32  30  72  84  85  95  14  25  11  40 
     30  16  47  21  80  57  25  34  47  19 
     56  82  38  96   6  22  76  97  87  93 
     75  19  24  47  55   9  43  69  86   6 
     61  17  23   8  38  55  65  16  90  12 
     87  46  46  25  42   4  48  70  53  35 
     64  29   6  40  76  13   1  71  82  88 
     78  44  57  53   4  47   8  70  63  98 
     34  51  44  33  28  39  37  76   9  91 
    

    Dynamic Pointer Array of Single-Dynamic Allocated Structs

    This last example is rather specific. It is a dynamic array of pointers as we've seen in previous examples, but unlike those, the elements are all allocated in a single allocation. This has its uses, most notable for sorting data in different configurations while leaving the original allocation undisturbed.

    We start by allocating a single block of elements as we do in the most basic single-block allocation:

    struct Test *arr = malloc(N*sizeof(*arr));
    

    Now we allocate a separate block of pointers:

    struct Test **ptrs = malloc(N*sizeof(*ptrs));
    

    We then populate each slot in our pointer list with the address of one of our original array. Since pointer arithmetic allows us to move from element to element address, this is straight-forward:

    for (int i=0;i<N;++i)
        ptrs[i] = arr+i;
    

    At this point the following both refer to the same element field

    arr[1].data = 1;
    ptrs[1]->data = 1;
    

    And after review the above, I hope it is clear why.

    When we're done with the pointer array and the original block array they are freed as:

    free(ptrs);
    free(arr);
    

    Note: we do NOT free each item in the ptrs[] array individually. That is not how they were allocated. They were allocated as a single block (pointed to by arr), and that is how they should be freed.

    So why would someone want to do this? Several reasons.

    First, it radically reduces the number of memory allocation calls. Rather then N+1 (one for the pointer array, N for individual structures) you now have only two: one for the array block, and one for the pointer array. Memory allocations are one of the most expensive operations a program can request, and where possible, it is desirable to minimize them (note: file IO is another, fyi).

    Another reason: Multiple representations of the same base array of data. Suppose you wanted to sort the data both ascending and descending, and have both sorted representations available at the same time. You could duplicate the data array, but that would require a lot of copying and eat significant memory usage. Instead, just allocate an extra pointer array and fill it with addresses from the base array, then sort that pointer array. This has especially significant benefits when the data being sorted is large (perhaps kilobytes, or even larger, per item) The original items remain in their original locations in the base array, but now you have a very efficient mechanism in which you can sort them without having to actually move them. You sort the array of pointers to items; the items don't get moved at all.

    I realize this is an awful lot to take in, but pointer usage is critical to understanding the many powerful things you can do with the C language, so hit the books and keep refreshing your memory. It will come back.