Tags: arrays, c, dynamic-memory-allocation, c-strings, calloc

Why does memset fail and calloc succeed?


I'm trying to initialise an array of 26 strings. I'd prefer not to place the array on the heap, but I get a segmentation fault when I try to assign memory to the array using memset. The code to reproduce this is below:

char *string_array[26];
for (int x = 0; x < 26; x++)
    memset(string_array[x], 0, 3 + x); //= calloc(3+x, sizeof(char));

If I change the code so that memory is assigned to the heap, with calloc, I don't get a segmentation fault:

char *string_array[26];
for (int x = 0; x < 26; x++)
    string_array[x] = calloc(3 + x, sizeof(char));

Why is that?


Solution

  • In the snippet using calloc, you create 27 arrays: one array of pointers in automatic storage[1], and 26 arrays of chars on the heap.

    In the snippet using memset, you create the array of pointers, and that's it. You attempt to modify 26 arrays of chars, but you never created them.
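To make the difference concrete, here is a minimal sketch (the function name `memset_on_existing_storage` is made up for illustration). It shows that memset only fills memory that already exists: the pointer has to refer to a real char array before memset is called.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if memset zeroed all 5 bytes, 0 otherwise. */
int memset_on_existing_storage(void) {
    char backing[5];                /* the char array exists (automatic storage) */
    char *p = backing;              /* the pointer now refers to real storage */

    memset(p, 0, sizeof backing);   /* fine: p points at 5 writable bytes */

    for (size_t i = 0; i < sizeof backing; ++i) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

In the question's memset snippet there is no equivalent of `backing`: each `string_array[x]` holds an indeterminate value, so memset writes through a pointer that refers to nothing valid.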

    The following would be equivalent to the calloc snippet, but using only automatic storage.

    char string00[  3 ] = { 0 };
    char string01[  4 ] = { 0 };
    char string02[  5 ] = { 0 };
    char string03[  6 ] = { 0 };
    char string04[  7 ] = { 0 };
    char string05[  8 ] = { 0 };
    char string06[  9 ] = { 0 };
    char string07[ 10 ] = { 0 };
    char string08[ 11 ] = { 0 };
    char string09[ 12 ] = { 0 };
    char string10[ 13 ] = { 0 };
    char string11[ 14 ] = { 0 };
    char string12[ 15 ] = { 0 };
    char string13[ 16 ] = { 0 };
    char string14[ 17 ] = { 0 };
    char string15[ 18 ] = { 0 };
    char string16[ 19 ] = { 0 };
    char string17[ 20 ] = { 0 };
    char string18[ 21 ] = { 0 };
    char string19[ 22 ] = { 0 };
    char string20[ 23 ] = { 0 };
    char string21[ 24 ] = { 0 };
    char string22[ 25 ] = { 0 };
    char string23[ 26 ] = { 0 };
    char string24[ 27 ] = { 0 };
    char string25[ 28 ] = { 0 };
    char *string_array[ 26 ] = {
       string00, string01, string02, string03, string04,
       string05, string06, string07, string08, string09,
       string10, string11, string12, string13, string14,
       string15, string16, string17, string18, string19,
       string20, string21, string22, string23, string24,
       string25,
    };
    

    Now, if we assume there's no alignment restriction for char, we could simplify the above as follows:

    char buffer[ 403 ] = { 0 };  // 3+4+5+...+28 = 403
    char *string_array[ 26 ];
    for ( size_t j=0, i=0; j<26; ++j ) {
       string_array[ j ] = buffer + i;
       i += j + 3;
    }
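As a sanity check on the arithmetic, the same loop can be wrapped in a small function. This is only a sketch: `carve_buffer`, `NUM_STRINGS` and `FIRST_SIZE` are names invented here, not part of the answer above.

```c
#include <stddef.h>

enum { NUM_STRINGS = 26, FIRST_SIZE = 3 };

/* Points slices[j] at consecutive regions of `buffer` with sizes
   3, 4, ..., 28 and returns the total number of bytes handed out. */
size_t carve_buffer(char *buffer, char *slices[NUM_STRINGS]) {
    size_t offset = 0;
    for (size_t j = 0; j < NUM_STRINGS; ++j) {
        slices[j] = buffer + offset;   /* slice j starts where slice j-1 ended */
        offset += j + FIRST_SIZE;      /* slice j is j + 3 bytes long */
    }
    return offset;
}
```

Called with a 403-byte buffer, the function should return 403, with the last slice starting at offset 375 and spanning the final 28 bytes.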
    

    Finally, as @Ted Lyngmo points out, some compilers provide alloca/_alloca to allocate in automatic storage.

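    char *string_array[ 26 ];
    // alloca is non-standard: typically declared in <alloca.h> on Linux (glibc); MSVC spells it _alloca in <malloc.h>.
    for ( size_t j=0; j<26; ++j ) {
       string_array[ j ] = alloca( j + 3 );       // This is what you were missing.
       memset( string_array[ j ], 0, j + 3 );
    }
    // The alloca'd blocks are released when the enclosing function returns, not at the end of each iteration.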
    

    C doesn't have a concept of stack and heap. The C standard doesn't mandate where the objects are stored, only the duration for which they can be accessed. This is called storage duration.
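A rough sketch of the two storage durations discussed in this answer may help. The function names are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Allocated storage duration: the block outlives this call and
   stays valid until the caller passes it to free(). */
char *make_zeroed_string(size_t size) {
    return calloc(size, sizeof(char));   /* NULL if the allocation fails */
}

/* Automatic storage duration: `local` stops existing when the function
   returns, so we return a value computed from it, never its address. */
size_t automatic_length(void) {
    char local[8] = "abc";
    return strlen(local);
}
```

The caller of `make_zeroed_string` owns the returned block and must free it. Nothing needs to be released after `automatic_length`, because its array is gone as soon as the call ends.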

    So the following isn't based on the standard, but is nearly universally true:

    | Objects with automatic storage duration | Objects with allocated storage duration |
    | --- | --- |
    | On the stack, in a register, or optimized away | On the heap |
    | Needs less time/CPU to allocate | Needs more time/CPU to allocate |
    | Needs less space/memory to allocate | Needs more space/memory to allocate |
    | Limited control over lifespan of the allocated block | Lifespan can extend beyond the function that allocated the memory |
    | Uncatchable dire consequences to over-allocating | Over-allocating can be detected and handled |
    | Somewhat limited resource | Vast resource |
    | Always available | Not available on some systems (e.g. many embedded systems) |
    | Can be hard/impossible to allocate complex structures unless the compiler provides non-standard alloca or similar | Simple to allocate complex structures |

    There are two storage durations not covered here: static and thread.
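The point that over-allocating on the heap can be detected and handled could look something like the sketch below. It is the question's calloc loop with a NULL check added; `allocate_strings`, `free_strings` and `STRING_COUNT` are illustrative names, not anything from the original code.

```c
#include <stdlib.h>

enum { STRING_COUNT = 26 };

/* Allocates zero-filled strings of sizes 3, 4, ..., 28.
   Returns 0 on success, or -1 if any allocation fails, in which case
   everything allocated so far has already been freed. */
int allocate_strings(char *string_array[STRING_COUNT]) {
    for (size_t x = 0; x < STRING_COUNT; x++) {
        string_array[x] = calloc(3 + x, sizeof(char));
        if (string_array[x] == NULL) {
            while (x > 0) {
                free(string_array[--x]);   /* undo the earlier allocations */
            }
            return -1;
        }
    }
    return 0;
}

/* Releases every string obtained from a successful allocate_strings call. */
void free_strings(char *string_array[STRING_COUNT]) {
    for (size_t x = 0; x < STRING_COUNT; x++) {
        free(string_array[x]);
    }
}
```

A failed calloc shows up as a NULL return that the program can react to. Running out of automatic storage offers no comparable signal.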


    1. There's no guarantee that a stack is used. But yeah, it's probably going to be a stack.