
Is accessing a variable created using malloc inside a loop faster than a locally declared variable?


I have an application that runs in a loop and uses a buffer variable. It copies a string into the buffer, processes it, and then moves on to the next string. I was wondering how I should declare the buffer, so I wrote this code to test which approach is faster. Interestingly, malloc is faster than the variable I declared locally. I also threw in calloc, and it is slower, probably because it zeroes the memory.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <sys/resource.h>

struct rusage ruse;
#define CPU_TIME (getrusage(RUSAGE_SELF,&ruse), ruse.ru_utime.tv_sec +  ruse.ru_stime.tv_sec + 1e-6 *  (ruse.ru_utime.tv_usec + ruse.ru_stime.tv_usec))

void gen_random_letters(char *random_buffer, const int len)
{
    int i;
    static const char alphanum[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz";

    /* initialize random seed: */
    srand (time(NULL));

    for ( i = 0; i < len; ++i) {
        random_buffer[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
    }

    random_buffer[len] = 0;
}

int main()
{
    clock_t tic;
    clock_t toc;
    int counter;
    char *buffer_calloc;
    char *buffer_malloc;
    char buffer_local_variable[1450];
    char copy_this[1450];
    double time_spent;
    double first, second;
    int loop_max;

    loop_max = 5000000;
    gen_random_letters(copy_this, sizeof(copy_this) - 1); /* leave room for the terminating 0 */

    /* Try the locally declared variable */
    tic  = clock();
    first = CPU_TIME;
    for ( counter = 0; counter <= loop_max; counter++ )
    {
        //memset(buffer_local_variable,0,sizeof(buffer_local_variable));
        memcpy(buffer_local_variable,copy_this,sizeof(buffer_local_variable));
    }
    toc = clock();
    second = CPU_TIME;
    time_spent = (toc - tic) / CLOCKS_PER_SEC;
    printf("cpu local_variable : %.2f secs\n", second - first);
    printf("Elapsed local_variable: %f seconds\n\n", time_spent);

    /* Try calloc */
    tic  = clock();
    first = CPU_TIME;
    for ( counter = 0; counter <= loop_max; counter++ ){
        buffer_calloc = calloc(1450,sizeof(char*));
        memcpy(buffer_calloc,copy_this,sizeof(buffer_calloc));
        free(buffer_calloc);
    }
    toc = clock();
    second = CPU_TIME;
    time_spent = (toc - tic) / CLOCKS_PER_SEC;
    printf("cpu calloc  : %.2f secs\n", second - first);
    printf("Elapsed calloc : %f seconds\n\n", time_spent);

    /* And now malloc */
    tic  = clock();
    first = CPU_TIME;
    for ( counter = 0; counter <= loop_max; counter++ ){
        buffer_malloc = malloc(1450 * sizeof(char*));
        memcpy(buffer_malloc,copy_this,sizeof(buffer_malloc));
        free(buffer_malloc);
    }
    toc = clock();
    second = CPU_TIME;
    time_spent = (toc - tic) / CLOCKS_PER_SEC;
    printf("Cpu malloc  : %.2f secs\n", second - first);
    printf("Elapsed malloc : %f seconds\n", time_spent);

    return 0;
}

Result:

cpu local_variable : 0.57 secs
Elapsed local_variable : 0.000000 seconds

cpu calloc  : 2.08 secs
Elapsed calloc : 2.000000 seconds

Cpu malloc  : 0.39 secs
Elapsed malloc : 0.000000 seconds

I was expecting the locally declared variable to be faster, since its memory is already allocated, whereas malloc has to be called on every iteration. Is my code flawed, and is that why malloc appears faster, or is that just the way it is?


Solution

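  • Your code copies the wrong number of bytes in the calloc and malloc cases. buffer_calloc and buffer_malloc are pointers, so sizeof(buffer_malloc) gives you the size of the pointer (typically 8 bytes on a 64-bit system), not the 1450 bytes you intended to copy.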

    Try using 1450 instead of sizeof(...) for those cases.

    Results on my laptop (2015 MacBook) with the above change:

    cpu local_variable : 0.16 secs
    Elapsed local_variable: 0.000000 seconds
    
    cpu calloc  : 1.60 secs
    Elapsed calloc : 1.000000 seconds
    
    Cpu malloc  : 0.56 secs
    Elapsed malloc : 0.000000 seconds
    

    UPDATE

    You're also allocating 1450 * sizeof(char*) bytes with malloc (the calloc call passes sizeof(char*) as well), when you should really be using 1450 * sizeof(char).

    After that fix, the results get a little closer:

    cpu local_variable : 0.16 secs
    Elapsed local_variable: 0.000000 seconds
    
    cpu calloc  : 0.76 secs
    Elapsed calloc : 0.000000 seconds
    
    Cpu malloc  : 0.57 secs
    Elapsed malloc : 0.000000 seconds