Search code examples
cconcatenationc-stringsfunction-definition

strncat causes buffer overflow


how can I use strncat with heap objects?

Im trying to write a simple function to concatenate 2 strings together returning the result, however, I can not run it without making the return buffer very large (adding approximately an additional 5000 to its length) for it to not overflow.

I'm probably just using the strncat function incorrectly using heap objects instead of character arrays of fixed length. but I don't know how I would write it any other way.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char *concatStrings(char *arg1, char *arg2) {
    char *ret = malloc(strlen(arg1) + strlen(arg2));
    strncpy(ret, arg1, strlen(arg1));
    strncat(ret, arg2, strlen(arg2));
    ret[strlen(arg1)+strlen(arg2)] = '\0';
    return ret;
}

int main(int argc, char *argv[]) {
    if (argc == 3) {
        char *print = concatStrings(argv[1], argv[2]);
        printf("\n%s", print);
        free(print);
    }
    return 0;
}

Solution

  • For starters the function should be declared like

    char * concatStrings( const char* arg1, const char* arg2 );
    

    because the strings pointed to by the pointers arg1 and arg2 are not being changed within the function.

    In this memory allocation

    char *ret = malloc(strlen(arg1) + strlen(arg2));
    

    you forgot to reserve memory for the null terminating character '\0'. You have to write

    char *ret = malloc( strlen(arg1) + strlen(arg2) + 1 );
    

    Using the magic number 10 in this call

    strncpy(ret,arg1,10);
    

    does not make a sense.

    If instead you will write for example

    strncpy(ret,arg1,strlen(arg1));
    

    then the next call

    strncat(ret,arg2,strlen(arg2));
    

    will invoke undefined behavior because the call strncpy did not append the null terminating character '\0' to the string pointed to by the pointer ret.

    It would be much better just to write at least

    strcpy( ret, arg1 );
    

    In any case your function implementation is inefficient. For example there are two times called the function strlen for the parameter arg2

    char *ret = malloc(strlen(arg1) + strlen(arg2));
    //...
    strncat(ret,arg2,strlen(arg2));
    

    Also the call of strncat is also inefficient because the function needs to traverse the whole target string to find its terminating zero.

    The function can be defined the following way

    char * concatStrings( const char* arg1, const char* arg2 )
    {
        size_t n1 = strlen( arg1 );
        char *ret = malloc( n1 + strlen( arg2 ) + 1 );
    
        if ( ret != NULL )
        {
            strcpy( ret, arg1 );
            strcpy( ret + n1, arg2 );
        }
    
        return ret;
    }
    

    Here is a demonstrative program.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    char * concatStrings( const char* arg1, const char* arg2 )
    {
        size_t n1 = strlen( arg1 );
        char *ret = malloc( n1 + strlen( arg2 ) + 1 );
    
        if ( ret != NULL )
        {
            strcpy( ret, arg1 );
            strcpy( ret + n1, arg2 );
        }
    
        return ret;
    }
    
    int main(void) 
    {
        const char *argv1 = "Hello ";
        const char *argv2 = "World!";
        
        char *print = concatStrings( argv1, argv2 );
        
        if ( print != NULL ) puts( print );
        
        free( print );
        
        return 0;
    }
    

    The program output is

    Hello World!
    

    It would be even better to substitute the first call of strcpy for memcpy within the function. That is the function can also look like

    char * concatStrings( const char* arg1, const char* arg2 )
    {
        size_t n1 = strlen( arg1 );
        char *ret = malloc( n1 + strlen( arg2 ) + 1 );
    
        if ( ret != NULL )
        {
            memcpy( ret, arg1, n1 );
            strcpy( ret + n1, arg2 );
        }
    
        return ret;
    }