
double free or corruption (fasttop) with large files


I'm working on a program that reads large CSV files. I developed and tested it with smaller CSV files for debugging, and it worked. But when I use a real one (17k lines), it fails.

Here is the problematic function (this is the whole file):

#include "split_string.h"

void _add_to_tab(char** ret, char* string, unsigned int len)
{
    ret = realloc(ret, (len + 1)*sizeof(char*));
    ret[len] = string;
}

char** st_split(char* source, const char* delimiter)
{
    unsigned int len = 0;
    char** ret = NULL;
    char* tmp = NULL;

    ret = malloc(1 * sizeof(char**));
    if(ret==NULL)
    {
        return NULL;
    }
    else
    {
        ret[0] = source;
        tmp = strtok(source, delimiter);
        while(tmp!=NULL)
        {
            _add_to_tab(ret, tmp, len);
            len++;
            tmp = strtok(NULL, delimiter);
        }

        return ret;
    }
}

I ran a test and the critical point is the 1202nd line: if my CSV has more lines than that, the program fails with the following error:

*** Error in `test.x': double free or corruption (fasttop): 0x00000000018adfc0 ***

I ran it under Valgrind:

valgrind --leak-check=yes test.x

and it reports this:

==2132== Memcheck, a memory error detector
==2132== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2132== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==2132== Command: test.x
==2132== 
0 3 -1 -1 -1 -1 -1 -1 -1 -1 -1
==2132== Invalid free() / delete / delete[] / realloc()
==2132==    at 0x4C2AF2E: realloc (vg_replace_malloc.c:692)
==2132==    by 0x4011B3: _add_to_tab (split_string.c:5)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132==  Address 0x54e12c0 is 0 bytes inside a block of size 8 free'd
==2132==    at 0x4C2AF2E: realloc (vg_replace_malloc.c:692)
==2132==    by 0x4011B3: _add_to_tab (split_string.c:5)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132== 
==2132== Invalid write of size 8
==2132==    at 0x4011CE: _add_to_tab (split_string.c:6)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132==  Address 0x8 is not stack'd, malloc'd or (recently) free'd
==2132== 
==2132== 
==2132== Process terminating with default action of signal 11 (SIGSEGV)
==2132==  Access not within mapped region at address 0x8
==2132==    at 0x4011CE: _add_to_tab (split_string.c:6)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132==  If you believe this happened as a result of a stack
==2132==  overflow in your program's main thread (unlikely but
==2132==  possible), you can try to increase the size of the
==2132==  main thread stack using the --main-stacksize= flag.
==2132==  The main thread stack size used in this run was 8388608.
==2132== 
==2132== HEAP SUMMARY:
==2132==     in use at exit: 576 bytes in 2 blocks
==2132==   total heap usage: 4 allocs, 2 frees, 600 bytes allocated
==2132== 
==2132== 8 bytes in 1 blocks are definitely lost in loss record 1 of 2
==2132==    at 0x4C2AF2E: realloc (vg_replace_malloc.c:692)
==2132==    by 0x4011B3: _add_to_tab (split_string.c:5)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132== 
==2132== LEAK SUMMARY:
==2132==    definitely lost: 8 bytes in 1 blocks
==2132==    indirectly lost: 0 bytes in 0 blocks
==2132==      possibly lost: 0 bytes in 0 blocks
==2132==    still reachable: 568 bytes in 1 blocks
==2132==         suppressed: 0 bytes in 0 blocks
==2132== Reachable blocks (those to which a pointer was found) are not shown.
==2132== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==2132== 
==2132== For counts of detected and suppressed errors, rerun with: -v
==2132== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)

Solution

  • The problem is here

    void _add_to_tab(char** ret, char* string, unsigned int len)
    {
        ret = realloc(ret, (len + 1)*sizeof(char*));
        ret[len] = string;
    }
    

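    When realloc returns a different pointer, only the local copy of ret inside _add_to_tab is updated. C passes the pointer by value, so the new address never reaches the caller, which keeps working with the previous, now invalid pointer. The next call then hands that freed pointer back to realloc, which matches the "Invalid free() / delete / delete[] / realloc()" error in the Valgrind output.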

    Change it to

    char **_add_to_tab(char** ret, char* string, unsigned int len)
    {
        ret = realloc(ret, (len + 1)*sizeof(char*));
        ret[len] = string;
        return ret;
    }
    

    and call it as

    ret = _add_to_tab(ret, tmp, len);
    

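    And don't forget to add error checking: realloc may return NULL, in which case the original block is still allocated, so assign the result to a temporary pointer before overwriting ret.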
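
As a rough illustration of the fix with the error checking added, here is a minimal sketch. It is not the original poster's final code, and it makes a few assumptions beyond the answer: the helper is renamed to add_to_tab (names starting with an underscore are reserved at file scope in C), the initial malloc is dropped because realloc(NULL, n) behaves like malloc(n), and the array is terminated with a NULL entry so the caller can find its end. That terminator is a convention chosen for this example only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (renamed from _add_to_tab): stores `string` at index
 * `len`, growing the array to len + 1 entries. Returns the possibly moved
 * array, or NULL on failure; on failure the old array is left untouched. */
static char **add_to_tab(char **tab, char *string, size_t len)
{
    char **grown = realloc(tab, (len + 1) * sizeof *grown);
    if (grown == NULL)
        return NULL;
    grown[len] = string;
    return grown;
}

/* Splits `source` in place with strtok and returns a NULL-terminated array
 * of tokens (assumed convention). The tokens point into `source`, so the
 * caller frees only the array itself. Returns NULL if allocation fails. */
char **st_split(char *source, const char *delimiter)
{
    size_t len = 0;
    char **ret = NULL;
    char *tmp = strtok(source, delimiter);

    while (tmp != NULL) {
        /* Keep the result in a temporary so `ret` survives a failure. */
        char **grown = add_to_tab(ret, tmp, len);
        if (grown == NULL) {
            free(ret);
            return NULL;
        }
        ret = grown;
        len++;
        tmp = strtok(NULL, delimiter);
    }

    /* Append the terminating NULL entry. */
    char **grown = add_to_tab(ret, NULL, len);
    if (grown == NULL) {
        free(ret);
        return NULL;
    }
    return grown;
}
```

A caller would check the return value for NULL, loop over the fields until it reaches the NULL entry, and then call free on the returned array once. Growing the array by one element per token keeps the sketch close to the original; doubling the capacity instead would reduce the number of realloc calls on long lines.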