I'm working on a program that reads big CSV files. I developed and tested it with smaller CSV files for debugging, and it worked. But when I use a real one (with 17k lines), it stops working.
Here is the problematic function (this is the whole file):
#include "split_string.h"
void _add_to_tab(char** ret, char* string, unsigned int len)
{
ret = realloc(ret, (len + 1)*sizeof(char*));
ret[len] = string;
}
char** st_split(char* source, const char* delimiter)
{
unsigned int len = 0;
char** ret = NULL;
char* tmp = NULL;
ret = malloc(1 * sizeof(char**));
if(ret==NULL)
{
return NULL;
}
else
{
ret[0] = source;
tmp = strtok(source, delimiter);
while(tmp!=NULL)
{
_add_to_tab(ret, tmp, len);
len++;
tmp = strtok(NULL, delimiter);
}
return ret;
}
}
I ran a test and the critical point is the 1202nd line: if my CSV has more lines than that, the program fails with the following error:
*** Error in `test.x': double free or corruption (fasttop): 0x00000000018adfc0 ***
I ran it under Valgrind:
valgrind --leak-check=yes test.x
and it reports this:
==2132== Memcheck, a memory error detector
==2132== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2132== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==2132== Command: test.x
==2132==
0 3 -1 -1 -1 -1 -1 -1 -1 -1 -1
==2132== Invalid free() / delete / delete[] / realloc()
==2132== at 0x4C2AF2E: realloc (vg_replace_malloc.c:692)
==2132== by 0x4011B3: _add_to_tab (split_string.c:5)
==2132== by 0x40124F: st_split (split_string.c:26)
==2132== by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132== by 0x40134A: main (test.c:11)
==2132== Address 0x54e12c0 is 0 bytes inside a block of size 8 free'd
==2132== at 0x4C2AF2E: realloc (vg_replace_malloc.c:692)
==2132== by 0x4011B3: _add_to_tab (split_string.c:5)
==2132== by 0x40124F: st_split (split_string.c:26)
==2132== by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132== by 0x40134A: main (test.c:11)
==2132==
==2132== Invalid write of size 8
==2132== at 0x4011CE: _add_to_tab (split_string.c:6)
==2132== by 0x40124F: st_split (split_string.c:26)
==2132== by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132== by 0x40134A: main (test.c:11)
==2132== Address 0x8 is not stack'd, malloc'd or (recently) free'd
==2132==
==2132==
==2132== Process terminating with default action of signal 11 (SIGSEGV)
==2132== Access not within mapped region at address 0x8
==2132== at 0x4011CE: _add_to_tab (split_string.c:6)
==2132== by 0x40124F: st_split (split_string.c:26)
==2132== by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132== by 0x40134A: main (test.c:11)
==2132== If you believe this happened as a result of a stack
==2132== overflow in your program's main thread (unlikely but
==2132== possible), you can try to increase the size of the
==2132== main thread stack using the --main-stacksize= flag.
==2132== The main thread stack size used in this run was 8388608.
==2132==
==2132== HEAP SUMMARY:
==2132== in use at exit: 576 bytes in 2 blocks
==2132== total heap usage: 4 allocs, 2 frees, 600 bytes allocated
==2132==
==2132== 8 bytes in 1 blocks are definitely lost in loss record 1 of 2
==2132== at 0x4C2AF2E: realloc (vg_replace_malloc.c:692)
==2132== by 0x4011B3: _add_to_tab (split_string.c:5)
==2132== by 0x40124F: st_split (split_string.c:26)
==2132== by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132== by 0x40134A: main (test.c:11)
==2132==
==2132== LEAK SUMMARY:
==2132== definitely lost: 8 bytes in 1 blocks
==2132== indirectly lost: 0 bytes in 0 blocks
==2132== possibly lost: 0 bytes in 0 blocks
==2132== still reachable: 568 bytes in 1 blocks
==2132== suppressed: 0 bytes in 0 blocks
==2132== Reachable blocks (those to which a pointer was found) are not shown.
==2132== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==2132==
==2132== For counts of detected and suppressed errors, rerun with: -v
==2132== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
The problem is here:
void _add_to_tab(char** ret, char* string, unsigned int len)
{
ret = realloc(ret, (len + 1)*sizeof(char*));
ret[len] = string;
}
C passes pointers by value, so when realloc returns a different pointer, only the local copy of ret inside _add_to_tab is updated.
The new pointer never reaches the caller, which keeps using the old pointer even though realloc has already freed that block.
Change it to
char **_add_to_tab(char** ret, char* string, unsigned int len)
{
ret = realloc(ret, (len + 1)*sizeof(char*));
ret[len] = string;
return ret;
}
and call it as
ret = _add_to_tab(ret, tmp, len);
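And don't forget to add some error checking: realloc might return NULL, and in that case the original block is still allocated, so assign the result to a temporary pointer before overwriting ret.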
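For illustration, here is a minimal sketch of what the error-checked version might look like. It is not the only way to do it, and it makes a few assumptions that are not in the original code: the helper is renamed to add_to_tab (a hypothetical name), st_split takes an extra count parameter so the caller knows how many tokens were found, and the initial malloc is dropped because realloc(NULL, n) behaves like malloc(n).

```c
#include <stdlib.h>
#include <string.h>

/* Append `string` at index `len`, growing the array by one slot.
 * Returns the (possibly moved) array, or NULL if realloc fails.
 * On failure the original array is untouched and still owned by the caller.
 * The name add_to_tab is an assumption, not the original identifier. */
char **add_to_tab(char **tab, char *string, size_t len)
{
    char **grown = realloc(tab, (len + 1) * sizeof *grown);
    if (grown == NULL)
        return NULL;
    grown[len] = string;
    return grown;
}

/* Split `source` in place on `delimiter` and store the number of tokens
 * in *count (the count parameter is an assumption added for this sketch).
 * Returns NULL on allocation failure or when no token was found;
 * the caller frees the returned array with free(). */
char **st_split(char *source, const char *delimiter, size_t *count)
{
    char **ret = NULL;
    size_t len = 0;

    for (char *tok = strtok(source, delimiter); tok != NULL;
         tok = strtok(NULL, delimiter)) {
        /* Keep the result in a temporary so `ret` is not lost on failure. */
        char **grown = add_to_tab(ret, tok, len);
        if (grown == NULL) {
            free(ret);
            *count = 0;
            return NULL;
        }
        ret = grown;
        len++;
    }
    *count = len;
    return ret;
}
```

A caller would pass a writable buffer, for example char line[] = "a,b,c"; followed by parts = st_split(line, ",", &n);, check parts for NULL, and call free(parts) when done. The tokens point into line itself, so they must not be freed individually.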