I have a function which takes an array of read-only strings, makes a copy of it and passes the new one to another function that, depending on some criteria, puts some of the strings in a linked list which is then returned to the initial caller. I need to make a copy of both the array and the strings in it because the original array must be preserved and the strings must be modified in some way before they end up in the list.
The result is always correct but valgrind complains about a memory leak and I've noticed that it happens depending on the criteria for choosing the strings to put in the list, or if the array contains duplicate strings (really, it's a mix of these two factors and I can't wrap my head around why it's behaving like this).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct node {
const char *str;
struct node *next;
} NODE;
typedef NODE *LIST;
LIST bar(char **arr, size_t n)
{
LIST l = NULL;
for (size_t i = 0; i < n; i++) {
/* FOOTNOTE #1 */
if (i == 3 || i == 4) {
NODE *n = malloc(sizeof(NODE));
n->str = arr[i];
n->next = l;
l = n;
}
}
return l;
}
LIST foo(const char **arr, size_t n)
{
char **copy = malloc(sizeof(char *) * n);
for (size_t i = 0; i < n; i++) {
/* each original string is copied into a new one, which is modified in some way before
* being saved in the new array. Here the original strings are saved directly in the
* array because the code that alters them doesn't affect the memory leak
*/
// the malloc at this line is the one that valgrind doesn't like
copy[i] = malloc(strlen(arr[i]) + 1);
strcpy(copy[i], arr[i]);
}
LIST l = bar(copy, n);
// checks which strings haven't been put in the list and frees them
for (size_t i = 0; i < n; i++) {
NODE *n = l;
while (n != NULL && strcmp(copy[i], n->str) != 0) {
n = n->next;
}
if (n == NULL) {
free((char *) copy[i]);
}
}
// frees the array, at this point only the strings in the list should be still allocated
free(copy);
return l;
}
int main(void)
{
/* FOOTNOTE #2 */
const char *arr[] = {"amet", "sit", "dolor", "sit", "amet"};
LIST l = foo(arr, sizeof(arr) / sizeof(arr[0]));
// for every node in the list prints the string in it and frees both the string and the node
while (l != NULL) {
printf("%s", l->str);
if (l->next != NULL) {
printf("%s", ", ");
}
NODE *tmp = l;
l = l->next;
free((char *) tmp->str);
free(tmp);
}
free(l);
puts("");
}
FOOTNOTE #1: this is the criterion for choosing which strings go in the list. If I use i == 3 || i == 4
and the array in FOOTNOTE #2 contains duplicate strings, valgrind finds a leak. If I use i % 2 == 1
it's fine, but if the array contains no duplicates it's also fine with i == 3 || i == 4
. Beats me.
Here's the output of valgrind for the code above (i == 3 || i == 4
with duplicates):
$ gcc prog.c -g -o prog
$ valgrind --tool=memcheck --leak-check=full ./prog
==12118== Memcheck, a memory error detector
==12118== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12118== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==12118== Command: ./prog
==12118==
amet, sit
==12118==
==12118== HEAP SUMMARY:
==12118== in use at exit: 9 bytes in 2 blocks
==12118== total heap usage: 9 allocs, 7 frees, 1,120 bytes allocated
==12118==
==12118== 9 bytes in 2 blocks are definitely lost in loss record 1 of 1
==12118== at 0x483980B: malloc (vg_replace_malloc.c:309)
==12118== by 0x40128E: foo (test2.c:38)
==12118== by 0x4013E1: main (test2.c:66)
==12118==
==12118== LEAK SUMMARY:
==12118== definitely lost: 9 bytes in 2 blocks
==12118== indirectly lost: 0 bytes in 0 blocks
==12118== possibly lost: 0 bytes in 0 blocks
==12118== still reachable: 0 bytes in 0 blocks
==12118== suppressed: 0 bytes in 0 blocks
==12118==
==12118== For lists of detected and suppressed errors, rerun with: -s
==12118== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Found the culprit thanks to the hint provided by Avi Berger. Given that the strings are dynamically allocated, the right way to free
them at the end of foo()
is to check if the address in the array is equal to the address in the list:
while (n != NULL && copy[i] != n->str) {
n = n->next;
}
That's because if a string in the list has a duplicate in the array, a comparison "by value" with strcmp
like the one I did before will leave all the duplicates allocated and that's where the memory leak was coming from.