Why is Valgrind finding errors in this hash table test case?

I took it upon myself to develop a concurrent generic hash table in C.

Relevant contents of hash_table.h:

typedef struct list_node {
    void * data;
    struct list_node * next;
} list_node_t;


typedef struct hash_table {
    int max_size;
    int count;
    list_node_t * * elements;
    pthread_rwlock_t * locks;
    pthread_rwlock_t global_table_lock;
    hash_table_compare_function compare;
    hash_table_hash_function hash;
} hash_table_t;

Relevant contents of hash_table.c:

#define LOCK_RD(lock)   pthread_rwlock_rdlock(&lock);
#define LOCK_WR(lock)   pthread_rwlock_wrlock(&lock);
#define UNLOCK(lock)    pthread_rwlock_unlock(&lock);

bool
hash_table_remove(hash_table_t * table, void * element)
{
    int hash_value = table->hash(element);
    list_node_t * node, * prev;

    LOCK_WR(table->locks[hash_value]);

    node = table->elements[hash_value];
    prev = NULL;

    while (node) {
        if (!table->compare(node->data, element)) {
            // value is first item in the list
            if (node == table->elements[hash_value]) {
                table->elements[hash_value] = node->next;
                free(node);
                UNLOCK(table->locks[hash_value]);
                LOCK_WR(table->global_table_lock);
                table->count--;
                UNLOCK(table->global_table_lock);
                return true;
            } else {
                // link previous node with one after current
                prev->next = node->next;
                free(node);
                UNLOCK(table->locks[hash_value]);
                LOCK_WR(table->global_table_lock);
                table->count--;
                UNLOCK(table->global_table_lock);
                return true;
            }
        }
        prev = node;
        node = node->next;
    }

    UNLOCK(table->locks[hash_value]);

    return false;
}

I wrote a test case which uses strings, in which this is the relevant code:

#include "hashtable.h"

#define NUM_THREADS 2
#define NUM_STRINGS 154560
#define NUM_LOOKUPS 10000


void *
do_work(void * data)
{
    int thread_id = *(int*)data;

    // write "threadX.txt" to filename, where X is the given thread id
    char filename[64];
    strcpy(filename, "thread");
    char thread_id_str[4];
    sprintf(thread_id_str, "%d", thread_id);
    strcat(filename, thread_id_str);
    strcat(filename, ".txt");

    FILE * file = fopen(filename, "r");
    char buffer[128];
    int i, num_str_per_thread = NUM_STRINGS / NUM_THREADS;
    char * str_array[num_str_per_thread];

    for (i = 0; i < num_str_per_thread; i++) {
        fgets(buffer, 128, file);

        str_array[i] = calloc((strlen(buffer) + 1), sizeof(char));
        strcpy(str_array[i], buffer);
    }

    fclose(file);

    for (i = 0; i < num_str_per_thread; i++)
        hash_table_insert(table, str_array[i]);

    for (i = 0; i < NUM_LOOKUPS; i++)
        hash_table_contains(table, str_array[rand() % num_str_per_thread]);

    for (i = 0; i < num_str_per_thread / 2; i++)
        hash_table_remove(table, str_array[rand() % num_str_per_thread]);

    //sleep(2); NOTE: no read errors reported if I leave this sleep() here.

    for (i = 0; i < num_str_per_thread; i++)
        if (str_array[i])
            free(str_array[i]);

    return NULL;
}


void
create_workers()
{
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS];
    int i;

    for (i = 0; i < NUM_THREADS; i++)
        ids[i] = i + 1;

    for (i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, do_work, (void*)&ids[i]);

    for (i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
}

The test case is supposed to work as follows: there are two files, thread1.txt and thread2.txt, each containing unique strings I have generated beforehand. I create two threads, and each will read from a file and store each string on an array of strings called str_array. They will then insert all these strings into the hash table and perform random searches (hash_table_contains) and deletions (hash_table_remove). Then, each will free their respective array of strings. However, when I run this test case, Valgrind reports the following:

Please note that there are no memory leaks. What I get from these errors is that a thread is, upon calling hash_table_remove, attempting to free memory already freed by free(str_array[i]). However, that makes no sense, since hash_table_remove is called before free(str_array[i]. I can't figure out what's giving me these invalid reads.

Thank you in advance!

Solution

Here, your thread removes at most half the strings it inserted:

for (i = 0; i < num_str_per_thread / 2; i++)
    hash_table_remove(table, str_array[rand() % num_str_per_thread]);

(in fact, it is most likely to remove about 39% of the strings it inserted).

Then, it goes on to free all the strings it inserted:

for (i = 0; i < num_str_per_thread; i++)
    if (str_array[i])
        free(str_array[i]);

However, at least half (and most likely ~61%) of those strings are still in the hash table, where the other threads will try to compare them as they scan through the chained hash bucket entries. That's your use-after-free error.

Instead of freeing all the strings, you could free them as you remove them:

for (i = 0; i < num_str_per_thread / 2; i++)
{
    int str_index = rand() % num_str_per_thread;

    if (str_array[str_index])
    {
        hash_table_remove(table, str_array[str_index]);
        free(str_array[str_index]);
        str_array[str_index] = NULL;
    }
}

At this point, the non-NULL entries in str_array[] are the strings still present in the hash table. You can't free them until they're removed from the hash table (or the hash table is no longer in use).

The fact that your test case got this wrong is a good indicator that the ergonomics of your interface are not as good as they could be. You should probably consider a design in which the ownership of the strings inserted is transferred to the hash table, so that hash_table_remove() is itself responsible for freeing the string.