Search code examples
cpointershashmaphashtable

Initialising a unique pointer in a loop (to be used in a hash table)


For the past several years I have been doing number crunching in Python using dictionaries. I am recapping C after several years, and I am trying to mimic python dictionary functionality using a hash table. My hash table works fine, but I'm struggling with adding new entry's programmatically in a loop. Here is my code:

#define TABLE_SIZE 16

typedef struct item{
    char key[64];
    int value;
    int list[5];
    struct item *next;
} item;

// Create hash table array
item *hash_table[TABLE_SIZE];

void init_hash_table(){
    for (int i = 0; i < TABLE_SIZE; i++)
        hash_table[i] = NULL;
}

int hash(char *key){
    unsigned long H = 5381;
    int c;
    while ((c = *key++))
        H = ((H << 5) + H) + c;
    H = H % TABLE_SIZE;
    int hash = (int)H;
    return hash;
}

int insert_item(item *pItem){
    if (pItem == NULL){
        printf("Invalid item");
        return -1;
    }
    else {
        int index = hash(pItem->key);
        pItem->next = hash_table[index];
        hash_table[index] = pItem;
        return index;
    }
}

void print_table(){
    printf("\n--- Table Start ---\n");
    for (int i = 0; i < TABLE_SIZE; i++){
        if (hash_table[i] == NULL)
            printf("\t %d \t --- \t\n", i);
        else {
            item *tmp = hash_table[i];
            printf("\t %d ", i);
            while(tmp != NULL){
                printf("\t --- \t %s", tmp->key);
                tmp = tmp->next;
            }
            printf("\n");
        }
    }
    printf("--- Table End ---");
}

int main(){

    // Initialise hash table
    init_hash_table();
    char letters[] = "ABCDEFGHIJ";

    // Insert data
    for (int i = 0; i < 10; i++){
        item itemN = {.key = letters[i]};
        printf("%s ", itemN.key);
        printf("%p\n", &itemN);
        insert_item(&itemN);
    }

    // Print the table
    print_table();

    return 0;
} 

Essentially I want 10 entries added to the hash table, each unique with the key as a different letter (A to J). The problem is each itemN in main has the same address in memory, so the same pointer is being overwritten each time. Hence my output looks like:

A 0061FEB0
B 0061FEB0
C 0061FEB0
D 0061FEB0
E 0061FEB0
F 0061FEB0
G 0061FEB0
H 0061FEB0
I 0061FEB0
J 0061FEB0

--- Table Start ---
         0       ---
         1       ---
         2       ---
         3       ---     J
         4       ---
         5       ---
         6       ---     J
         7       ---     J
         8       ---     J
         9       ---     J
         10      ---     J
         11      ---     J
         12      ---     J
         13      ---     J
         14      ---     J
         15      ---
--- Table End ---

The question is, how can I generate a new item variable in the loop with a unique address? This seems like pretty standard functionality to implement, since we cannot always initialise variables manually. Sorry if I am missing something obvious!

Thanks


Solution

  • The basic problem is that your hash table does not actually store items, it stores pointers, and your insert_item function takes a pointer and inserts that pointer directly in the hash table, which means it will only be valid as long as that object is alive. When the lifetime of the object ends, the pointer becomes dangling and your hash table becomes invalid, so pretty much anything after that is undefined behavior.

    Whenever you deal with pointers in C (which is pretty much all the time), you need to be aware of the lifetime of the pointed at objects and make sure that the pointer will not be in use after the lifetime ends. In your case, you are defining your item as a local (auto) within a loop in main, so the lifetime will end as soon as that iteration of the loop ends. So by the time you go to put a second item in the hash table, you have a dangling pointer in the table, and all bets are off.

    The usual way of dealing with this is to have your hash table "have ownership" of the objects in the table, and have your insert_item make a copy of the item (using malloc to allocate some memory for the copy):

    int insert_item(item *pItem){
        if (pItem == NULL){
            printf("Invalid item");
            return -1;
        }
        else {
            int index = hash(pItem->key);
            item *copy = malloc(sizeof(item));
            if (copy == NULL) {
                printf("Ran out of memory\n");
                return -1; }
            *copy = *pItem;
            copy->next = hash_table[index];
            hash_table[index] = copy;
            return index;
        }
    }
    

    This way, all the item objects actually in the hash table will have an extended lifetime -- they will live until you explicitly free them. You want to do that when you are done with the hash table itself, so you might have a destroy_table function to do that.