I'm writing a hash table and I'm using an opaque pointer to manage this ADT. This is how my code looks like:
hash_table.h
typedef struct hash_table *Hash_table;
Hash_table hash_table_init(int size, int(*compare)(void *key_a, void *key_b), int(*hash)(void *key, int size));
void hash_table_insert(Hash_table ht, void *item);
void* hash_table_search(Hash_table ht, void *key);
void hash_table_start_iteration(Hash_table ht);
void* hash_table_get_next_item(Hash_table ht);
void hash_table_destroy(Hash_table ht);
hash_table.c
#include <stdlib.h>
#include "hash_table.h"
struct hash_table{
void *v; //array of items (created with a malloc)
int n; //array size
int iterator; //iterator to retrive all the items
int (*compare)(void*, void*); //compare function
int (*hash)(void*, int); //hash function
};
Hash_table hash_table_init(int size, int(*compare)(void *key_a, void *key_b), int(*hash)(void *key, int size))
{...}
void hash_table_insert(Hash_table ht, void *item)
{...}
void* hash_table_search(Hash_table ht, void *key)
{...}
void hash_table_start_iteration(Hash_table ht)
{
ht->iterator = 0;
}
void* hash_table_get_next_item(Hash_table ht)
{
if(ht->iterator >= ht->n) return NULL;
return v[ht->iterator++];
}
void hash_table_destroy(Hash_table ht)
{...}
This is the code of the "for each" function that I wrote. It works great but I really don't enjoy it, I think that this isn't an elegant code.
How can I perform this in a better way? Thanks in advance
There are multiple ways of supporting iterations in abstract data types. It depends on how much you want to abstract and how much control you want your users to have.
If your data type supports random access, you could let the user in charge of the iteration (like arrays):
/* size of hash table */
unsigned hash_table_item_count(Hash_table ht) { return ht->n }
/* random access */
void * hash_table_item_at(Hash_table ht, unsigned n) { /* returns nth item */ }
And you use it like this:
int main() {
Hash_table table;
for (unsigned index = 0; index < hash_table_item_count(table); index++) {
printf("%p\n", hash_table_item_at(table, it));
}
return 0;
}
The user of your data type have control of how and when to iterate. This is very easy to use and understand and it doesn't cost you more memory.
A variation of this would be returning a const
pointer to the item array instead of making them go through a function to access it.
You could provide an iterator data type that knows how to iterate over hash tables. This is the C++ most used approach. I tend to like it because you could abstract any kind of iteration logic in it (i.e, only iterate on filled buckets) and have clear responsibility separation:
/* the hash table iterator control structure */
struct ht_iterator {
Hash_table table;
unsigned index;
};
typedef struct ht_iterator * Ht_iterator;
/* returns a iterator pointing to the first item */
Ht_iterator hash_table_begin(Hash_table ht) {
Ht_iterator it = malloc(sizeof(*it));
it->table = ht;
it->index = 0;
return it;
}
/* increments the iterator */
void ht_iterator_next(Ht_iterator it) {
it->index++;
}
/* checks if iterator is at end */
unsigned char ht_iterator_at_end(Ht_iterator it) {
return !(it->index < it->table->n);
}
/* returns the data this iterator is pointing at */
void * ht_iterator_data(Ht_iterator it) {
return ht_iterator_at_end(it) ? NULL : it->table->v[it->index];
}
/* frees iterator memory */
void ht_iterator_release(Ht_iterator it) { free(it); }
And you use it like this:
int main() {
Hash_table t;
for (Ht_iterator it = hash_table_begin(t); !ht_iterator_at_end(it); ht_iterator_next(it)) {
printf("%p\n", ht_iterator_data(it));
}
ht_iterator_release(it);
return 0;
}
It's a lot more verbose but as I said you gain the ability of totally abstracting iteration and still supporting control over when the iteration happens. You no longer have random access, though.
A third way would be to iterate over the items yourself and execute a user callback for each item:
/* typedef the process function */
typedef void (*ht_item_processor)(Hash_table t, unsigned i, void * item, void * priv);
/* iterates over all items, calling process() for each one of them */
void hash_table_traversal(Hash_table table, ht_item_processor process, void * priv) {
for (unsigned i = 0; i < table->n; i++) {
process(table, i, table->v[i], priv);
}
}
And you use it like this:
typedef struct {
/* holds any private state for you */
} my_state;
/* callback to process each item */
void my_process(Hash_table table, unsigned index, void * item, my_state * priv) {
printf("at %d: %p\n", index, item);
}
int main() {
Hash_table table;
my_state state;
table_traversal(table, my_process, &state);
return 0;
}
This way is less verbose and still abstracts iteration logic but users no longer controls the iteration. You could make hash_table_traversal
sensible to process()
return value. If it's zero, it would stop iteration, to give users some control.
The priv
pointer lets the user store state between each process
call, enabling them to use this code with C++ (e.g, priv
would point to a class instance) (but if you're using C++ I'd go with lambdas).
The way you're doing not only you're mixing responsibilities for your data types but you also lose multithread iteration.
I'm not really fond of macros when you can easly creater a solution that is clear both for you and to whomever is using your code, but, in any case, here is a link to a SO question that seems to provide what you wanted with macros.