Search code examples
cfileinitializationhashtable

How Do You Initialize a Hash Table to Store in a File in C


I am trying to implement a fixed-size hash table in C using linear probing for collision resolution. This hash table will be stored in a file for later (and quick) retrieval via the hash values that are generated. I have most of the kinks taken care of, but I seem to be stuck in initializing the table that I will be using. I thought using an array of the type of struct I am registering and then initializing all the values to either NULL or blank "" was the way to go, but now I am not so sure.

I am also unsure as to how the retrieval process would work. I haven't found any useful resources online that delineate this for the C language. When retrieving information for a specific index, do I need to load the whole hash table in memory (this seems impractical) or is there a way to load in the specific index contents (maybe using lseek to locate the right file location)?

I am including only the code associated with the hash table initialization, but please let me know if you need more. Thanks in advance:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

#define HSZ 10

struct client{
char last_name[80];
char maiden_name[80];
char name[80];
}client;

void init_array(){

int fd, i;
int bucket_no = 0;
struct client buff;

   if ((fd = open("output", O_RDWR|O_CREAT,0777))<0){
        perror("Inside init function: ");
        printf("There was a problem initializing the hash table!");
   }
     for (i = 0; i<HSZ; i++){

            strcpy(buff.name,"_");
            strcpy(buff.last_name,"_");
            strcpy(buff.maiden_name,"_");

            lseek(fd, sizeof client * bucket_no, SEEK_SET);
            write(fd, &buff, sizeof client);

            bucket_no++;
        }
close(fd);
}

Solution

  • While mmap() is probably the simplest approach, as long as each bucket is a fixed size like in your example, it's easy to calculate an offset to seek to in the file before reading a given one - or do both steps in one using pread().

    Something like

    int bucket_number = 4; // 0 based like an array
    struct client bucket;
    lseek(hash_fd, sizeof bucket * bucket_number, SEEK_SET);
    read(hash_fd, &bucket, sizeof bucket);
    

    or

    pread(hash_fd, &bucket, sizeof bucket, sizeof bucket * bucket_number);
    

    And for writing buckets to the file, use write() or pwrite() with the same sort of math for individual ones - or write an entire array's worth at once when creating the file.