Search code examples
arrayscmallocnodeshashtable

Array of Strings only saving the most recent input


I am currently working on a hashtable program, but that isn't the problem, I have a insert function, but when I run it, the insert function only saves the most recent string read from the text.txt file, if the line is like "Finn 34" it has to insert Finn into the hash table with the value 34, if the hashed index already has something in it, it just reports a collision, however if it is the same String such as "Finn 98" it should report that Finn is already in that index. The problem is that every string points to the original name from the fscanf, I'm almost certain that the way to fix it is with malloc, but every-time I try to use it, it won't work.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TableSize 45
#define Max 50
///// dont forget the edit in main

typedef struct node {
    char name[Max];
    int value;
} node;
node array[Max];


void init_array(){

    for (int i = 0; i < TableSize; i++){
      //array[i] = NULL;
    }
} 


void insert(int index, node *p){

  /*char * tempName = malloc (sizeof (char) * 50);
  strcpy(tempName, p->name);*/

  if (array[index].value != 0){
    //if (array[index]->name == p->name){
    if (strcmp(array[index].name, p->name) == 0){
      printf("Error %s already exists at index %d\n\n", array[index].name, index);
    }
    else{
      printf("Collision occured at index %d with\n\n", index);
    }
  }
  else{
    array[index] = *p;
    strcpy(array[index].name, p->name);
    printf("Stored %s with value of %d at index %d.\n\n", array[index].name, array[index].value, index);
  }

} 

int hash(char name[Max]){
  int key = 0;
  for (int i = 0; name[i] != '\0'; ++i){
    char x = name[i];
    key = key + x;
  }
  key = key % TableSize;
  return key;
}

int main(int argc, char *argv[]) {
  init_array();
  FILE *fp;
  char ch;
  char name[Max];
  int x, HaValue;


  fp = fopen(argv[1], "r");
  if (NULL == fp) {
        //printf("file can't be opened \n");
        fp = fopen("text.txt", "r"); ///////get rid of when done
    }
  node * p;  
  p = malloc(sizeof(struct node));
 do{
  x = 0;

  int counter = 0;
  if (fscanf(fp, "%49s", name) != 1) break;
  if (fscanf(fp, "%d", &x) != 1) counter = 1;
  HaValue = hash(name);
  p->value = x;
  strcpy(p->name, name);
  if (counter != 0) {
  }
  else{
    insert(HaValue, p);
  }
  
  } while(!feof(fp));

  printf("%d %s %d", 12, array[12].name, array[12].value);
  //test to see if the name was correctly saved. should be "12 Dog 12"

  
    // Closing the file
    fclose(fp);
  
  return 0;
}

This is the text.txt document

Brom 89
Paul 25
Jake 34
Yokai 45
Jake
Dog 20
Paul 30
Brom
Kron 40
Finn 234

Solution

  • I can see an array declared with static storage duration (array), as a fixed number of node pointers. You later declare the node to be pointed to with automatic storage duration, which is where your error occurs, as the nodes get effectively destroyed and then trampled "on the stack" when the functions return.

    I'd suggest that you change your array to be an array of node, instead.

    typedef struct node {
        char name[Max];
        int value;
    } node;
    node array[Max];
    

    You don't need to initialise this array when it's declared with static storage duration (outside of functions) as you have; it starts off zeroed. You also don't need NULL checks like array[index] != NULL to check if it's been assigned to, you can just skip straight to using strcmp to verify the result or just check the first character to see if it's nonzero...

    array[index] = p;
    

    You can also declare p to be struct node rather than struct node *, and this assignment will make sense, or you could change that to: array[index] = *p; for example.

    As far as malloc goes, knowing how to use it is good, but memory allocation is a lesson I'd prefer to keep separate from data structures & algorithms, and I reckon this is an data structures & algorithms puzzle. On that note, it's best to avoid unnecessary calls to malloc where a more suitable storage duration may be chosen, and this kinda design is suitable for some cases (just as malloc and realloc will become later when you develop more complex hashtables) so it's perfectly okay in my mind to keep it simple.

    I see another issue, which is that some of those lines of input don't have associated decimal digit values, and your error handling code for fscanf is incomplete and somewhat misguided. Let's cover those examples I noticed:

    if (fscanf(fp, "%s", name) == 3){}
    

    I don't know what you expect to happen when fscanf succeeds (or fails, for that matter, or the whys and hows, as we'll cover shortly) but assuming it only writes to the one variable (name, in this case) it should return 1. I prefer to handle errors in indented form, so the program flows relatively unindented along the success path, so I'd write something like:

    if (fscanf(fp, "%49s", name) != 1) break;
    

    I assume you can safely break from the loop here because if you can't read a word then fp is at end of file or in some state of error.

    if (fscanf(fp, "%d", &x) == 3){}
    

    Now we come to the hairy parts of fscanf, see... We discussed one mode of failure already (the feof/error bits) but not the other: the match failures. When you try to read a decimal digit integer sequence but there is none, as explained above, the return value will be positive, but not the same as the number of fields you expect to be assigned to. In this case, expect 1, but fscanf returns 0, so:

    switch (fscanf(fd, "%d", &x)) {
        case 0: // fscanf didn't assign to x, which means match failure occured...
                x = 0; // in this case I'll just zero x so we're not using stale data
        case 1: break; // ... this is where control flows when x is assigned to successfully
        default: if (!feof(fp)) {
                fputs("This is not good, because it means the stream is in state of error\n", stderr);
                exit(EXIT_FAILURE);
                 }
    }