Search code examples
chashtable

Finding duplicates in an array using C


I came across a problem in this link https://leetcode.com/problems/contains-duplicate/. There is an input array of integers. Find out if there any repeating integers, then return true else false.

  1. How can I optimize this code?

  2. Can I improve the logic further?

In my code below,there is an if condition if(hPtr && hPtr->key == *(nums+i)). I am using the array elements as keys. If so, each key can't be unique if the same element is repeated twice right? So can I modify the if condition as if(hPtr->key == *(nums + i))?

If any other mistakes, please feel free to point out.

There is already a hashtable library available in C http://troydhanson.github.io/uthash/userguide.html and wrote the below code.

struct hash {
        int key;
        int value;
        UT_hash_handle hh;
};

    struct hash *hashtable = NULL;

    void addToHash(int key, int value)
    {
      struct hash *map;
      //I am using the array elements as hash keys
      HASH_FIND_INT(hashtable, &key, map);

      if(map == NULL)
      {
        map = (struct hash*)malloc(sizeof(struct hash));
        map->key = key;
        HASH_ADD_INT(hashtable, key, map);
      }     
      map->value = value;
    }   

    struct hash *findInHash(int key)
    {
        struct hash *h;
        HASH_FIND_INT(hashtable, &key, h);
        return h;
    }

    bool containsDuplicate(int* nums, int numsSize) {
        struct hash *hPtr;
        int target = 0;
        hashtable = NULL;
        if((numsSize <= 1) || (nums == 0)) return false;

        int i, index1 = 0;   

        for(i = 0; i < numsSize; i++)
        {
            /*The below statement will look if the key is already present in 
              the hashtable*/
            hPtr = findInHash(*(nums + i) - target);
            /*If the key is found already, then it look for the value of that 
            key. If the value and the current array element is same, then a 
            duplicate exist*/
            if(hPtr && hPtr->key == *(nums+i))
               return true;
            addToHash(*(nums + i), i);
        }
        struct hash *temp;
        HASH_ITER(hh, hashtable, hPtr, temp) {free(hPtr);}
        return false;
    }

Solution

  • I think the solution is simpler than what you think:

    typedef struct {
      int capacity;
      int len;
      int **keys;
      int *values;
    } Map;
    

    My struct has keys as arrays of two integers, one for the identifier and the other is the index of the values array, that's how hash maps works indeed.

    void initMap(Map *map) {
      map -> capacity = 5;
      map -> len = 0;
      map -> keys = malloc(map -> capacity * sizeof(int));
      map -> values = malloc(map -> capacity * sizeof(int));
    }
    

    Then we have a function to initialize the map, easy...

    void add(Map *map, int k, int v) {
      if (map -> len == map -> capacity - 1) resize(map);
      map -> values[map -> len] = v;
      map -> keys[map -> len] = malloc(sizeof(int) * 2);
      map -> keys[map -> len][0] = k;
      map -> keys[map -> len][1] = map -> len;
      map -> len++;
    }
    

    a function to put elements in the map

    void resize(Map *map) {
      map -> capacity *= 2;
      map -> keys = realloc(map -> keys, map -> capacity * sizeof(int) * 2);
      map -> values = realloc(map -> keys, map -> capacity * sizeof(int));
    }
    

    and a function to resize the map if the limit is reached

    By decoupling the key index and the index on the values array you can have the keys sorted, making your live much easier. the values will be in their array in the same order as they come but the index will be sorted from 0 to N. To do this i'll use a simple selsort algorithm, it's not the best but the easiest...

    void sort(Map *map) {
      int i, j;
      int min, tmp;
      for (i = 0; i < map -> len - 1; i++) {
        min = i;
        for (j = i + 1; j < map -> len; j++) {
          if(map -> keys[j][0] < map -> keys[min][0] ) min = j;
        }
    
        tmp = map -> keys[min][0];
        map -> keys[min][0] = map -> keys[i][0];
        map -> keys[i][0] = tmp;
      }
    }
    

    with this you will have your index shorted. I'd execute it right after adding a new entry to the map, inside the add() function, it's out here now just for testing.

    Once you have your index sorted you can write a binary search algorithm, eeeasy. Now you can now if a key is already in the map.

    int find_recursive(Map *map, int start, int end, int key) {
       if (end >= start) {
            int mid = start + (end - start) / 2;
    
            if (map -> keys[mid][0] == key)
                return mid;
    
            if (map -> keys[mid][0] > key)
                return find_recursive(map, start, mid - 1, key);
    
            return find_recursive(map, mid + 1, end, key);
        }
        return -1;
    }
    
    int find(Map *map, int key) {
        return find_recursive(map, 0, map -> len, key);
    }
    

    Sooo

      Map map;
      initMap(&map);
    
      add(&map, 3, 12);
      add(&map, 12, 1);
      add(&map, 1, 2);
    
      printf("%d %d %d\n",
          map.keys[0][0],
          map.keys[1][0],
          map.keys[2][0]
          );
      // Should print 3 12 1
    
      sort(&map);
    
      printf("%d %d %d\n",
          map.keys[0][0],
          map.keys[1][0],
          map.keys[2][0]
          );
      // Should print 1 3 12
    
      printf("%d\n", find(&map, 12));
      // Should print 2 (the index, 3rd entry)
    

    I can't do a working example now, i cant compile with my machine, maybe later at home... sorry :(

    EDIT: forgot to say... to get the value you have to do map.values[map.keys[find(&map, key)][1]].

    Of course this should be a function:

    int get(Map *map, key) {
      int keyindex = find(map, key);
      int valueindex = map -> keys[index][1];
      return map -> values[valueindex];
    }
    

    Forgot to say that, by decoupling the keys from the values you can use as value any type you want, even whole structures...

    enjoy