I realize that the best practice is to use the largest prime number (smaller then the size of the array) in the mod function of the second hash function is best practice.
But my question is regarding the use of numbers that are not prime numbers. I'm not interested in a pseudo-code just the idea behind the concept.
Let's say I have an array m=20, and I have to choose between 6,9,12 and 15 as the values that will be entered in the second hash function. Which of them will give me the best 'spread'?
My first thought is to go for the same idea as choosing a prime number, only slightly modified, which means using the largest number the has the minimum amount of permutations:
6 -> 2,3
9 -> 3,3 = 3
12 -> 2,3,4,6
15 -> 3,5
Right of the bat I can rule 6 (a larger number with the same amount of permutations exists) and 12 (too many permutations) out.
Now the question arises, should I use 9 - has the least amount permutations, or should I choose 15 - although it has more permutations it is much larger the 9 and a lot closer to the size of the array (m=20).
Am I correct in using this approach? or is there a better way of choosing a number, given I can only choose from the numbers stated above?
I have found the answer I was looking for, so I'm leaving the question here with the correct answer in case anyone else ever needs it.
If we are forced to choose a number that is not a prime number as the number to be used in the second hash function (in the mod of that function):
The correct approach is to use the GCD function (Greatest Common Denominator), to find numbers that are "prime with respect to each other". This means that we are looking for any number that its gcd with 20 will result in 1.
In this case:
gcd(20,6)= 2
gcd(20,9)= 1
gcd(20,12)= 3
gcd(20,15)= 5
As we can see, the gcd between 20 and 9 is 1, which means that they have no common factors other than 1. Therefore, 9 is the correct answer.