What is the least number characters (starting from 1st character) which can make a set of strings unique.
For example, set of strings:
{
'january',
'february',
'march',
'april',
'may',
'june',
'july'
}
Now, we cannot use just 1st character since 'j' is both in 'june' as well as 'july' (also, 'm' is in 'march' and 'may'). We also cannot use 1st 2 characters as 'ma' is both in 'march' and 'may'.
But, we can use 1st 3 characters!
What can be an optimal algorithm to return this number (besides an obvious brute force) ?
First thing you can optimize is to do a binary search over the size of prefix needed. The function is monotonous- if a given number of chars n is enough, then any value larger than n will also do.
Second - you may compute rolling hashes for each string so that you can get the hash code for a given prefix of each string in constant time. Thus you will have to verify numbers(hash codes) for uniqueness instead of a string which of course is faster. I suggest a rolling hash like in rabin-karp.