I am working with Google Maps and trying to find a specified place. Sometimes I get several results for that place (since I search for it by name across the whole city), so I have limited the output to the first three matched suggestions and discarded everything beyond that. The remaining issue is that the first three suggestions may still refer to the same place, and in that case I want to show only one of them so the suggestions stay precise. For example:
The 1st and 2nd suggestions point to the same place.
Now I want to apply a prediction approach (using probability) to determine that the 1st and 2nd suggestions are the same place while the 3rd is a different one.
My approach:
var p = [], ap = []; // p -- places, ap -- arrays of split strings
p[0] = "Pizza Hut, MI Road, Jaipur";
p[1] = "Pizza Hut, MI Road, Ajmeri Gate, Jaipur";
p[2] = "Pizza Hut, Malviya Nagar, Jaipur";
//split all places
ap[0] = p[0].split(",");
ap[1] = p[1].split(",");
ap[2] = p[2].split(",");
/*
--- Theoretically ---
### Splitting strings into symbols ###
string_symbols_1 = a1, a2, a3;     --- a1 = "Pizza Hut", a2 = "MI Road", a3 = "Jaipur"
string_symbols_2 = b1, b2, b3, b4; --- b1 = "Pizza Hut", b2 = "MI Road", b3 = "Ajmeri Gate", b4 = "Jaipur"
string_symbols_3 = c1, c2, c3;     --- c1 = "Pizza Hut", c2 = "Malviya Nagar", c3 = "Jaipur"
### On Prediction Basis ###
I am trying to evaluate whether, if 60% of the symbols of one string
match the symbols of another string, there is a high probability that
both strings refer to the same place.
Conversely, if I can find >40% unique symbols between the two strings
being compared, there is a high probability that the strings refer to
different places. (This reduces the 60% match check to a 40% uniqueness
check in the best cases.)
Once the unique strings are found, return their indexes.
*/
// pseudo implementation
function findUniquePlaces(ap) {
  // stuck here..
  // compare the split string arrays to find the unique places
  // what should the logic be?
  // expected result for the example above: indexes 0 and 2
}
I know how I can implement this, but I want to know the best possible way to do it, and I want to make sure it is not computationally intensive. I have heard of the map-reduce technique. Should I use map-reduce, or is there some other technique that would be computationally cheaper?
The general theory behind this is string metrics (https://en.wikipedia.org/wiki/String_metric): calculate the distance between the strings and then just use one of the ready-made solutions:
https://www.npmjs.com/package/fast-levenshtein
https://www.npmjs.com/package/js-levenshtein
https://www.npmjs.com/package/string-similarity
or look for anything else you can find by searching Google for "npm string distance".
Those are pretty fast, so your issue may be more with size than speed.
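For example, with the string-similarity package the deduplication could look roughly like this (compareTwoStrings returns a 0..1 similarity score; the 0.7 threshold is just an assumption to tune against your data):
// npm install string-similarity
const stringSimilarity = require('string-similarity');

// keep a suggestion only if it is not too similar to one already kept
function dedupePlaces(places, threshold) {
  const kept = [];
  for (const place of places) {
    const isDuplicate = kept.some(
      (other) => stringSimilarity.compareTwoStrings(place, other) >= threshold
    );
    if (!isDuplicate) kept.push(place);
  }
  return kept;
}

dedupePlaces(p, 0.7); // collapses suggestions that are near-duplicates of an earlier one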
Or you can use fuzzy search https://glench.github.io/fuzzyset.js/
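With fuzzyset.js the same idea would look roughly like this (the 0.7 cutoff is again just an assumption):
const FuzzySet = require('fuzzyset.js');

function dedupeWithFuzzySet(places, cutoff) {
  const seen = FuzzySet();
  const kept = [];
  for (const place of places) {
    const matches = seen.get(place); // [[score, value], ...] or null if nothing matches
    if (!matches || matches[0][0] < cutoff) {
      seen.add(place);
      kept.push(place);
    }
  }
  return kept;
}

dedupeWithFuzzySet(p, 0.7);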
Also, it is probable that the list you already get uses these algorithms underneath, so maybe just take the first result?