Search code examples
c#javachashlocality-sensitive-hash

Locality Sensitive Hash Implementation?


Are there any relatively simple to understand (and simple to implement) locality-sensitive hash examples in C/C++/Java/C#?

I'd like to learn more about the concept and so want to try an implementation on a few text files just to see how it works, so I don't need anything high-performance or anything... just an example of a hash function that returns similar hashes for similar inputs. I can learn more from it by example afterwards. :)


Solution

  • For strings you can use approximate matching algorithm.

    If the strings are equidistant from a reference string then chances are that they are similar to each other. And there you go you have a locality senitive hash implementation for strings.

    You can create different hash buckets for a range of distances.

    EDIT: You can try other variations of string distance. A simpler algorithm would just return no. of common characters between two strings.