java fuzzy-comparison

How to calculate a matching score between two strings in Java?


I want to classify two strings as similar or not similar. For example, s1 and s2 below should count as similar, while s3 should not:

s1 = "Token is invalid. DeviceId = deviceId: "345" "
s2 = "Token is invalid. DeviceId = deviceId: "123" "
s3 = "Could not send Message."

I am looking for a Java library that can give a matching score between two strings, so that I can decide from that score whether they are similar or not. My program only needs to work on a small data set (~2000 strings). Is there something already available?


Solution

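  • Use the Levenshtein (edit) distance as the matching score. It counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other, so a small distance relative to the string length means the strings are similar. A Java implementation is available here:

    http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#Java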
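
    To turn the distance into a similar / not similar decision, one option is to normalize it by the length of the longer string and compare against a cutoff. The following is a minimal sketch of that idea, not the code from the linked page. The class name StringSimilarity, the methods similarity and isSimilar, and the 0.8 threshold used in the usage note are all assumptions made for illustration; the threshold in particular would need tuning on real data.

```java
class StringSimilarity {

    /** Levenshtein edit distance, computed with two rolling rows of the DP table. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b is j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Distance from a[0..i) to the empty prefix of b is i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so that prev always holds the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Score in [0, 1]: 1.0 for identical strings, 0.0 when every character must change. */
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        return 1.0 - (double) levenshtein(a, b) / maxLen;
    }

    /** Hypothetical helper: the threshold is an assumption to be tuned by the caller. */
    static boolean isSimilar(String a, String b, double threshold) {
        return similarity(a, b) >= threshold;
    }
}
```

    With the strings from the question, s1 and s2 differ only in the three digits, so their distance is 3 and StringSimilarity.isSimilar(s1, s2, 0.8) returns true, while s1 and s3 score far below that cutoff. The table costs O(n·m) time per pair, so comparing all pairs of ~2000 short strings is about two million comparisons, which should be manageable on a data set of that size.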