java similarity

Method to find similar substrings from two strings


I'm using this piece of Java code to find similar strings:

if (str1.indexOf(str2) >= 0 || str2.indexOf(str1) >= 0) .......

But with str1 = "pizzabase" and str2 = "namedpizzaowl" it doesn't work, because neither string contains the other in full.

How do I find the common substrings, i.e. "pizza"?


Solution

  • If your algorithm says two strings are similar whenever they share a common substring, it will always return true, because the empty string "" is trivially a substring of every string. It also makes more sense to measure the degree of similarity between the strings and return a number rather than a boolean.

    A good algorithm for measuring string (or, more generally, sequence) similarity is the Levenshtein distance: http://en.wikipedia.org/wiki/Levenshtein_distance. A smaller distance means the strings are more alike.
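
To make both ideas concrete, here is a minimal sketch in Java. It is not taken from the answer above: the class name SimilaritySketch and both method names are hypothetical, chosen only for illustration. The first method finds the longest common substring with a standard dynamic-programming table, which is one way to recover "pizza" from the question. The second computes the Levenshtein distance, giving a number instead of a boolean.

```java
public class SimilaritySketch {

    // Hypothetical helper: longest common substring via dynamic programming.
    // curr[j] is the length of the longest common suffix of
    // a[0..i) and b[0..j); the largest value seen marks the answer.
    public static String longestCommonSubstring(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int bestLen = 0;
        int bestEnd = 0; // exclusive end index of the best match in a
        for (int i = 1; i <= a.length(); i++) {
            int[] curr = new int[b.length() + 1];
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    curr[j] = prev[j - 1] + 1;
                    if (curr[j] > bestLen) {
                        bestLen = curr[j];
                        bestEnd = i;
                    }
                }
            }
            prev = curr;
        }
        return a.substring(bestEnd - bestLen, bestEnd);
    }

    // Hypothetical helper: Levenshtein distance, i.e. the minimum number of
    // single-character insertions, deletions and substitutions needed to
    // turn a into b. Only two rows of the table are kept in memory.
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] curr = new int[b.length() + 1];
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            prev = curr;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // prints "pizza"
        System.out.println(longestCommonSubstring("pizzabase", "namedpizzaowl"));
        // prints 3
        System.out.println(levenshtein("kitten", "sitting"));
    }
}
```

If a yes/no answer is still needed, one option is to compare the length of the longest common substring, or the Levenshtein distance, against a threshold. The threshold is a judgment call that depends on the data.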