java mysql algorithm string-comparison knuth-morris-pratt

String comparison in Java, which algorithm should I use?


I need to compare the product name a user searches for against the available products. The product names are stored in a MySQL database. I load all the names into the application (Java) once, when my Java service starts.

Now my string comparison scenario is something like this:

Available product names:
1) Samsung galaxy s2
2) Samsung galaxy s3
3) Samsung galaxy s4

User input 1: galaxy s3 - Here the 2nd product should come first, since it matches two keywords ('galaxy' and 's3'), while the other two match only one ('galaxy').

User input 2: s3 - Here only the 2nd product should be returned, because the other two have no matching keyword.

User input 3: samsung - Here all three products should be returned.

Can anyone suggest which algorithm would be appropriate for this in Java? One more thing: is bringing all the product names from MySQL to the application level (Java) the right way to do it, or can I do this at the MySQL level as well? (PS: I don't want to use a LIKE query on the MySQL side, as it will be very slow.)


Solution

  • Here are some hints for developing a search feature in your project:

    • Use Lucene: either use its API directly, or download the source code and customize the scoring algorithm.
    • Develop a term-weighting or string-similarity algorithm in your own application; it will improve your search accuracy. (You will have to read up on these two concepts, or take a look at the book Information Retrieval, which really helps a lot.)
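    • Use a MySQL SELECT ... FROM ... WHERE field LIKE '%keyword%' query for fuzzy matching, then apply the term-weighting or string-similarity algorithm above to rank the query results. Note that an ordinary index does not help a pattern with a leading wildcard such as '%keyword%'; if speed matters, a FULLTEXT index queried with MATCH ... AGAINST is the usual alternative.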
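
To make the term-weighting hint concrete, here is a minimal sketch of how the ranking from the question could be done in Java on names already loaded from MySQL. It is only one possible approach, not Lucene's scoring: each name is split on whitespace, and every query keyword found in a name adds a smoothed inverse-document-frequency weight, so rare keywords such as 's3' count for more than common ones such as 'galaxy'. The class name ProductSearch, its methods, and the weight formula log(1 + N/df) are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of any library: ranks product names by weighted keyword overlap.
public class ProductSearch {
    private final List<String> names;
    private final List<Set<String>> tokenSets = new ArrayList<>();
    // Number of product names each token appears in (document frequency).
    private final Map<String, Integer> docFreq = new HashMap<>();

    public ProductSearch(List<String> productNames) {
        this.names = new ArrayList<>(productNames);
        for (String name : this.names) {
            Set<String> tokens = tokenize(name);
            tokenSets.add(tokens);
            for (String token : tokens) {
                docFreq.merge(token, 1, Integer::sum);
            }
        }
    }

    // Lower-case the text and split it on whitespace; duplicates collapse into the set.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Returns the names that share at least one keyword with the query, best match first.
    public List<String> search(String query) {
        Set<String> queryTokens = tokenize(query);
        double[] score = new double[names.size()];
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            for (String token : queryTokens) {
                if (tokenSets.get(i).contains(token)) {
                    // Assumed weight: smoothed IDF, so rarer keywords contribute more.
                    score[i] += Math.log(1.0 + (double) names.size() / docFreq.get(token));
                }
            }
            if (score[i] > 0) {
                hits.add(i);
            }
        }
        // List.sort is stable, so names with equal scores keep their original order.
        hits.sort((a, b) -> Double.compare(score[b], score[a]));
        List<String> result = new ArrayList<>();
        for (int index : hits) {
            result.add(names.get(index));
        }
        return result;
    }
}
```

With the three names from the question, search("galaxy s3") puts "Samsung galaxy s3" first and the other two after it, search("s3") returns only "Samsung galaxy s3", and search("samsung") returns all three. This matches whole keywords only; partial matches or typos would need a string-similarity measure on top, or Lucene.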