oracle, full-text-search, fuzzy-search

What does the Fuzzy operator do?


I'm using Oracle 11g.

I've tried the Oracle Text functionality and so far I love it. I now need to support searching on misspelled words. I found that CONTAINS supports the Fuzzy operator, which looks like exactly what I need!

But I'd like to know more about what's going on inside the operator. What does it do? Does it use edit distance or Jaro-Winkler distance for similarity scoring? I specifically want Jaro-Winkler distance to be used. The documentation says the operator is only supported for some languages, and my language is not on the list. Can I still use the Fuzzy operator, or will it simply do nothing (a no-op)?

If I can't use the Fuzzy operator for my problem, what are the alternatives? I don't completely understand NDATA, but I don't think it applies to my problem. I've found the UTL_MATCH.JARO_WINKLER_SIMILARITY function, which is exactly what I need, but how can I use it within the search (in the CONTAINS function)?


Solution

  • My understanding is that Oracle's fuzzy search uses its own "black box" function to score the similarity between two keywords, and users have little control over that function.

    If you want to implement fuzzy search using edit distance, consider writing a user-defined function (UDF). Here's an example (for MySQL): http://flamingo.ics.uci.edu/toolkit/
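
To make the edit-distance idea concrete, here is a minimal sketch in Python of the kind of logic such a UDF would carry. It is not Oracle's internal algorithm and not the code from the linked toolkit; the names `edit_distance`, `fuzzy_match`, and the `max_distance` threshold are hypothetical, chosen only for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query, candidates, max_distance=2):
    """Return the candidates within max_distance edits of query,
    closest first. The threshold is an assumed tuning knob."""
    q = query.lower()
    scored = [(edit_distance(q, c.lower()), c) for c in candidates]
    return [c for d, c in sorted(scored) if d <= max_distance]
```

For example, `fuzzy_match("goverment", ["government", "governor", "garden"])` keeps only "government", which is one insertion away from the misspelled query. Comparing the query against every candidate is fine for a small word list, but for a large table you would want to narrow the candidates first (for instance by length or by a shared prefix) before computing distances.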