I'm using Oracle 11g.
I've tried the Oracle Text functionality and so far I love it. I now need to support searching on misspelled words. I've found that CONTAINS supports the Fuzzy operator, which seems like the thing for me!
But I'd like to know more about what's going on inside the operator. What does it do? Does it use edit distance or Jaro-Winkler distance for similarity scoring? I specifically want Jaro-Winkler distance to be used. The documentation says the operator is only supported for some languages, and my language is not on the list. Can I still use the Fuzzy operator, or will it simply do nothing (NOP)?
If I can't use the Fuzzy operator for my problem, what are the alternatives? I don't completely understand NDATA, but I don't think it applies to my problem. I've found the UTL_MATCH.JARO_WINKLER_SIMILARITY function, which is exactly what I need, but how can I use it within the search (in the CONTAINS function)?
My understanding of Oracle fuzzy search is that it uses its own "black box" function to define the similarity between two keywords, and users don't have much control over that function.
If you want to implement fuzzy search using edit distance, you could consider user-defined functions (UDFs). Here's an example (using MySQL): http://flamingo.ics.uci.edu/toolkit/
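To give an idea of what an edit-distance UDF would compute, here is a minimal sketch in Python. It is not Oracle's or the Flamingo toolkit's implementation; the names edit_distance and fuzzy_filter and the max_distance threshold are made up for illustration, and a real UDF would be written in whatever language your database supports.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # drop ca from a
                current[j - 1] + 1,      # add cb to a
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def fuzzy_filter(query, candidates, max_distance=2):
    """Hypothetical helper: keep candidates within max_distance edits of query."""
    q = query.lower()
    return [c for c in candidates if edit_distance(q, c.lower()) <= max_distance]


if __name__ == "__main__":
    # The misspelling "oracel" is two substitutions away from "oracle".
    print(fuzzy_filter("oracel", ["oracle", "mysql"]))
```

Note that this scans every candidate, so as a UDF it would be applied row by row rather than through the text index; that is the trade-off against the built-in operator.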