
SOLR: Fuzzy search on a text field with spaces


Here's my problem: I have a single text field indexed by SOLR, which holds the usernames from our database. I'd like the search to be fuzzy rather than an exact match. E.g., if the username is "krishnarayaprolu" and I search with a spelling mistake, "krishnIrayaprolu", it should still return the record.

This works fine for me except when the username has a space in it. For the username "krishna rayaprolu", the search string "krishnI rayaprolu~0.5" does not return the record. It does return the record if the spelling mistake is at the end, as in "krishna rayaprolI~0.5". Any ideas?

For my config, I tried WhitespaceTokenizerFactory and StandardTokenizerFactory. On the search side, I tried quoting the search string and escaping the space. Neither helped with my space+fuzziness problem. I'm using the admin interface for searching. I'd appreciate any pointers.


Solution

  • I have a solution for your problem; you only need to add a field to your schema.

    Create a new ngram field and copy all of your usernames into it.

    When a query for a misspelled word returns no results, split the word into pieces and fire the query again against the ngram field; you should then get the expected results.

    Example: suppose the user is searching for "krishna rayaprolu" but types "krishnI rayaprolu~0.5". Build the query as shown here and you should get the expected results.
    
    **(ngram:"krishnI rayaprolu~0.5" OR ngram:"kri" OR ngram:"kris" OR ngram:"krish" OR ngram:"krishn" OR ngram:"krishnI" OR ngram:"ray" OR ngram:"raya" OR ngram:"rayap" ..... )**
    

    We have split each word into successive prefixes and fired the query against the ngram field.

    Hope this helps.
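
    If you build the query on the client side, the splitting can be automated. Below is a minimal Python sketch of one way to do it, not a definitive implementation. The helper names `prefixes` and `build_ngram_query` are hypothetical, the field name `ngram` comes from the example above, and the minimum prefix length of 3 is an assumption based on the example starting at "kri". The sketch leaves the fuzzy suffix out of the phrase clause and only assembles the query string; it does not talk to Solr.

```python
def prefixes(word, min_len=3):
    """Return the prefixes of `word` from `min_len` characters up to the full word.

    Words shorter than `min_len` produce no prefixes.
    """
    return [word[:n] for n in range(min_len, len(word) + 1)]


def quote(value):
    """Wrap a value in double quotes, escaping backslashes and embedded quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_ngram_query(text, field="ngram", min_len=3):
    """Build an OR query: the whole input as a phrase, then every prefix of every word.

    `field` and `min_len` are assumptions for illustration; adjust them to
    match the ngram field actually defined in your schema.
    """
    clauses = [f"{field}:{quote(text)}"]
    for word in text.split():
        for prefix in prefixes(word, min_len):
            clauses.append(f"{field}:{quote(prefix)}")
    return "(" + " OR ".join(clauses) + ")"


if __name__ == "__main__":
    print(build_ngram_query("krishnI rayaprolu"))
```

    The resulting string can be pasted into the `q` box of the admin interface. Note that the number of clauses grows with the length of the input, so you may want to raise `min_len` for long names.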