javascript, regex, mongodb, levenshtein-distance, fuzzy-search

Mongodb partial matching


How can I get all documents in MongoDB that are within a Levenshtein distance of one from a search string?

I have a collection of football teams.

{
    name: 'Real Madrir',
    nicknames: ['Real', 'Madrid', 'Real Madrir' ... ]
}

Suppose a user searches for Real Madid or Maddrid, or something similar.

I want to return all documents that contain a nickname with a Levenshtein distance of 0 or 1 to the given search string.

I think there are two ways: MongoDB full-text search or regex.

So can I write such a regex or query?

Thank you.


Solution

  • For full-text search, first create a text index on the nicknames field. createIndex also indexes documents that are already in the collection, so it does not matter whether a document was inserted before or after the index was created. You can then query with the $text operator and its $search field, and MongoDB returns the documents whose nicknames field contains the searched words. Note that $text matches whole (stemmed) words, so a misspelling such as Madid will not match Madrid. For regex matching, MongoDB has a $regex operator you can use. Neither operator computes a Levenshtein distance on its own.

    Here are a couple of short examples:

    Full Text Search

    1. Save this script as football.js. It will create a teams collection with a Text Index and two documents for us to search.
    // connect to the football database (it is created on first write)
    db = connect("localhost:27017/football");
    
    /* 
       note:
       You may also create indexes from your console
       using the MongoDb shell. Actually each of these
       statements may be run from the shell. I'm using
       a script file for convenience.
    */
    
    // create Text Index on the 'nicknames' field 
    // so full-text search works
    db.teams.createIndex({"nicknames":"text"});
    
    // insert two teams to search for
    // (insertOne replaces the deprecated insert helper)
    db.teams.insertOne({
        name: 'Real Madrir',
        nicknames: ['Real', 'Madrid', 'Real Madrir' ]
    });
    
    db.teams.insertOne({
        name: 'Fake Madrir',
        nicknames: ['Fake']
    });
    
    2. Open your terminal, navigate to the directory where you saved football.js, and run the script against your local MongoDB instance by typing mongo football.js (or mongosh football.js with the newer shell).

    3. Type mongo (or mongosh) from your terminal to open the MongoDB shell, then switch to the football database by typing use football.

    4. Once you're in the football database, search for one of your documents using db.teams.find({"$text":{"$search":"<search-text>"}})

    > use football
    
    // find Real Madrir
    > db.teams.find({"$text":{"$search":"Real"}})
    
    // find Fake Madrir
    > db.teams.find({"$text":{"$search":"Fake"}})
    

    Regex

    If you want to search using a regex, you do not need to create an index. Just use MongoDB's $regex operator. Matching is case-sensitive by default, and an unanchored pattern matches anywhere inside a nickname:

    //find Real Madrir
    db.teams.find({"nicknames": {"$regex": /Real/}})
    
    db.teams.find({"nicknames": {"$regex": /Real Madrir/}})
    
    //find Fake Madrir
    db.teams.find({"nicknames": {"$regex": /Fa/}})
    
    db.teams.find({"nicknames": {"$regex": /ke/}})
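
The regex examples above only do substring matching. For the "distance 0 or 1" requirement from the question, one possible approach is to generate a pattern that lists every single-character insertion, deletion, and substitution of the search term. The sketch below is only an illustration: the helper names escapeRegex and withinOneEditRegex are hypothetical, and the case-insensitive flag is an assumption about the desired behaviour.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build an anchored regex that matches every string
// within Levenshtein distance 0 or 1 of `term` (ignoring case).
function withinOneEditRegex(term) {
  const chars = Array.from(term);
  const esc = (list) => escapeRegex(list.join(""));
  // Start with the exact term so distance 0 is always covered.
  const alternatives = [escapeRegex(term)];
  for (let i = 0; i <= chars.length; i++) {
    const prefix = esc(chars.slice(0, i));
    // Insertion: one extra character before position i.
    alternatives.push(prefix + "." + esc(chars.slice(i)));
    if (i < chars.length) {
      // Substitution or deletion of the character at position i.
      alternatives.push(prefix + ".?" + esc(chars.slice(i + 1)));
    }
  }
  return new RegExp("^(?:" + alternatives.join("|") + ")$", "i");
}

const pattern = withinOneEditRegex("Madrid");
```

The resulting pattern could then be passed to the query, for example db.teams.find({"nicknames": {"$regex": withinOneEditRegex("Maddrid")}}). Because nicknames is an array, a document matches when any one element matches. The pattern grows with the length of the term and is unlikely to use an index efficiently, so this is probably only reasonable for small collections.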
    

    Mongoose

    This is how each of these searches would work in NodeJS using mongoose:

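    var searchText = "Madrir"; // or some value from request.body
    
    // escape regex metacharacters so user input is matched literally
    var searchRegex = new RegExp(searchText.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));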
    
    var fullTextSearchOptions = {
      "$text":{
        "$search": searchText
      }
    };
    
    var regexSearchOptions = {
      "nicknames": {
        "$regex": searchRegex
      }
    };
    
    // full-text search
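    // Team is assumed to be a Mongoose model for the teams collection.
    // Mongoose 7 and later no longer accept a callback in find(),
    // so the promise returned by exec() is used instead.
    
    // full-text search
    Team.find(fullTextSearchOptions).exec()
      .then(function(teams){
        // ...
      })
      .catch(function(err){
        // ...
      });
    
    // regex search
    Team.find(regexSearchOptions).exec()
      .then(function(teams){
        // ...
      })
      .catch(function(err){
        // ...
      });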
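
Since neither $text nor $regex measures edit distance, another option is to fetch candidate documents and filter them in Node. The following is a minimal sketch, not part of the original answer: levenshtein and filterByDistance are hypothetical helper names, the comparison is assumed to be case-insensitive, and the teams array stands in for whatever documents a query returned.

```javascript
// Classic dynamic-programming Levenshtein distance, keeping only two rows.
function levenshtein(a, b) {
  const s = Array.from(a);
  const t = Array.from(b);
  // Distance from the empty prefix of s to each prefix of t.
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[t.length];
}

// Keep teams with at least one nickname within maxDistance edits of searchText.
function filterByDistance(teams, searchText, maxDistance = 1) {
  const needle = searchText.toLowerCase();
  return teams.filter((team) =>
    team.nicknames.some(
      (nick) => levenshtein(nick.toLowerCase(), needle) <= maxDistance
    )
  );
}

// Sample documents copied from the script above (stand-in for a query result).
const teams = [
  { name: "Real Madrir", nicknames: ["Real", "Madrid", "Real Madrir"] },
  { name: "Fake Madrir", nicknames: ["Fake"] },
];
const matches = filterByDistance(teams, "Maddrid");
```

Here matches contains only the Real Madrir document, because Maddrid is one insertion away from its nickname Madrid. Filtering in application code means every candidate document is transferred first, so on larger collections it may be worth narrowing the candidates with a cheaper query beforehand.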