Tags: c#, mongodb, mongodb-.net-driver

MongoDB: Case-insensitive and accent-insensitive search


I am searching for the string "JESÚS", but the query returns only the document that contains that exact string. I need the search to ignore accents and letter case.

I am using C# and the MongoDB .NET driver.

I have two documents saved in my MongoDB collection:

_id:5d265f3129ea36365c7ca587
TRABAJADOR:"JESUS HERNANDEZ DIAZ"

_id:5d265f01db86a83148404711
TRABAJADOR:"JESÚS HERNÁNDEZ DÍAZ"

In C# with the MongoDB driver:

var filter = Builders<BsonDocument>.Filter.Regex("TRABAJADOR", new BsonRegularExpression(string.Format(".*{0}.*", "JESÚS"), "i"));

var result = collection.Find(filter, new FindOptions() { Collation = new Collation("es", strength: CollationStrength.Primary, caseLevel:true) }).ToList();

output = JsonConvert.SerializeObject(result);
return output;

If I search for "JESÚS", the actual output is:

_id:5d265f01db86a83148404711
TRABAJADOR:"JESÚS HERNÁNDEZ DÍAZ"

But I am expecting the following output:

_id:5d265f3129ea36365c7ca587
TRABAJADOR:"JESUS HERNANDEZ DIAZ"

_id:5d265f01db86a83148404711
TRABAJADOR:"JESÚS HERNÁNDEZ DÍAZ"

Solution

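  • I recommend creating a text index with the default language set to "none" and then running a $text search. Text indexes (version 3, the default since MongoDB 3.2) ignore case and diacritics, and "none" turns off language-specific stemming and stop words, so the names are matched as written: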

    db.Project.createIndex(
        {
            "WORKER": "text",
            "TRABAJADOR": "text"
        },
        {
            "background": false,
            "default_language": "none"
        }
    )
    
    db.Project.find({
        "$text": {
            "$search": "jesus",
            "$caseSensitive": false
        }
    })
    

    Here's the C# code that generated the above queries. I'm using my library MongoDB.Entities for brevity.

    using MongoDB.Entities;
    using System;
    using System.Linq;
    
    namespace StackOverflow
    {
        public class Program
        {
            public class Project : Entity
            {
                public string WORKER { get; set; }
                public string TRABAJADOR { get; set; }
            }
    
            private static void Main(string[] args)
            {
                new DB("test");
    
                DB.Index<Project>()
                  .Key(p => p.WORKER, KeyType.Text)
                  .Key(p => p.TRABAJADOR, KeyType.Text)
                  .Option(o => o.DefaultLanguage = "none")
                  .Option(o => o.Background = false)
                  .Create();
    
                (new[] {
                    new Project { WORKER = "JESUS HERNANDEZ DIAZ"},
                    new Project { TRABAJADOR = "JESÚS HERNÁNDEZ DÍAZ"}
                }).Save();
    
                var result = DB.SearchText<Project>("jesus");
    
                Console.WriteLine($"found: {result.Count()}");
                Console.Read();
            }
        }
    }
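
    Since the question uses the official MongoDB .NET driver directly, the same index and query can probably be expressed without MongoDB.Entities. The following is a minimal sketch, not a tested drop-in: the connection string, the database name "test", the collection name "Project" and the class name TextSearchSketch are assumptions for illustration, and it assumes the collection has no other text index yet (MongoDB allows only one per collection).

```csharp
using System;
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

public static class TextSearchSketch
{
    // "none" disables language-specific stemming and stop words.
    public static CreateIndexOptions BuildIndexOptions() =>
        new CreateIndexOptions { DefaultLanguage = "none" };

    // Both flags already default to false on the server;
    // they are set here only to make the intent explicit.
    public static TextSearchOptions BuildSearchOptions() =>
        new TextSearchOptions { CaseSensitive = false, DiacriticSensitive = false };

    public static List<BsonDocument> Search(
        IMongoCollection<BsonDocument> collection, string term)
    {
        // One text index covering both fields, as in the shell command above.
        var keys = Builders<BsonDocument>.IndexKeys.Combine(
            Builders<BsonDocument>.IndexKeys.Text("WORKER"),
            Builders<BsonDocument>.IndexKeys.Text("TRABAJADOR"));
        collection.Indexes.CreateOne(
            new CreateIndexModel<BsonDocument>(keys, BuildIndexOptions()));

        // Equivalent of { $text: { $search: term, ... } }.
        var filter = Builders<BsonDocument>.Filter.Text(term, BuildSearchOptions());
        return collection.Find(filter).ToList();
    }

    public static void Main()
    {
        // Assumed connection string, database name and collection name.
        var client = new MongoClient("mongodb://localhost:27017");
        var collection = client.GetDatabase("test")
                               .GetCollection<BsonDocument>("Project");

        var result = Search(collection, "jesus");
        Console.WriteLine($"found: {result.Count}");
    }
}
```

    Keep in mind that $text matches whole words rather than substrings, so searching for "jesus" or "JESÚS" should find both documents, while a partial term such as "jes" would not. That is a different behavior from the ".*JESÚS.*" regex in the question.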