Search code examples
c#mongodbmongodb-.net-driver

Searching a field inside a leaf in a mongo collection with regex


I have the following mongo collection with records as seen in the json example record below which represents one record. My goal is for example to search for the word "CONST" and find records where the "FullName" field inside the "Aliases" subtree contains this word. Taking the record shown as example, it should hit because the leaves 533-1 and 533-2 inside the "Aliases" tree contain the string "CONST" on their respective "FullName" fields. If the Aliases tree contained only the 533-3 leaf inside tree for example, it wouldn't hit.

I use C# and the MongoDb.Driver package and to search with Regex for normal fields - e.g. for "Key" on this collection shown I use "Builders.Filter.And(filter, Builders.Filter.Regex("Key", new BsonRegularExpression("")))" pattern to add a condition. But for these fields inside the "Aliases" subtree I don't know how to add a condition. What would be the simplest way to make a search on this field?

Please note that under "Aliases" each leaf has a different name (533-1, 533-2 etc). Each name of each leaf is unique in every record in the entire collection. Record example:

{
"_id" : "128c5c57-ae95-4a08-9fb8-d6eade663908",
"_t" : "Company",
"ActiveFrom" : "20190813154343",
"ActiveTo" : "30001001000001",
"GroupId" : "38547478-88ef-4caf-a6ce-1a9e6a85a59b",
"Key" : "533",
"FullName" : "RED BRIGADES",
"Countries" : [ 
    "ITALY"
],
"MoreInfo" : null,
"RecordDetail" : "<RecordDetail></RecordDetail>",
"Aliases" : {
    "533-1" : {
        "Key" : "533-1",
        "FullName" : "BRIGADAS ROJAS PARA LA CONSTRUCCIÓN DEL PARTIDO COMUNISTA COMBATIENTE",
        "OriginalKey" : "533"
    },
    "533-2" : {
        "Key" : "533-2",
        "FullName" : "BRIGADES ROUGES POUR LA CONSTRUCTION DU PARTI COMMUNISTE COMBATTANT",
        "OriginalKey" : "533"
    },
    "533-3" : {
        "Key" : "533-3",
        "FullName" : "BRIGATE ROSSE",
        "OriginalKey" : "533"
    },
    "533-4" : {
        "Key" : "533-4",
        "FullName" : "BRIGATE ROSSE PER LA COSTRUZIONE DEL PARTITO COMUNISTA COMBATTENTE",
        "OriginalKey" : "533"
    }
}

}


Solution

  • This might run slower than Ryan's query, but this is readable for you to grok it and you to optimize it.

    db.Companies.aggregate([
        // Add temporary field deconstructed_aliases to every document of collection
        {
            $addFields: {
                deconstructed_aliases: {
                    $objectToArray: "$Aliases"
                }
            }
        },
        // Unwind this newly created temporary field deconstructed_aliases
        {
            $unwind: "$deconstructed_aliases"
        },
        // Match on fullName now
        {
            $match: {
                "deconstructed_aliases.v.FullName": {
                    $regex: new RegExp('CONST', 'i')
                }
            }
        },
        // Project needed fields
        {
            $project: {
                _id: 0,
                key: "$deconstructed_aliases.v.Key",
                fullName: "$deconstructed_aliases.v.FullName"
            }
        }
    ]);
    

    The C# driver does not seem to have a addField helper method in the aggregate fluent interface.

    Also there's no direct support for objectToArray operator needed to normalize the object, hence creating magic strings as a workaround.

    Here's a corresponding C# program -

    using MongoDB.Bson;
    using MongoDB.Driver;
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;
    
    namespace Testing
    {
        class Program
        {
            static void Main(string[] args)
            {
                MongoClient client = new MongoClient("mongodb://localhost:27017/test");
                IMongoDatabase database = client.GetDatabase("test");
                IMongoCollection<BsonDocument> collection = database.GetCollection<BsonDocument>("Companies");
    
                string searchTerm = "CONST";
                string searchResult = FetchKeysBySearchTerm(collection, searchTerm).Result.ToJson();
                Console.WriteLine(searchResult);
            }
    
            public async static Task<List<BsonDocument>> FetchKeysBySearchTerm(IMongoCollection<BsonDocument> collection, string searchTerm)
            {
                string addFieldStage = @"{ $addFields: { deconstructed_aliases: { $objectToArray: ""$Aliases"" } } }";
                string unwindStage = @"{ $unwind: ""$deconstructed_aliases"" }";
                string projectStage = @"{ $project: { _id: 0, key: ""$deconstructed_aliases.v.Key"", fullName: ""$deconstructed_aliases.v.FullName"" } }";
    
                var aggregate = collection.Aggregate()
                                          .AppendStage<BsonDocument>(addFieldStage)
                                          .AppendStage<BsonDocument>(unwindStage)
                                          .Match(Builders<BsonDocument>.Filter.Regex("deconstructed_aliases.v.FullName", new BsonRegularExpression(searchTerm, "i")))
                                          .AppendStage<BsonDocument>(projectStage);
    
    
                List<BsonDocument> searchResults = await aggregate.ToListAsync();
                return searchResults;
            }
        }
    }
    

    Here's the sample response -

    [{ "key" : "533-1", "fullName" : "BRIGADAS ROJAS PARA LA CONSTRUCCIÓN DEL PARTIDO COMUNISTA COMBATIENTE" }, { "key" : "533-2", "fullName" : "BRIGADES ROUGES POUR LA CONSTRUCTION DU PARTI COMMUNISTE COMBATTANT" }]