Search code examples
arraysmongodbmongodb-queryaggregation-frameworkmongodb-.net-driver

MongoDB query to transform documents


I have the documents structured in MongoDB 6.0 as below:

[
    {
        "_id" : ObjectId("6555a7d7c2d9de74abb09de8"),
        "tagName" : "tag1",
        "tagValue" : "value1",
        "categories" : [
            "Category-1"
        ],
        "otherField1" : [],
        "otherField2" : "something-else-1"
    },
    {
        "_id" : ObjectId("6555a865c2d9de74abb09de9"),
        "tagName" : "tag2",
        "tagValue" : "value2",
        "categories" : [
            "Category-1",
            "Category-2"
        ],
        "otherField1" : [],
        "otherField2" : "something-else-2"
    },
    {
        "_id" : ObjectId("6555c652c2d9de74abb09df2"),
        "tagName" : "tag3",
        "tagValue" : "value3",
        "categories" : [
            "Category-2",
            "Category-3",
        ],
        "otherField1" : [],
        "otherField2" : "something-else-3"
    }
]

I would like to transform them as per the below structure.

[
    {
        "category" : "Category-1",
        "tags" : [
            {
                "name" : "tag1",
                "value" : "value1"
            },
            {
                "name" : "tag2",
                "value" : "value2"
            }
        ]
    },
    {
        "category" : "Category-2",
        "tags" : [
            {
                "name" : "tag2",
                "value" : "value2"
            },
            {
                "name" : "tag3",
                "value" : "value3"
            }
        ]
    },
    {
        "category" : "Category-3",
        "tags" : [
            {
                "name" : "tag3",
                "value" : "value3"
            }
        ]
    }
]

I used the $unwind and $group operators but was unable to get the final result. Any help is appreciated.

PS: At the end, I have to translate the MongoDB query using C#/.NET 7 driver.


Solution

  • Yes, your query should contains $unwind and $group stages.

    db.collection.aggregate([
      {
        $unwind: "$categories"
      },
      {
        $group: {
          _id: "$categories",
          tags: {
            $push: {
              name: "$tagName",
              value: "$tagValue"
            }
          }
        }
      },
      {
        $project: {
          _id: 0,
          category: "$_id",
          tags: 1
        }
      }
    ])
    

    For the MongoDB .NET Driver syntax, you should create models for the unwind and projection/result.

    Example:

    public class RootModel
    {
        public ObjectId Id { get; set; }
        public string TagName { get; set; }
        public string TagValue { get; set; }
        public List<string> Categories { get; set; }
        public List<string> OtherField1 { get; set; }
        public string OtherField2 { get; set; }
    }
    
    public class UnwindRootModel
    {
        public ObjectId Id { get; set; }
        public string TagName { get; set; }
        public string TagValue { get; set; }
        public string Categories { get; set; }
        public List<string> OtherField1 { get; set; }
        public string OtherField2 { get; set; }
    }
    
    [BsonNoId]
    public class ResultModel
    {
        public string Category { get; set; }
        public List<TagModel> Tags { get;set; }
    }
    
    [BsonNoId]
    public class TagModel
    {
        public string Name { get; set; }
        public string Value { get; set; }
    }
    

    Before querying from the collection, as your document's fields are in camel case, either you need to specify the field name as camel case to the classes' properties via [BsonElement] attribute or you should register the camel case pack.

    var pack = new ConventionPack();
    pack.Add(new CamelCaseElementNameConvention());
    ConventionRegistry.Register("camel case", pack, t => true);
    

    With Aggregate Fluent:

    var result = await _collection.Aggregate()
        .Unwind<RootModel, UnwindRootModel>(x => x.Categories)
        .Group(x => x.Categories,
            g => new ResultModel
            {
                Category = g.Key,
                Tags = g.Select(y => new TagModel
                {
                    Name = y.TagName,
                    Value = y.TagValue
                }).Distinct().ToList()
            })
        .ToListAsync();
    

    enter image description here

    If you face difficulties in writing the Aggregate Fluent query, you can provide the raw query with MongoDB Compass.

    Update: As mentioned by the post owner to remove the duplicate tags, apply .Distinct() to the Tags list in Aggregate Fluent/LINQ.