I have a tree structure stored in MongoDB where each node represents an object (e.g., folders, projects, documents, files). Each node has a list of ParentIds that reference its parent nodes. I want to use the $graphLookup aggregation stage to find all children (direct and indirect) of a given node by its _id.
    public class TreeNode
    {
        [BsonId]
        [BsonRepresentation(BsonType.ObjectId)]
        public string Id { get; set; }
        public string Name { get; set; }
        public string ObjectType { get; set; }
        [BsonRepresentation(BsonType.ObjectId)]
        public List<string> ParentIds { get; set; } = new List<string>();
    }
Test Data:
    public async Task InsertNewTestData()
    {
        if (await CheckIfDataExistsInCollection())
            return;
        var id1 = ObjectId.GenerateNewId().ToString();
        var id2 = ObjectId.GenerateNewId().ToString();
        var id3 = ObjectId.GenerateNewId().ToString();
        var id4 = ObjectId.GenerateNewId().ToString();
        var id5 = ObjectId.GenerateNewId().ToString();
        var id6 = ObjectId.GenerateNewId().ToString();
        var id7 = ObjectId.GenerateNewId().ToString();
        var objects = new List<TreeNode>
        {
            new TreeNode { Id = id1, Name = "Workspace", ObjectType = "folder", ParentIds = new List<string>() },
            new TreeNode { Id = id2, Name = "Project 1", ObjectType = "project", ParentIds = new List<string> { id1 } },
            new TreeNode { Id = id3, Name = "Document 1", ObjectType = "document", ParentIds = new List<string> { id2 } },
            new TreeNode { Id = id4, Name = "File 1", ObjectType = "file", ParentIds = new List<string> { id3 } },
            new TreeNode { Id = id5, Name = "File 2", ObjectType = "file", ParentIds = new List<string> { id3 } },
            new TreeNode { Id = id6, Name = "Document 2", ObjectType = "document", ParentIds = new List<string> { id2 } },
            new TreeNode { Id = id7, Name = "Project 2", ObjectType = "project", ParentIds = new List<string> { id1 } },
        };
        await _context.ObjectsTree.InsertManyAsync(objects);
    }
Given this structure, how can I use the $graphLookup aggregation stage to find all children (direct and indirect) of a specific node by its _id? For example, if I provide the _id of "Project 1", I want to retrieve "Document 1", "File 1", "File 2", and "Document 2".
Here’s what I’ve tried so far, but it doesn’t seem to work as expected:
    var pipeline = new[]
    {
        new BsonDocument("$match", new BsonDocument("_id", new ObjectId("_id"))),
        new BsonDocument("$graphLookup", new BsonDocument
        {
            { "from", "ObjectsTree" },
            { "startWith", "$ParentIds" },
            { "connectFromField", "_id" },
            { "connectToField", "ParentIds" },
            { "as", "children" },
            { "depthField", "depth" }
        })
    };
    var result = await _context.ObjectsTree.Aggregate<BsonDocument>(pipeline).ToListAsync();
The problem seems to be that ParentIds is a list of strings, so the lookup cannot search through it. This is how a document is stored:
    {
      "_id": {
        "$oid": "679793f8e25c76dbf25bca3c"
      },
      "Name": "Project 1",
      "ObjectType": "project",
      "ParentIds": [
        {
          "$oid": "679793f8e25c76dbf25bca3b"
        }
      ]
    }
It seems that it cannot search through a list of strings.
If you want to solve this with LINQ and .NET classes instead of defining the aggregation stages in JSON, you need to derive a class from TreeNode that contains a property to store the children:
    public class TreeNodeWithChildren : TreeNode
    {
        public List<TreeNode> Children { get; set; } = new();
    }
You can then use the fluent interface to run the aggregation pipeline. In this example, id2 is the id of the node whose descendants you are looking for:
    var result = coll
        .Aggregate()
        .Match(x => x.Id == id2)
        .GraphLookup<TreeNode, TreeNode, string, List<string>, string, List<TreeNode>, TreeNodeWithChildren>(
            coll,
            x => x.Id,
            x => x.ParentIds,
            x => x.Id,
            x => x.Children)
        .ToList();
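For comparison, here is a rough sketch of the same lookup written as raw BsonDocument stages, in case you prefer to stay close to the pipeline from the question. It assumes the walk should start at the matched node's own _id (startWith set to "$_id" rather than "$ParentIds"), so that each step finds documents whose ParentIds contain an already visited _id. Judging by the stored document in the question, both _id and ParentIds are saved as ObjectIds, so the values should be comparable. The class and method names (TreePipelines, BuildDescendantsPipeline) are made up for illustration and are not part of the driver.

```csharp
using MongoDB.Bson;

public static class TreePipelines
{
    // Hypothetical helper: builds the two stages that match one node and
    // collect all of its direct and indirect children into "children".
    public static BsonDocument[] BuildDescendantsPipeline(string collectionName, string nodeId)
    {
        return new[]
        {
            // Parse the actual id value; the literal string "_id" is not a valid ObjectId.
            new BsonDocument("$match", new BsonDocument("_id", ObjectId.Parse(nodeId))),
            new BsonDocument("$graphLookup", new BsonDocument
            {
                { "from", collectionName },
                // Assumption: start from the matched node itself, not from its parents.
                { "startWith", "$_id" },
                // For every node found, take its _id ...
                { "connectFromField", "_id" },
                // ... and look for documents that list that _id in ParentIds.
                { "connectToField", "ParentIds" },
                { "as", "children" },
                // Adds the recursion depth to each document in "children".
                { "depthField", "depth" }
            })
        };
    }
}
```

With the names from the question, this could be run as `await _context.ObjectsTree.Aggregate<BsonDocument>(TreePipelines.BuildDescendantsPipeline("ObjectsTree", id2)).ToListAsync();`, and the descendants would then be in the children array of the single returned document.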