Search code examples
performancelinqlinq-to-dataset

Best way to populate Parent-Children using LINQ


I am trying to populate Parent and Children using LINQ, my code look like this:

var ds = myDAL.GetDataSet("mySP");
var Parent = ds.Tables[0];
var Children = ds.Tables[1];

var ParentChildren = from p in Parent.AsEnumerable()
                     select new 
                   { 
                     Id = p.Field<int>("Id"), 
                     Name = p.Field<string>("Name"), 
                     Children = 
                       ( from c in Children.AsEnumerable() 
                         where c.Field<int>("ParentId") = p.Field<int>("Id")
                         select new 
                         {
                            Id = c.Field<int>("Id"), 
                            Name = c.Field<string>("Name")
                         } 
                       )
                    };

I am afraid of performance issues as I assume it will run the nested query again and again so If I have 1000 Parents, nested query would run 1000 times?


Solution

  • It's wise to group the children by parent ID first, using ToLookup. This is an O(n log n) operation, and doing a lookup within that collection is an O(1) operation, so you can avoid the O(n²) complexity behavior of your current query.

    var ds = myDAL.GetDataSet("mySP");
    var Parent = ds.Tables[0];
    var Children = ds.Tables[1];
    var ChildrenByParentID = Children.AsEnumerable().ToLookup(
        c => c.Field<int>("ParentId")
        c => new 
             {
                 Id = c.Field<int>("Id"), 
                 Name = c.Field<string>("Name")
             });
    
    var ParentChildren = from p in Parent.AsEnumerable()
                         select new 
                         { 
                           Id = p.Field<int>("Id"), 
                           Name = p.Field<string>("Name"), 
                           Children = ChildrenByParentID[p.Field<int>("Id")]
                         };