Search code examples
c#vb.netlinqdatatablelinq-to-dataset

Improve Linq to Datatable Performance


I have a datatable with 500K rows in the following format;

Int | Decimal | String

We are using a singleton pattern and ultimately our DataTable needs ending up as a List(Of AssetAllocation) where AssetAllocation is:

Public Class AssetAllocation
    Property TpId() As Integer
    Property Allocation As List(Of Sector)
End Class

Public Class Sector
    Property Description() As String
    Property Weighting As Decimal
End Class

The linq I am using;

Private Shared Function LoadAll() As List(Of AssetAllocation)

        Dim rtn = New List(Of AssetAllocation)

        Using dt = GetRawData()

            Dim dist = (From x In dt.AsEnumerable Select x!TP_ID).ToList().Distinct()

            rtn.AddRange(From i As Integer In dist
                         Select New AssetAllocation With {
                            .TpId = i,
                            .Allocation = (From b In dt.AsEnumerable
                                           Where b!TP_ID = i Select New Sector With {
                                               .Description = b!DESCRIPTION.ToString(),
                                               .Weighting = b!WEIGHT
                                           }).ToList()})
        End Using

        Return rtn
    End Function

It is taking a long time to execute the linq which is due to the inner query constructing the list of sectors. The distinct list contains 80k

Can this be improved at all?


Solution

  • If I've understood what you are trying to do this query should have much better performance. The trick is to use GroupBy to avoid having to search the entire table for matching id's in every iteration. I have written it in C# but I'm sure you can translate it into VB.

    var rtn  = 
            dt.AsEnumerable()
            .GroupBy(x => x.Field<int>("TP_ID"))
            .Select(x => new AssetAllocation()
            { 
                TpId = x.Key, 
                Allocation = x.Select(y => new Sector
                {
                    Description =  y.Field<string>("Description"),
                    Weighting = y.Field<decimal>("WEIGHT") 
                }).ToList()
            }).ToList();