I have a DataTable with 500K rows in the following format:
Int | Decimal | String
We are using a singleton pattern, and ultimately our DataTable
needs to end up as a List(Of AssetAllocation), where AssetAllocation is:
Public Class AssetAllocation
    Property TpId() As Integer
    Property Allocation As List(Of Sector)
End Class

Public Class Sector
    Property Description() As String
    Property Weighting As Decimal
End Class
The LINQ I am using:
Private Shared Function LoadAll() As List(Of AssetAllocation)
    Dim rtn = New List(Of AssetAllocation)
    Using dt = GetRawData()
        Dim dist = (From x In dt.AsEnumerable Select x!TP_ID).ToList().Distinct()
        rtn.AddRange(From i As Integer In dist
                     Select New AssetAllocation With {
                         .TpId = i,
                         .Allocation = (From b In dt.AsEnumerable
                                        Where b!TP_ID = i
                                        Select New Sector With {
                                            .Description = b!DESCRIPTION.ToString(),
                                            .Weighting = b!WEIGHT
                                        }).ToList()})
    End Using
    Return rtn
End Function
It is taking a long time to execute the LINQ, which is due to the inner query constructing the list of sectors. The distinct list contains 80K entries.
Can this be improved at all?
If I've understood what you are trying to do, this query should perform much better. The trick is to use GroupBy
to avoid searching the entire table for matching IDs on every iteration: your original code scans all 500K rows once per distinct ID (roughly 80K passes), while GroupBy partitions the table in a single pass.
I have written it in C#, but I'm sure you can translate it into VB.
var rtn =
    dt.AsEnumerable()
      .GroupBy(x => x.Field<int>("TP_ID"))
      .Select(x => new AssetAllocation()
      {
          TpId = x.Key,
          Allocation = x.Select(y => new Sector
          {
              Description = y.Field<string>("DESCRIPTION"),
              Weighting = y.Field<decimal>("WEIGHT")
          }).ToList()
      }).ToList();
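Since the original code is in VB.NET, here is a sketch of the same GroupBy approach translated to VB, assuming the same column names ("TP_ID", "DESCRIPTION", "WEIGHT") and the AssetAllocation/Sector classes from the question; I have not run this, so treat it as a starting point:

```vbnet
' Group the table once by TP_ID, then project each group into an
' AssetAllocation -- one pass over the table instead of one full
' scan per distinct ID.
Dim rtn = dt.AsEnumerable().
    GroupBy(Function(x) x.Field(Of Integer)("TP_ID")).
    Select(Function(g) New AssetAllocation With {
        .TpId = g.Key,
        .Allocation = g.Select(Function(y) New Sector With {
            .Description = y.Field(Of String)("DESCRIPTION"),
            .Weighting = y.Field(Of Decimal)("WEIGHT")
        }).ToList()
    }).ToList()
```

Note that Field(Of Decimal) will throw if the column contains DBNull, so if WEIGHT is nullable you may want Field(Of Decimal?) and a fallback.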