I have been tasked with matching 1.7 million records with some results which have been passed to me in a csv file.
little bit of background to the below code, i have two lists...
Certs which contains 5 properties with ID being the equivalent of a PK.
Orders which contains a a list of ID's which should be contained in the certs list.
I need to match the two and do something with those Cert objects which are found.
foreach (Classes.CertOrder.IDS OrderUnitID in Order.AllIDs)
{
var Cert = (from C in Certs where C.ID.ToUpper() == OrderUnitID.ID.ToUpper() select C).FirstOrDefault();
if (Cert != null)
{
Output.add(Cert)
OrderUnitID.fulfilled = true;
}
}
This code works but its super slow (to be expected i guess with the amount of records) Is there any way i can speed this up?
Edit to Add, would love to be able to add the data to a SQL server to run the queries however the data is such that it is not allowed to leave the workstation on which the file is being processed or even allowed to touch the disk in an un-encrypted form.
In combination with the helpful answer below i have change my output to be list based, pre-sorted both lists by ID and now the processing takes seconds rather than hours! Thanks stack overflow!
Buid a dictionary from Certs
:
var certsMapping = Certs
.ToDictionary(_ => _.ID.ToUpper());
foreach (Classes.CertOrder.IDS OrderUnitID in Order.AllIDs)
{
if (certMapping.TryGetValue(OrderUnitID.ID.ToUpper(), out var cert))
{
Output.add(cert);
OrderUnitID.fulfilled = true;
}
}