I have two data tables in my C# code, each with a single column called ID of type string
.
I want to (inner) join the two tables on the ID column and get a count of the common rows. I don't need the IDs themselves, just the count of common IDs.
One of the tables may have duplicate IDs but the join must consider only distinct values so if there are two rows with ID=544 they should only count as one.
The two tables may have over a million rows each so performance is an issue. What would be the simplest and most efficient way to code this in C#?
It needs a simple inner join. If i understood you properly here i have given the solution. Try it and let me know if it solves your problem
using System;
using System.Linq;
using System.Data;
namespace JoinDatatablesConsoleApp
{
class Program
{
static void Main(string[] args)
{
DataTable dt1 = new DataTable();
dt1.Columns.Add("ID", typeof(string));
DataTable dt2 = new DataTable();
dt2.Columns.Add("ID", typeof(string));
for(int i=1;i<5;i++)
{
DataRow dr1 = dt1.NewRow();
dr1["ID"] = i + "A";
dt1.Rows.Add(dr1);
DataRow dr2 = dt2.NewRow();
dr2["ID"] = i + "A";
dt2.Rows.Add(dr2);
}
DataRow dr3 = dt2.NewRow();
dr3["ID"] = "7A";
dt2.Rows.Add(dr3);
var commonData = (from f1 in dt1.AsEnumerable()
join f2 in dt2.AsEnumerable()
on f1.Field<string>("ID")
equals f2.Field<string>("ID")
select f1.Field<string>("ID"))
.Distinct().ToList();
Console.WriteLine("Common Data : ");
foreach(var item in commonData)
{
Console.WriteLine(item);
}
Console.WriteLine("Common Data Count : " + commonData.Count);
Console.Read();
}
}
}