Search code examples
c#linqdatatable

Create a DataTable containing only unique values with LINQ in C#


I have a DataTable dt_Candidates

      Candidate      |   First Name   |   Last Name   
 --------------------|----------------|--------------- 
  John, Kennedy      | John           | Kennedy       
  Richard, Nixon     | Richard        | Nixon         
  Eleanor, Roosevelt | Eleanor        | Roosevelt     
  Jack, Black        | Jack           | Black         
  Richard, Nixon     | Richard        | Nixon         

I want to create a new DataTable containing ONLY unique values (based on the Candidate column) like below called dt_Candidates2, without nested loops and preferably using LINQ.

      Candidate      |   First Name   |   Last Name   
 --------------------|----------------|--------------- 
  John, Kennedy      | John           | Kennedy       
  Eleanor, Roosevelt | Eleanor        | Roosevelt     
  Jack, Black        | Jack           | Black         

And at the same time, create a list or an array called RejectedCandidates containing the distinct duplicates

RejectedCandidates = {"Richard, Nixon"}

Solution

  • As noted, I don't think it really needs LINQ here. It can go something like this:

    DataTable dt = new DataTable();
    dt.Columns.Add("Candidate");
    dt.Columns.Add("First");
    dt.Columns.Add("Last");
    dt.PrimaryKey = new []{ dt.Columns["Candidate"] }; //means that dt.Find() will work
    
    while(...){
      string candidate = ...
    
      if(dt.Rows.Find(candidate) != null)
        RejectList.Add(...);
      else
        dt.Rows.Add(...);
    }
    

    Avoid using LINQ's .Any on a DataTable for this. Not only is it a pain to get going because it needs casting steps or extension libraries (see here) to, it will then use loops to find the info you seek; the built-in mechanism for the PrimaryKey uses hash tables for much faster lookups.