Search code examples
sqlsql-serverparsingsql-server-2008duplicates

Parsing First Name and Last Name


I am using SQL Server 2008 for this. I am trying to parse first name and last name separately from a column that contains a full name value. After parsing, I would like to store that value in a different column. There are duplicate records in the table, such as customer number (Cust_No), but contact is different for each duplicate cust_no. So for example below, a customer can have many contacts:

cust_no     contact name
-------------------------
1234        John Doe
1234        Jack Smith

I would like to parse John separately from Doe and store the values in different columns called, FirstName1 and LastName1. Similarly, I would like to have FirstName2 and LastName2 columns for Jack and Smith respectively after having parsed the values from contact name column. The result should be as below with unique cust_no.

cust_no    FirstName1   LastName1    FirstName2   LastName2
-----------------------------------------------------------
1234       John         Doe          Jack         Smith

Using a SQL Server query designer (SQL Server 2008), I have tried creating a view on a customer table by just adding cust_no as output and using group by on cust_no. Further, I have used that view/query as a primary table on the left side and linked it with another table (on right) which has many records. I managed to parse the FirstName1 and LastName1 from the contact name column but I am unable to parse the second set of FirstName2 and LastName2 from the contact name column of SQL Server table with repeating cust_no.

My approach described above is probably ambiguous, but I wanted to know if there is function that I can use to parse the names separately based on duplicate cust_no. Is there a way to parse the values so that desired output can be achieved?


Solution

  • If your data is as consistent as you say. parsename() with a conditional aggregation is an option

    Example

    Select cust_no
          ,LName1 = max( case when RN=1 then LName end)
          ,FName1 = max( case when RN=1 then FName end)
          ,LName2 = max( case when RN=2 then LName end)
          ,FName2 = max( case when RN=2 then FName end)
     From ( 
            Select cust_no
                  ,RN    = row_number() over (partition by cust_no order by [contact name])
                  ,LName = parsename(replace([contact name],' ','.'),1)
                  ,FName = parsename(replace([contact name],' ','.'),2)
             From  YourTable
          ) A
     Group By cust_no
    

    Results

    cust_no LName1  FName1  LName2  FName2
    1234    Smith   Jack    Doe     John
    

    NOTE:

    If Jack Smit is duplicated, use dense_rank() rather than row_number(). To be honest, I don't recall if dense_rank() is available in 2008... It is pretty dated.