Search code examples
sqlsql-serversql-server-2017

Is there a way to determine in SQL Server 2017 if there are duplicates within a column of a query and change that input?


I am working on SQL Server 2017 writing a query that determines the emails of employees that looks like:

firstname.lastname123@email.com.

The three digits are part of their employee IDs. (I did not pick this way to determine emails and cannot change it).

I am running into a problem where there just happens to be employees with the same name and have the same last three digits of their employee ID. When this happens I need to add a number in sequencing order to the email to make them unique, i.e.,

Examples:

Name: John Doe, Employee ID: 00123 john.doe123@email.com

Name: John Doe, Employee ID: 12123 john.doe1231@email.com

Name: John Doe, Employee ID: 98123 john.doe1232@email.com

I do not need to assume there will be more than 10 of the same name and number so there should be no cases beyond having 4 digits within an email. Also, the way it determines who has the lesser digits within their emails is by the order of the employee ID. Who has the lower employee ID has the earlier email in the sequence described above.

The query that creates their email is simply pulling first name, last name, and last three digits of their ID and concatenating them together in a single column. (first + '.' + last + right(em_id,3) + 'email.com')

How/What can I implement in order to create the desired outcome described above if duplicates exist?


Solution

  • You can use row_number() with cast function to turn it into varchar like this:

    select (name + '.' 
            + surname 
            + right(Employee_ID,3) 
            + cast(row_number() over(partition by name, surname order by Employee_ID) as varchar) 
            + 'email.com') 
    from employee
    

    Wit this you will have results like :

    • john.doe1231@email.com
    • john.doe1232@email.com
    • john.doe1233@email.com
    • peter.peterson2441@email.com

    And with this modification:

    select case when row_number() over(partition by name, surname order by Employee_ID) > 1 
           then (name + '.' 
            + surname 
            + right(Employee_ID,3) 
            + cast(row_number() over(partition by name, surname order by Employee_ID)-1 as varchar) 
            + 'email.com')
           else (name + '.' 
            + surname 
            + right(Employee_ID,3)
            + 'email.com')
           end
    from employee    
    

    You will get the results you have asked for.

    Here is a DEMO