Search code examples
sqlssms

How I can filter a table to retrieve only one ocurence of each recors


I am trying to find only one occurrence for each customer.

However, in my database I have customers that have been added twice (following an ERP migration)

Currently,

If I try to find a customer that has two occurrences, I have to keep the customer that has a 'C' in the "customer_id" column

In this example we have "Manu Johns" who appears 2x so we must keep the one who has a 'C' in the customer_id column in the final table.

If I only find one occurrence of this customer. But, which does not have a 'C' in the customer_id column. We have to add it as is in the final table

In this example we have "Mathieu Wainers" which appears only once we keep it as it is in the final table

Which query would allow me to have this result : https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=9484f43c0a6c1ccdae7d659ca53e1eab

CREATE TABLE PersonsInitial (
    tel int,
    firstname varchar(255),
    lastname varchar(255),
    Customer_ID varchar(255)
);
insert into PersonsInitial(tel,firstname,lastname,Customer_ID) values
('01234','Manu','Johns','456'),
('01234','Manu','Johns','C456'),
('21234','Fernand','Wajk','C389'),
('13554','Mathieu','Wainers','4683');

select distinct tel, firstname, lastname, customer_id from PersonsInitial

--if there is a person with the same tel number chose the customer id with 'C'
--if I don't have the choice add the customer without C

CREATE TABLE PersonsFinal (
    tel int,
    firstname varchar(255),
    lastname varchar(255),
    Customer_ID varchar(255)
);
insert into PersonsFinal(tel,firstname,lastname,Customer_ID) values
('01234','Manu','Johns','C456'),
('21234','Fernand','Wajk','C389'),
('13554','Mathieu','Wainers','4683');

select distinct tel, firstname, lastname, customer_id from PersonsFinal

Solution

  • You may rank them first based on whether it has or not "C" in the customer id. That's why cte is here for.

    with cte as (select row_number() over (partition by tel, firstname, lastname order by case when left(customer_id, 1) = 'C' then 0 else 1 end) rn, 
                        p.* 
                   from PersonsInitial p)
    select *
      from cte 
     where rn = 1; <-- selects only those with "C" or those for that there is no "C" lines
    

    dbfiddle