Search code examples
sqlsqlitegroup-byhavingsql-in

Find duplicate values based on specific criteria


I am sorry in advance if this question has already been answered. I am still pretty new to SQL.

I have a database that contains client data.

Each row in the database contains a customer_number and a end_record_date (which is either a date in the past or '00:00:00' if the customer is still active).

I need to find all rows of customers that appear at least twice and in the database (so their customer_number appears at least on two rows). But I only want to get the specific rows if for that customer_number they are active on at least two rows (so the value for end_record_date for that specific customer needs to be '00:00:00' on at least two rows).

So this is what I want to find:

row_id customer_number end_record date
1 12345 '00:00:00'
346 7568 '2021-01-01'
89 7568 '00:00:00'
1287 12345 '00:00:00'

In the above example i would want to get the rows 1 and 1287 returned because those meet my criteria. rowid 89 and 346 do not meet my criteria

How would i go on about this?

Im using sqlitestudio on a .sd3b database


Solution

  • You can get the customer_numbers that you want if you group by customer_number and set the condition in the HAVING clause:

    SELECT customer_number
    FROM tablename
    GROUP BY customer_number
    HAVING SUM(end_record_date = '00:00:00') >= 2;
    

    To get all the rows of the table that meet your condition, use the operator IN:

    SELECT *
    FROM tablename
    WHERE customer_number IN (
      SELECT customer_number
      FROM tablename
      GROUP BY customer_number
      HAVING SUM(end_record_date = '00:00:00') >= 2
    );
    

    See the demo.