Search code examples
sqlsql-serverconcatenationdata-cleaning

Remove duplicate (almost) rows based on value in the distinct column


Consider this result in SQL Server:

ID    Check
-----------
9052    N
9052    Y
2049    N
2049    Y
6940    N
6940    Y
7941    N
8118    N
8187    N

How can I delete the duplicate ID rows and leave only Y in Check column if there is at least one Y for the same ID. If there are IDs only with N, keep them. I have quite a few other columns not displayed in the example. So the expected results would be:

ID    Check
------------
9052    Y
2049    Y
6940    Y
7941    N
8118    N
8187    N

Solution

  • CHECK is a reserved word so I would avoid that column name. At any rate, since Y > N then you can use MAX.

    select 
     id, 
     max(check_col) as check_col
    from table1
    group by id
    order by id
    
    id check_col
    2049 Y
    6940 Y
    7941 N
    8118 N
    8187 N
    9052 Y

    fiddle