Search code examples
mysqlsqlcountsubquerywindow-functions

Find duplicate Values in a SQL Table and add unique value in a column


I have a table with locations and there are duplicates because there are locations with same name that belongs to different counties.

  +------+-----------+-----------
| ID   | location   | parent_id  | 
+------+-----------+------------+
|  1   | Country   |    0       |         
|  2   | County 1  |    1       |         
|  3   | County 2  |    1       |         
|  4   | A         |    2       |         
|  5   | B         |    2       |         
|  6   | C         |    2       |          
|  7   | A         |    3       |         
|  8   | E         |    3       |           
|  9   | B         |    3       |         
+---- -+-----------+------------+

I would like to create and insert in column duplicate for each duplicate ID value . The result should look like this: ( Would be better to add ID in duplicate column to all duplicates excepting the one with smallest ID)

| ID   | location     | parent_id  | duplicate|
+------+-----------+------------+----------+
|  1   | Country   |    0       |         |
|  2   | County 1  |    1       |         | 
|  3   | County 2  |    1       |         |  
|  4   | A         |    2       |     4   | 
|  5   | B         |    2       |     5   | 
|  6   | C         |    2       |         | 
|  7   | A         |    3       |     7   | 
|  8   | E         |    3       |         |    
|  9   | B         |    3       |     9   | 
+---- -+-----------+------------+---------+

I use this to list the rows:

SELECT a.* FROM mytable a

JOIN (SELECT location, COUNT(*)
FROM mytable 
GROUP BY location
HAVING count(*) > 1 ) b
ON a.location= b.location
ORDER BY a.location

Solution

  • It looks like a window count can do what you want:

    select t.*,
        case when count(*) over(partition by location) > 1 then id end duplicate
    from mytable t
    

    This requires MySQL 8.0. In earlier versions, an alternative is a correlated subquery:

    select t.*,
        case when (select count(*) from mytable t1 where t1.location = t.location) > 1 then id end duplicate
    from mytable t