Search code examples
pythonpsqlpeewee

How to find duplicate names from table?


I know we can use GROUP BY and HAVING COUNT > 1. But this works when you have duplicate data. I have a little bit different data.

Id Names
1  Rahul S
2  Rohit S
3  Rishu
4  Sinu
5   Rahul S
6  Rohit  S

In the above table id 1 and 5 are same and 2 and 6 are also same. But when I use group by it count as different because of spaces. So how can I write a query with fuzzy logic something which will return these kind of duplicate datas.

**Update

Can someone help me with a query which will remove spaces from a particular column and add a imaginary column on that we can group by having count > 1

SELECT replace(ltrim(rtrim(name)),' ','') as no_space FROM table GROUP BY no_space HAVING count(*) > 1 ORDER BY no_space;


Solution

  • Have you tried removing the spaces on selecting of the data?

    Doing it this way ought to Cut the spaces in the string, and provide similar data. Take in mind that i mean removing double spaces and replacing it with 1 space, then doing a left and right trim on the data

    Something like this :

    REPLACE(LTRIM(RTRIM(colName)), '  ', ' ')
    

    Just remember to include the having Count > 1