Search code examples
mysqlsqlgreatest-n-per-groupmysql-8.0

How to remove duplicate data in MySQL


i have table user like this

UserID   | UserName | UserEmail       | FlagUser
1        | Penny    | penny@gmail.com | A1
2        | Maika    | maika@gmail.com | A1
3        | Laila    | laila@gmail.com | A1
4        | Laura    | laura@gmail.com | A2
5        | Penny    | penny@gmail.com | A2
6        | Maika    | maika@gmail.com | A2

i want the result like this, if user has more than one data i want get the user with FlagUser A2

UserID   | UserName | UserEmail       | FlagUser
3        | Laila    | laila@gmail.com | A1
4        | Laura    | laura@gmail.com | A2
5        | Penny    | penny@gmail.com | A2
6        | Maika    | maika@gmail.com | A2

Solution

  • If you are using mysql 8.0 then row_number() WITH Common table expression is the convenient way:

    Schema:

     create table user(UserID int, UserName varchar(50), UserEmail varchar(50), FlagUser varchar(50));
     insert into user values(1,        'Penny'    , 'penny@gmail.com'  ,'A1');
     insert into user values(2,        'Maika'    , 'maika@gmail.com'  ,'A1');
     insert into user values(3,        'Laila'    , 'laila@gmail.com'  ,'A1');
     insert into user values(4,        'Laura'    , 'laura@gmail.com'  ,'A2');
     insert into user values(5,        'Penny'    , 'penny@gmail.com'  ,'A2');
     insert into user values(6,        'Maika'    , 'maika@gmail.com'  ,'A2');
    

    Query

     with cte as
     (
         select *,row_number()over(partition by username order by flaguser desc) rn
         from user
     )
     select * from cte where rn=1
    

    Output:

    UserID UserName UserEmail FlagUser rn
    3 Laila laila@gmail.com A1 1
    4 Laura laura@gmail.com A2 1
    6 Maika maika@gmail.com A2 1
    5 Penny penny@gmail.com A2 1

    db<>fiddle here

    You can also have the same result with Inner join and group by clause:

     select u.* from user u
     inner join
     (select username,max(flaguser) maxflaguser from user
     group by username) u2
     on u.username=u2.username and u.flaguser=u2.maxflaguser
    

    Output:

    UserID UserName UserEmail FlagUser
    3 Laila laila@gmail.com A1
    4 Laura laura@gmail.com A2
    5 Penny penny@gmail.com A2
    6 Maika maika@gmail.com A2

    Or you can just use subquery to select single row for each user with highest value of flaguser:

     Select * from user u
     where flaguser=(select max(flaguser) from user u2 where u.username=u2.username)
    

    Output:

    UserID UserName UserEmail FlagUser
    3 Laila laila@gmail.com A1
    4 Laura laura@gmail.com A2
    5 Penny penny@gmail.com A2
    6 Maika maika@gmail.com A2

    db<>fiddle here