Search code examples
mysqlsqlpeewee

How to remove duplicates based on a column from one or more resultset


I have two resultset from the same table in following fashion:

|    id    |    name    |    age    |
-------------------------------------
|    1     |    John    |    12     |
|    2     |    Jim     |    23     |
|    3     |    Joel    |    54     |
|    4     |    Jane    |    12     |

If i create resultset based on a an age interval and a row occurs as a duplicate between 1 or more resultsets, how can i efficiently filter these based on the name column? The amount of resultsets is 1..x.

Should i: loop through the sets, combine the sets and filter at the same time or can i make a query directly with SQL and remove duplicates?

Since i use peewee my queries are eg.:

result_1 = Users.select().where(Users.age > 10)
result_2 = Users.select().where(Users.age > 20)

UPDATE:

There are no duplicates in the actual table, only accross the resultsets, i wish to have either make one resultset without the duplicates, or combine them all without duplicates.


Solution

  • If I understand you correctly, this is what UNION does, out of the box:

    SELECT ID, Name, Age
    FROM Users
    WHERE Age > 10
    
    UNION
    
    SELECT ID, Name, Age
    FROM Users
    WHERE Age > 20;
    

    You can do this multiple selects. The requirement is that the selects must all return the same number of columns, with compatible types.

    SqlFiddle here

    Edit :
    If you need to additionally filter, wrap the union in a derived table:

    SELECT *
    FROM
    (
      SELECT ID, Name, Age
      FROM Users
      WHERE Age > 10
    
      UNION
    
      SELECT ID, Name, Age
      FROM Users
      WHERE Age > 20
    ) X
    WHERE x.Name LIKE 'Jo%';
    

    Updated Fiddle