Search code examples
sqlmysql

SQL filter for same values in different columns


I have a customer database from which we are trying to create a rooming list.

I am asking about when working with three of the columns name, roommate and eventid within the table guests. The issue is some guest names will appear in both columns in different rows.

I am wondering if there is an automagical SQL query that would return just one row when two rows contain the same names. It doesn't matter if the first or second shared name row gets returned.

basic database concept:

row name roommate eventid
1 John Smith Mary Smith trip12
2 Joe Blow Sally Blow trip12
3 Mary Smith John Smith trip12

I would like the SQL data returned for trip12 to be only two rows, #2 and either #1 or #3 but not both.

Not sure how to include a minimal reproducible example, as I haven't a clue as to how to write SQL to filter in this manner.


Solution

  • You can define name and roommate in ascending order of the name.

    least(name,roommate) as name1 ,greatest(name,roommate) as name2
    

    Then the string comparison will be unambiguous. You can use the row_number() function and use the list of columns and expressions for partition like this.

    row_number()over(partition by eventid, least(name,roommate), greatest(name,roommate)) rn
    

    Then select the first (rn=1) rows in each group.

    See example

    create table test(row_num int, name varchar(30), roommate varchar(30), eventid varchar(30));
    insert into test values
     (1,'John Smith','Mary Smith','trip12')
    ,(2,'Joe Blow'  ,'Sally Blow','trip12')
    ,(3,'Mary Smith','John Smith','trip12')
    ,(4,'Joe Blow'  ,'Sally Blow','trip12')
    ;
    
    row_num name roommate eventid
    1 John Smith Mary Smith trip12
    2 Joe Blow Sally Blow trip12
    3 Mary Smith John Smith trip12
    4 Joe Blow Sally Blow trip12

    Ranged rows

    select *
      ,row_number()over(partition by eventid, least(name,roommate),greatest(name,roommate)) rn
    from test
    
    row_num name roommate eventid rn
    2 Joe Blow Sally Blow trip12 1
    4 Joe Blow Sally Blow trip12 2
    1 John Smith Mary Smith trip12 1
    3 Mary Smith John Smith trip12 2

    Whole query

    select * 
    from(
    select *
      ,row_number()over(partition by eventid, least(name,roommate),greatest(name,roommate)) rn
    from test
    )a
    where rn=1
    
    row_num name roommate eventid rn
    2 Joe Blow Sally Blow trip12 1
    1 John Smith Mary Smith trip12 1

    fiddle