I have a customer database from which we are trying to create a rooming list.
I am asking about when working with three of the columns name
, roommate
and eventid
within the table guests
. The issue is some guest names will appear in both columns in different rows.
I am wondering if there is an automagical SQL query that would return just one row when two rows contain the same names. It doesn't matter if the first or second shared name row gets returned.
basic database concept:
row | name | roommate | eventid |
---|---|---|---|
1 | John Smith | Mary Smith | trip12 |
2 | Joe Blow | Sally Blow | trip12 |
3 | Mary Smith | John Smith | trip12 |
I would like the SQL data returned for trip12
to be only two rows, #2 and either #1 or #3 but not both.
Not sure how to include a minimal reproducible example, as I haven't a clue as to how to write SQL to filter in this manner.
You can define name and roommate in ascending order of the name.
least(name,roommate) as name1 ,greatest(name,roommate) as name2
Then the string comparison will be unambiguous. You can use the row_number() function and use the list of columns and expressions for partition like this.
row_number()over(partition by eventid, least(name,roommate), greatest(name,roommate)) rn
Then select the first (rn=1) rows in each group.
See example
create table test(row_num int, name varchar(30), roommate varchar(30), eventid varchar(30));
insert into test values
(1,'John Smith','Mary Smith','trip12')
,(2,'Joe Blow' ,'Sally Blow','trip12')
,(3,'Mary Smith','John Smith','trip12')
,(4,'Joe Blow' ,'Sally Blow','trip12')
;
row_num | name | roommate | eventid |
---|---|---|---|
1 | John Smith | Mary Smith | trip12 |
2 | Joe Blow | Sally Blow | trip12 |
3 | Mary Smith | John Smith | trip12 |
4 | Joe Blow | Sally Blow | trip12 |
Ranged rows
select *
,row_number()over(partition by eventid, least(name,roommate),greatest(name,roommate)) rn
from test
row_num | name | roommate | eventid | rn |
---|---|---|---|---|
2 | Joe Blow | Sally Blow | trip12 | 1 |
4 | Joe Blow | Sally Blow | trip12 | 2 |
1 | John Smith | Mary Smith | trip12 | 1 |
3 | Mary Smith | John Smith | trip12 | 2 |
Whole query
select *
from(
select *
,row_number()over(partition by eventid, least(name,roommate),greatest(name,roommate)) rn
from test
)a
where rn=1
row_num | name | roommate | eventid | rn |
---|---|---|---|---|
2 | Joe Blow | Sally Blow | trip12 | 1 |
1 | John Smith | Mary Smith | trip12 | 1 |