Search code examples
sqldatabaset-sqlsap-ase

SQL find identical group


Given a table like:

id    key   val
----  ----  -----
bob   hair  red
bob   eyes  green

And another table like:

id    key   val
----  ----  -----
fred  hair  red
fred  eyes  green
fred  shoe  42
joe   hair  red 
joe   eyes  green
greg  eyes  blue
greg  hair  brown

I'd like to find people in table b who match people in table a exactly, in this case Bob and Joe. Fred doesn't count because he also has a shoe size. This is in Sybase so there's no full outer join. I've come up with a select of a select with a union that returns people who definitely aren't the same, but I'm not sure how to efficiently select people who are.

Alternatively, if it's simpler, how can I check which groups in a occur in b more than once?


Solution

  • Try this

    select a.id,b.id
    from a 
    join b on a.[key] = b.[key] and a.val = b.val -- match all rows
    join (select id,count(*) total from a group by id) a2 on a.id = a2.id -- get the total keys for table a per id
    join (select id,count(*) total from b group by id) b2 on b.id = b2.id -- get the total keys for table b per id
    group by a.id,b.id,a2.total,b2.total
    having count(*) = a2.total AND count(*) = b2.total -- the matching row's total should be equal with each tables keys per id
    

    After @t-clausen.dk comments I made a revision of the original sql code. In this case i count each distinct pair/value that matches on both tables, with each tables distinct pair/value.

    select td.aid,td.bid
    from (
    select a.id as aid,b.id as bid, count(distinct a.[key]+' '+a.val) total
    from a 
    join b on a.[kry] = b.[key] and a.val = b.val
    group by a.id,b.id
    ) td -- match all distinct attribute rows
    join (select id,count(distinct [key]+' '+val) total from a group by id) a2 on td.aid = a2.id -- get the total distinct keys for table a per id
    join (select id,count(distinct [key]+' '+val) total from b group by id) b2 on td.bid = b2.id -- get the total keys for table b per id
    where td.total = a2.total AND td.total = b2.total -- the matching distinct attribute total should be equal with each tables distinct key-val pair
    
    Tested on
    
    Table a
    
    bob     hair    red
    bob     eyes    green
    nick    hair    red
    nick    eyes    green
    nick    shoe    45
    
    Table b
    
    fred    hair    red
    fred    eyes    green
    joe     hair    red
    joe     eyes    green
    fred    shoe    42