Search code examples
mysqlsqlapache-drilldremio

How to back fill data from a sql join only where count of field is equal to 1 while maintaning records from previous joins?


I have two tables I am able to join like so:

select * from 
(select * from table1 t1
left join table2 t2 on t1.id = t2.id )

I want to add a third table where I group by email, to back fill data in the join above but I only want to backfill the data for records that have an email count of 1 only. If there are duplicating email addresses for different records they should be excluded.

I Have been trying this query:

  select * from 
    (select * from table1 t1
    left join table2 t2 on t1.id = t2.id 
    inner join ((select email from table3 group by email
having count(*) =1) t3
     on t3.email = t1.emailaddress)

At this point when I coalesce the email field with others in the bigger join I still see records back filled with data while having email counts greater than 1 being backfilled.

i.e

table from LEFT JOIN only:

email      missing_id 
[email protected] 
[email protected]

table3 data ONLY

email       missing_id
[email protected]        1
[email protected]        2
[email protected]        3

All tables joined where email only occurs once, should back fill the data in the left join like so:

email      missing_id 
[email protected]       
[email protected]        3

Solution

  • First, your first query will return an error in almost any database, because you will have two columns with the same name in the subquery. But I get the idea.

    If I understand correctly, this should do what you want:

    select . . ., t3.id as missing_id
    from table1 t1 left join
         table2 t2
         on t1.id = t2.id left join
         (select t3.email, max(t3.id) as id
          from table3 t3
          group by t3.email
          having count(*) = 1
         ) t3
         on t3.email = t1.emailaddress;
    

    This is very close to your query, so I'm not sure if it will fix anything.