Search code examples
sqlsql-serverpivotaggregate-functionsfull-outer-join

How to combine data from 2 tables -- which join, what conditions?


Consider the following 2 tables.

TableDE

ID  country  key1   key2
------------------------
1   US       1      null
1   US       1      null
1   US       1      null
2   US       null   null
3   US       1      1
4   DE       1      1
5   DE       null   null
5   DE       null   null

TableUS

ID  key1  key2
--------------
1   null  null
2   null  1
4   1     1
8   null  1
2   null  1
2   null  1
9   1     null

I need a distinct overview of all IDs, combining data from both tables:

ID  inTableDe  country  DEkey1  DEkey2  inTableUS  USkey1  USKey2
-----------------------------------------------------------------
1   1          US       1       0       1          0       0
2   1          US       0       0       1          0       1
3   1          US       1       1       0          0       0
4   1          DE       1       1       1          1       1
5   1          DE       0       0       0          0       0
8   0          0        0       1       1          0       1
9   0          0        0       1       1          1       0

I hope it speaks for itself:

  • ID 8 and ID 9 have 0 in the first column bc they aren't in tableDE
  • ID 8 and ID 9 have 0 in the country column bc this field doesn't exist in tableUS
  • ID 3 has 0 in inTableUS bc it only exists in tableDE
  • the key values are copied from the original tables
  • an ID is not unique: it can appear many times in both tables. However: the values for key1 and key2 will always be the same for each ID within the same table.

I have been messing for hours now with this; I have this now:

select          de.[ID], 
                de.[country],
                case when (de.[ID] in (select distinct [ID] from [tableDE]) then 1 else 0 end as [inTableDE],
                case when (de.[ID] in (select distinct [ID] from [tableUS]) then 1 else 0 end as [inTableUS],
                de.[key1] as [DEKey1],
                de.[key2] as [DEKey2],
                us.[key1] as [USKey1],
                us.[key2] as [USKey2],
from            dbo.[tableDE] de
full outer join dbo.[tableUS] us on de.[ID] = us.[ID]
where           de.[country] = 'US'
and             (de.[key1] = 1 or de.[key2] = 1 or us.[key1] = 1 or us.[key2] = 1)
group by        de.[ID], us.[ID]

But this keeps giving me only values that are in both tables.

What am I doing wrong?


Solution

  • You sem to want aggregation on top of the full join:

    select          
        coalesce(de.id, us.id)                    as id,
        case when de.id is null then 0 else 1 end as intablede,
        max(de.country)                           as country,
        coalesce(max(de.key1), 0)                 as dekey1,
        coalesce(max(de.key2), 0)                 as dekey2,
        case when us.id is null then 0 else 1 end as intableus,
        coalesce(max(us.key1), 0)                 as uskey1,
        coalesce(max(us.key2), 0)                 as uskey2
    from dbo.tablede de
    full join dbo.tableus us on de.id = us.id
    group by de.id, us.id
    order by id
    

    Demo on DB Fiddle:

    id | intablede | country | dekey1 | dekey2 | intableus | uskey1 | uskey2
    -: | --------: | :------ | -----: | -----: | --------: | -----: | -----:
     1 |         1 | US      |      1 |      0 |         1 |      0 |      0
     2 |         1 | US      |      0 |      0 |         1 |      0 |      1
     3 |         1 | US      |      1 |      1 |         0 |      0 |      0
     4 |         1 | DE      |      1 |      1 |         1 |      1 |      1
     5 |         1 | DE      |      0 |      0 |         0 |      0 |      0
     8 |         0 | null    |      0 |      0 |         1 |      0 |      1
     9 |         0 | null    |      0 |      0 |         1 |      1 |      0