Search code examples
sqlteradataproc-sql

Find differences across two datasets


I would like to get information on users that are in a table but not in another one. Sample of data:

Table_1
user1 user2 flag_ind
Ambra Burn    1
Ambra Downs   0
Chris Leopold 1
Martin Stewart 0



Table 2
user1 user2 flag_ind
Ambra Burn    0
Greg  See     1
Curtis John   0


Expected output

    user1 user2 
    Ambra Downs    
    Chris Leopold 
    Martin Stewart 
    Greg See
    Curtis John

I used a left join to get the differences between the two datasets. I would need to compare values of user1 from Table 1 and values of user1 from Table 2, as well as values of user2 from Table 1 and values of user2 from Table 2 and get a final list. Now, what I have found is a difference in the missing values when I use a left join vs. minus. Could you help me to get the desired output and understand why left join and minus reproduce different values?

Thanks


Solution

  • I don't understand what you mean by left join minus. It's better that you can show your sql statements. But a solution could be follow:

    (SELECT *
    FROM Table_1
    EXCEPT DISTINCT
    SELECT *
    FROM Table_2)
    
    UNION DISTINCT
    
    (SELECT *
    FROM Table_2
    EXCEPT DISTINCT
    SELECT *
    FROM Table_1)
    

    The first part before UNION DISTINCT gives you what's in table_1 that's not in table_2.

    The later part after UNION DISTINCT gives what's in table_2 that are not in table_1.

    And you combine with UNION DISTINCT, you got what's differnece between the two tables.