Search code examples
sqlteradatateradata-sql-assistant

Comparison of two tables in Teradata


I'd need to compare two tables in Teradata SQL.

A sample of the two tables is

Table 1

User  Date          Ind1    Ind2
12   2010-05-23     Y       N
12   2010-06-23     Y       Y
12   2010-07-23     Y       N
23   2010-04-23     N       N
23   2010-05-23     N       N
23   2010-06-12     Y       N

Table 2

User  Date          Ind1    Ind2
12   2010-05-23     Y       N
12   2010-06-23     Y       Y
12   2010-07-23     N       N
24   2010-03-22     N       N
23   2010-04-23     N       Y
23   2010-05-23     N       Y
23   2010-06-12     Y       N

In this example, Table 1 differs from Table 2 because it has not the user 24 and also some values in Ind1 and Ind2 have changed. I usually compare data outside Teradata as I am not familiar with the code. Since the tables are very big, I'd like to know if there is a way to compare directly these tables there (for example, using join for users; I'm interested in looking at cases where there are more Ind1=Y in Table 1).

Output expected:

Table 1 (where Ind1 and Ind2 have at least 1 Y)
    User  Date          Ind1    Ind2
    12   2010-05-23     Y       N
    12   2010-06-23     Y       Y
    12   2010-07-23     Y       N
    23   2010-06-12     Y       N

Table 2

User  Date          Ind1    Ind2
12   2010-06-23     Y       Y      /* it differs from Ind2 */
12   2010-07-23     N       N      /* it differs from both indicators */
24   2010-03-22     N       N     /* Not included in Table 1 */
23   2010-04-23     N       Y     /* it differs from Ind2 */
23   2010-05-23     N       Y     /* it differs from Ind2 */ 
23   2010-06-12     Y       N     /* it differs from Ind1 */

Solution

  • A typical method for comparing such tables would use full join:

    select t1.*, t2.*
    from table1 t1 full join
         table2 t2
         on t1.user = t2.user and t1.date = t2.date and
            t1.ind1 = t2.ind1 and t1.ind2 = t2.ind2
    where t1.user is null or t2.user is null;
    

    This returns the rows that are in each table that are not in the other. Note: It doesn't handle duplicate rows very well -- 1 duplicate in one table matches any number in the other table, for instance.