Search code examples
sqlteradata-sql-assistantteradatasql

Removing duplicate row using UNION in Teradata SQL


I'm using Teradata sql to extract data using UNION.

SEL CAST(a.dttm AS DATE), count(a.cs) FROM  cin.cell a
LEFT JOIN cin.comm c ON a.cs_sk = c.cs_sk
LEFT JOIN CIN.CID d ON a.cn_cd = d.CN_CD
WHERE CAST(a.dttm AS DATE) >= CURRENT_DATE-10
GROUP BY 1
UNION 
SEL CAST(a.dttm AS DATE), count(a.cs) FROM  cin_ps.cell a
LEFT JOIN cin_ps.comm c ON a.cs_sk = c.cs_sk
LEFT JOIN CIN_ps.CID d ON a.cn_cd = d.CN_CD
WHERE CAST(a.dttm AS DATE) >= CURRENT_DATE-10
GROUP BY 1

but I'm getting duplicate rows in first column as below Please note, there might be the case there is no row for any particular day for any first set of table or second set of table

Current result:

N.  PROCESSED_DTTM  Count(cs)
1   4/8/2022    40
2   4/8/2022    66
3   4/9/2022    49
4   4/9/2022    71
5   4/10/2022   117
6   4/10/2022   1430
7   4/11/2022   261
8   4/11/2022   841

Required results:

N.  PROCESSED_DTTM  Count(cs)
1   4/8/2022    106
2   4/9/2022    120
5   4/10/2022   1547
7   4/11/2022   1102

Solution

  • You're not getting duplicates, you are have non-unique results in both sets. If you were to encapsulate the union in a subquery and select distinct you will receive the same set of data. What you want to do is aggregate the data using SUM on the Count column:

    SELECT  PROCESSED_DTTM,  
            SUM([Count(cs)]) [Count(cs)]
    FROM
    (
        SEL       CAST(a.dttm AS DATE) PROCESSED_DTTM, 
                  count(a.cs) [Count(cs)] 
        FROM      cin.cell a
        LEFT JOIN cin.comm c ON a.cs_sk = c.cs_sk
        LEFT JOIN CIN.CID d ON a.cn_cd = d.CN_CD
        WHERE     CAST(a.dttm AS DATE) >= CURRENT_DATE-10
        GROUP BY  1
        UNION ALL
        SEL       CAST(a.dttm AS DATE), count(a.cs) 
        FROM      cin_ps.cell a
        LEFT JOIN cin_ps.comm c ON a.cs_sk = c.cs_sk
        LEFT JOIN CIN_ps.CID d ON a.cn_cd = d.CN_CD
        WHERE     CAST(a.dttm AS DATE) >= CURRENT_DATE-10
        GROUP BY  1
    ) AS TBL1
    GROUP BY PROCESSED_DTTM