Search code examples
sqloracle-databaseoracle12clistaggoracle19c

Remove duplicates from an already aggregated LISTAGG using LISTAGG


I have below two tables

SELECT * FROM TABLE_ONE  
UNION
SELECT * FROM TABLE_TWO

Result

ID|PRODUCTS   |TOTAL_AMOUNT|
--|-----------|------------|
 1|TABLE|CHAIR|           8|
 1|TABLE|TV   |          12|
 2|CUP        |          13|
 2|PLATE      |          14|

Now i want to do a listagg and as well remove duplicates on oracle 19c, therefore i use below query

SELECT ID, listagg(DISTINCT PRODUCTS, '|') within group (order by PRODUCTS)  PRODUCTS, SUM(AMOUNT) FROM (    
SELECT * FROM TABLE_ONE  
UNION
SELECT * FROM TABLE_TWO
) GROUP BY ID

Result i get is

ID|PRODUCTS            |SUM(TOTAL_AMOUNT)|
--|--------------------|-----------------|
 1|TABLE|CHAIR|TABLE|TV|               20|
 2|CUP|PLATE           |               27|

Result that i want is

ID|PRODUCTS            |SUM(TOTAL_AMOUNT)|
--|--------------------|-----------------|
 1|TABLE|CHAIR|TV      |               20|
 2|CUP|PLATE           |               27|

Test data in db <> fiddle Although it is oracle18c, doesn't support distinct in listagg i am using oracle 19c


Solution

  • You can use the below to unlistagg and remove the duplicate and then listagg again.The DB Fiddle here

                WITH data
             AS (SELECT id,
                        Listagg(products, '|')
                          within GROUP (ORDER BY products) PRODUCTS,
                        SUM(amount)                        SUM_AMT
                 FROM   (SELECT *
                         FROM   table_one
                         UNION
                         SELECT *
                         FROM   table_two)
                 GROUP  BY id),
             d2
             AS (SELECT DISTINCT id,
                                 Regexp_substr(products, '[^|]+', 1, column_value) AS
                                 products,
                                 sum_amt
                 FROM   data
                        cross join TABLE(Cast(MULTISET (SELECT LEVEL
                                                  FROM   dual
                                                  CONNECT BY LEVEL <=
                                                  Regexp_count(products,
                                                  '[^|]+'))
                                                           AS
        sys.ODCINUMBERLIST)))
        SELECT id,
               Listagg(products, '|')
                 within GROUP (ORDER BY id) PRODUCTS,
               sum_amt
        FROM   d2
        GROUP  BY id,
                  sum_amt;