Search code examples
snowflake-cloud-data-platformsnowflake-schemasnowflake-connector

Need assistance in rewriting a SQL query


Select ID,LISTAGG(CATEGORY_CD, ',') WITHIN GROUP(ORDER BY CAST(P_NO as INT)) AS CATEGORY_CD
FROM table1
WHERE DATE = '2022-02-02'
AND CATEGORY_FLAG = '0'
AND CAT_SRC_CD NOT IN ('0','1','2','3')
group by ID

I need to rewrite the above query, such that it should be ORDERED BY P_NO and I need to display distinct values for CATEGORY_CD column.

I tried the logic mentioned below and I used ORDER BY P_NO in the sub query. Not sure if the logic I have written is correct.

SELECT ID,LISTAGG(distinct CATEGORY_CD, ',') WITHIN GROUP(ORDER BY CAST(CATEGORY_CD as INT)) AS CATEGORY_CD
 FROM 
      (SELECT distinct ID,CREATE_DT,CATEGORY_CD, 
          TRY_CAST(P_NO as INTEGER) as P_NO
        FROM table1 
        WHERE DATE = '2022-02-02'
        AND CATEGORY_FLAG = '0'
        AND CAT_SRC_CD NOT IN ('0','1','2','3')
        ORDER BY P_NO
      ) 
    GROUP BY 
      ID;

The original query which I have mentioned is giving the below results. I should display only distinct values for CATEGORY_CD column and the results should be ORDERED BY P_NO column.

      ID                CATEGORY_CD
      
      169174408          5,5,5,5
      169174408          2,1,1
      169174408          12,12,12,12,12,12
      839946105          2,2,2,2,3,3,3,3,3,3,3,3

Kindly guide me in implementing the correct logic. Thanks in advance :)

Also, please guide me if I need to add another few columns to my original query. I have added my original SQL query again adding another column.

Select ID,LISTAGG(CATEGORY_CD, ',') WITHIN GROUP(ORDER BY CAST(P_NO as INT)) AS CATEGORY_CD,
LISTAGG(CATEGORY_DTL, ',') WITHIN GROUP(ORDER BY CAST(P_NO as INT)) AS CATEGORY_DTL
FROM table1
WHERE DATE = '2022-02-02'
AND CATEGORY_FLAG = '0'
AND CAT_SRC_CD NOT IN ('0','1','2','3')
group by ID

Solution

  • Using QUALIFY to select distinct CATEGORY_CD per ID:

    WITH cte AS (
        SELECT *
        FROM table1
        WHERE DATE = '2022-02-02'
          AND CATEGORY_FLAG = '0'
          AND CAT_SRC_CD NOT IN ('0','1','2','3')
        QUALIFY ROW_NUMBER() OVER(PARTITION BY ID,CATEGORY_CD ORDER BY P_NO::INT) = 1
    )
    SELECT ID,LISTAGG(CATEGORY_CD, ',') WITHIN GROUP(ORDER BY P_NO::int) AS CATEGORY_CD
    FROM cte
    GROUP BY ID;