Search code examples
oracle-databasegroup-bysumoracle19c

Group by with custom no disjoint groups in Oracle SQL


Oracle 19c

I have a table with more than 20 columns with numeric values and I want to sum them by custom groups based in one column. For example in the next table, I want the sum of Quant_1 and Quant_2 columns, based on groups G1, G2 and G3 where G1 are the rows where Category (numeric values) is <= 2, G2 are the rows where Category is <= 4 even 1 and 2, and G3 are the rows where Category is <= 6.

Name Category Quant_1 Quant_2
name1 1 10 5
name1 1 10 5
name2 2 11 6
name1 3 12 7
name2 4 13 8
name1 5 14 9
name2 6 15 10
name1 7 20 20

So, in this example I want to do the aggregation by Name and my custom group to get this result:

Name Group Quant_1 Quant_2
name1 G1 20 10
name1 G2 32 17
name1 G3 46 26
name2 G1 11 6
name2 G2 24 14
name2 G3 39 24

How do I get those column sums based on these conditions? Thanks in advance for your help.


Solution

  • Thanks to the help of @MTO, I found one solution where I need to create only the number of columns as the number of custom groups. This solution has two versions depending on the version of Oracle.

    The idea is to create new columns, one for each custom group using CASE, and aggregate the quantities in grouping sets of Name and each of the columns that represent the custom groups. And finally UNPIVOT them.

    The difference between versions is because, as @MTO mentioned in his comment, in earlier versions is not possible to use alias columns in the HAVING clause.

    So, for oracle 23c you can use: Fiddle V1

    SELECT Name, grp, quant_1, quant_2
    FROM (
      SELECT name,
             SUM(quant_1) AS quant_1,
             SUM(quant_2) AS quant_2,
             CASE WHEN category <= 2 THEN 'G1' END AS g1,
             CASE WHEN category <= 4 THEN 'G2' END AS g2,
             CASE WHEN category <= 6 THEN 'G3' END AS g3         
      FROM   table_name
      GROUP BY GROUPING SETS ((name, g1), (name, g2), (name, g3))
      HAVING g1 IS NOT NULL or g2 IS NOT NULL or G3 IS NOT NULL
      ORDER BY name, g1, g2, g3
    )
    UNPIVOT (
      VALUE FOR grp IN (
        g1,g2,g3
      )
    )
    

    And for earlier versions, you need to use a subquery, with something like this: Fiddle V2

    SELECT Name, grp, quant_1, quant_2
    FROM (
      SELECT name,
             SUM(quant_1) AS quant_1,
             SUM(quant_2) AS quant_2,
             g1,
             g2,
             g3
      FROM   (
        SELECT name,
             quant_1,
             quant_2,
             CASE WHEN category <= 2 THEN 'G1' END AS g1,
             CASE WHEN category <= 4 THEN 'G2' END AS g2,
             CASE WHEN category <= 6 THEN 'G3' END AS g3         
        FROM   table_name
      )
      GROUP BY GROUPING SETS ((name, g1), (name, g2), (name, g3))
      ORDER BY name, g1, g2, g3
    )
    UNPIVOT (
      VALUE FOR grp IN (
        g1,g2,g3
      )
    )
    

    So, for the sample data:

    CREATE TABLE table_name (Name, Category, Quant_1, Quant_2) AS
    SELECT 'name1', 1, 10,  5 FROM DUAL UNION ALL
    SELECT 'name1', 1, 10,  5 FROM DUAL UNION ALL
    SELECT 'name2', 2, 11,  6 FROM DUAL UNION ALL
    SELECT 'name1', 3, 12,  7 FROM DUAL UNION ALL
    SELECT 'name2', 4, 13,  8 FROM DUAL UNION ALL
    SELECT 'name1', 5, 14,  9 FROM DUAL UNION ALL
    SELECT 'name2', 6, 15, 10 FROM DUAL UNION ALL
    SELECT 'name1', 7, 20, 20 FROM DUAL;
    

    With both methods, your result is going to be:

    NAME GRP QUANT_1 QUANT_2
    name1 G1 20 10
    name1 G2 32 17
    name1 G3 46 26
    name2 G1 11 6
    name2 G2 24 14
    name2 G3 39 24

    Also, you can find the excellent approach of @MTO here Fiddle MTO.