Search code examples
oracle-databasegroup-bycube

Oracle SQL group by cube with distinct ID


Example data (complete table has more columns and millions of rows):

invoice_number |year            |department      |euros
-------------------------------------------------------------
1234           |2010            |1               | 200
1234           |2011            |1               | 200
1234           |2011            |2               | 200           
4567           |2010            |1               | 450
4567           |2010            |2               | 450
4567           |2010            |3               | 450

My Objective:

I want to sum the euros for every year and every department in every possible combination.

How result should look:

year             |department         |euros
--------------------------------------------
2010             |1                  |650
2010             |2                  |450
2010             |3                  |450
2010             |(null)             |650
2011             |1                  |200
2011             |2                  |200
(null)           |1                  |650
(null)           |2                  |650
(null)           |3                  |450
(null)           |(null)             |650

My query:

select      year
,           department
,           sum(euros)
from        table1
group by    cube    (
                    year
            ,       department
                    )

Problem:

One invoice number can occur in several categories. For example, one invoice can have items from 2010 and 2011. This is no problem when I want to show the data per year. However, when I want the total over all years the euros will be summed twice, one time for each year. I want the functionality of 'group by cube' but I want to sum only distinct invoice numbers for aggregations.

Problem table:

year             |department         |euros
--------------------------------------------
2010             |1                  |650
2010             |2                  |450
2010             |3                  |450
2010             |(null)             |1550
2011             |1                  |200
2011             |2                  |200
(null)           |1                  |850
(null)           |2                  |650
(null)           |3                  |450
(null)           |(null)             |1950

Is it possible to do what I want? So far my search has yielded no results. I have created a SQL Fiddle, I hope it works


Solution

  • [Removed previous "solution"]

    New attempt: here is quite an ugly solution, but it seems to work, even when two invoices have the same amount. With two table accesses, you should check if performance is acceptable.

    SQL> with table1_cubed as
      2  ( select year
      3         , department
      4         , grouping_id(year,department) gid
      5      from table1
      6     group by cube(year,department)
      7  )
      8  , join_distinct_invoices as
      9  ( select distinct x.*
     10         , r.invoice_number
     11         , r.euros
     12      from table1_cubed x
     13           inner join table1 r on (nvl(x.year,r.year) = r.year and nvl(x.department,r.department) = r.department)
     14  )
     15  select year
     16       , department
     17       , sum(euros)
     18    from join_distinct_invoices
     19   group by year
     20       , department
     21       , gid
     22   order by year
     23       , department
     24  /
    
          YEAR DEPARTMENT           SUM(EUROS)
    ---------- -------------------- ----------
          2010 1                           650
          2010 2                           450
          2010 3                           450
          2010                             650
          2011 1                           200
          2011 2                           200
          2011                             200
               1                           650
               2                           650
               3                           450
                                           650
    
    11 rows selected.