I'm looking at an Oracle stored procedure which as the following (paraphrased) sql
select AB.*, count(*) over (PARTITION BY
TO_CHAR(AB.CAN_DATE,'MONTH'),
TO_CHAR(AB.CAN_DATE,'MM'),
'XMEG'
ORDER BY
TO_CHAR(AB.CAN_DATE,'MONTH'),
TO_CHAR(AB.CAN_DATE,'MM'),
'XMEG') from mytable AB;
Apparently this code works OK, but I have several questions
Your query will count the values for each month regardless of year. So all the values in October will be counted together regardless of whether it was October 2023, October 1900 or October 9999. If you want the count for individual months then you should use TRUNC(ab.can-date, 'MM')
or TO_CHAR(ab.can_date, 'YYYY-MM')
rather than TO_CHAR
with just the month.
- surely the to_char (...MM) is redundant, as the to_char(....Month) will do the same job?
TO_CHAR(AB.CAN_DATE,'MONTH')
finds the name of the monthTO_CHAR(AB.CAN_DATE,'MM')
finds the number of the monthSince both are unique to the month then they will have the same effect in the PARTITION BY
clause and, yes, the second one is redundant.
- what is the value of an order by clause in a COUNT(*) OVER? Does it have any effect? Why is it allowed.
If you wanted a cumulative (running) total starting from the earliest date to the last date then you could use an ORDER BY
clause. If you do not have an ORDER BY
clause then the count will be the total over the entire partition.
For example, if you have the sample data:
CREATE TABLE table_name (dt) AS
SELECT DATE '1900-10-01' FROM DUAL UNION ALL
SELECT DATE '1970-10-31' FROM DUAL UNION ALL
SELECT DATE '2023-10-01' FROM DUAL UNION ALL
SELECT DATE '2023-10-05' FROM DUAL UNION ALL
SELECT DATE '2023-10-13' FROM DUAL UNION ALL
SELECT DATE '2023-10-13' FROM DUAL UNION ALL
SELECT DATE '2023-10-28' FROM DUAL UNION ALL
SELECT DATE '9999-10-31' FROM DUAL UNION ALL
SELECT DATE '2023-12-31' FROM DUAL;
Then:
SELECT dt,
TO_CHAR(dt, 'MONTH') AS month_name,
TO_CHAR(dt, 'MM') AS month_number,
COUNT(dt) OVER (PARTITION BY TO_CHAR(dt, 'MONTH')) AS total_by_name,
COUNT(dt) OVER (PARTITION BY TO_CHAR(dt, 'MM')) AS total_by_number,
COUNT(dt) OVER (PARTITION BY TRUNC(dt, 'MM')) AS total_by_month,
COUNT(dt) OVER (PARTITION BY TO_CHAR(dt, 'MONTH') ORDER BY dt)
AS running_total_by_name
FROM table_name
Outputs:
DT | MONTH_NAME | MONTH_NUMBER | TOTAL_BY_NAME | TOTAL_BY_NUMBER | TOTAL_BY_MONTH | RUNNING_TOTAL_BY_NAME |
---|---|---|---|---|---|---|
2023-12-31 00:00:00 | DECEMBER | 12 | 1 | 1 | 1 | 1 |
1900-10-01 00:00:00 | OCTOBER | 10 | 8 | 8 | 1 | 1 |
1970-10-31 00:00:00 | OCTOBER | 10 | 8 | 8 | 1 | 2 |
2023-10-01 00:00:00 | OCTOBER | 10 | 8 | 8 | 5 | 3 |
2023-10-05 00:00:00 | OCTOBER | 10 | 8 | 8 | 5 | 4 |
2023-10-13 00:00:00 | OCTOBER | 10 | 8 | 8 | 5 | 6 |
2023-10-13 00:00:00 | OCTOBER | 10 | 8 | 8 | 5 | 6 |
2023-10-28 00:00:00 | OCTOBER | 10 | 8 | 8 | 5 | 7 |
9999-10-31 00:00:00 | OCTOBER | 10 | 8 | 8 | 1 | 8 |
And you can see that the final column has a running total (where values with the same ORDER BY
are counted together).
In your case, the PARTITION BY
and ORDER BY
clauses are identical so every value in a partition will have the same value and will be ordered identically so the ORDER BY
clause, in your case, is irrelevant and could be removed.
- what does it mean to include a constant ('XMEG') in the partition / order by. I'm guessing it will have no effect?
In this case, it has no effect.
You can do COUNT(*) OVER ()
if you want to count over the entire result set without any partitions. If you wanted to include a PARTITION BY
clause and still get the same effect then you could use COUNT(*) OVER (PARTITION BY 'a constant')
but that is about the only reason I can think to include a constant (and even then you could just omit the PARTITION BY
clause).