I have a dataset where it is somewhat common for fields to have NULL
as a valid value. This causes an issue when I want to use the ROLLUP
operator in MySQL, as I can't distinguish between the NULL
values it generates as part of its subtotals/totals and the actual NULL
values in the data.
My current query is as follows:
SELECT
COALESCE(car_score, "Total") AS car_score,
COUNT(DISTINCT id) AS volume
FROM cars_table
GROUP BY
car_score ASC WITH ROLLUP;
This provides me with the following table:
cars_score | volume
---------------------------
Total | 500
1 | 100
2 | 200
3 | 300
4 | 400
5 | 500
Total | 2000
when I'd like it to be:
cars_score | volume
---------------------------
NULL | 500
1 | 100
2 | 200
3 | 300
4 | 400
5 | 500
Total | 2000
This is a simple example, and it becomes more frustrating once I have multiple dimensions for the ROLLUP
. The reason I can't just change the NULL
value before to something else is that I also need to be able to aggregate the data in other parts of the application, so having a proper NULL
is important to me.
One option would be to wrap with a subquery which first replaces the actual NULL
values which indicate missing data. Then, use COALESCE()
as you were to replace the NULL
from the rollup with the string "Total"
:
SELECT
COALESCE(t.car_score, 'Total') AS car_score,
COUNT(DISTINCT t.id) AS volume
FROM
(
SELECT COALESCE(cars_score, 99) AS car_score, id
FROM cars_table
) t
GROUP BY t.car_score WITH ROLLUP
Here I have used 99
as a placeholder to indicate car scores which were missing. You can use any placeholder you want, other than NULL
.