So my SQL output is:
DUNF 2021-04-01 18
DUNF 2021-04-02 17
DUNF 2021-04-03 7
DUNF 2021-04-04 10
DUNF 2021-04-05 18
DUNF 2021-04-06 20
DUNF 2021-04-07 17
DUNF 2021-04-08 14
LEEDS 2021-04-01 4
LEEDS 2021-04-02 4
LEEDS 2021-04-03 5
LEEDS 2021-04-05 9
LEEDS 2021-04-06 3
LEEDS 2021-04-07 1
LEEDS 2021-04-08 3
etc.
But I need to have an entry for each day, even if the final field is a volume (sometimes nothing happens in that location on that day).
I've tried everything, but my SQL 5.7 knowledge is weak. I can generate a calendar table, but it doesn't fill the gaps in each site (you can see that Leeds is missing a day, but there are other sites that are missing dates too).
The code I'm using so far is:
SELECT location, begin_date, count(*) as volume
FROM abs_raw_data
WHERE begin_date >= (SELECT date_sub(max(begin_date), interval 7 day) from abs_raw_data)
GROUP BY location, begin_date
ORDER BY location asc, begin_date asc
How would I even go about doing this?
It seems that you need
SELECT locations.location,
calendar.begin_date,
COALESCE(COUNT(abs_raw_data.location), 0) as volume
FROM ( {calendar generating subquery} ) AS calendar
CROSS JOIN ( SELECT DISTINCT location
FROM abs_raw_data ) locations
CROSS JOIN ( SELECT MAX(begin_date) - INTERVAL 7 DAY AS begin_date
FROM abs_raw_data ) maxdate
LEFT JOIN abs_raw_data ON calendar.begin_date = abs_raw_data.begin_date
AND locations.location = abs_raw_data.location
AND abs_raw_data.begin_date >= maxdate.begin_date
GROUP BY locations.location, calendar.begin_date
ORDER BY locations.location ASC, calendar.begin_date ASC
The calendar generating subquery may be, for example,
SELECT abs_raw_data.begin_date - INTERVAL nums.n DAY
FROM ( SELECT MAX(begin_date) begin_date
FROM abs_raw_data) abs_raw_data
CROSS JOIN ( SELECT 0 n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 ) nums
-- https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=49ef261eaf01956973d4ce99f6203a91