I have a query that counts total messages sent by day, and groups them by gender.
If on a day there were no messages sent to males, there will be no row for male.
How do I insert a count value of 0 for males in this case?
Query:
SELECT d.date dated, count(se.id), gender
FROM (
select to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD') AS date
FROM generate_series(0, 365, 1) AS offs
) d
LEFT OUTER JOIN (
SELECT id, customer_id, client_report.insert_time, profile.gender, profile.house_income, profile.address AS postcode, profile.age AS age_group, profile.is_employed, profile.is_married, profile.no_children, profile.no_cars, profile.shopping_frequency
FROM common.client_report
JOIN common.profile
ON client_report.profile_id = profile.uuid
WHERE sms_status = 'SUCCESS'
) se
ON (d.date=to_char(date_trunc('day', se.insert_time), 'YYYY-MM-DD'))
WHERE customer_id = 25::int AND d.date::date BETWEEN '2017-07-03'::date AND '2017-08-01'::date
GROUP BY d.date, gender
ORDER BY d.date ASC
I want the results to be consistent, such as:
day, count, gender
2017-07-01, 10, Female
2017-07-01, 5, Male
2017-07-02, 0, Female
2017-07-02, 8, Male
For the sake of simplicity, let's assume you've made a table with genders
:
CREATE TABLE genders
(
gender text primary key
) ;
INSERT INTO genders (gender)
VALUES
('MALE'),
('FEMALE') ;
If you want all dates/genders in your answer, you need to make a cross join
of all possible dates and genders, then LEFT JOIN
it with the data that actually answers your question:
SELECT
calendar.cal_date AS date, coalesce(gender_count, 0) AS gender_count, genders.gender
FROM
-- Virtual table with all dates/genders
genders
CROSS JOIN
(SELECT
generate_series(/*now()*/ date '2017-07-10' - interval '7 days' /* or '1 year' */,
/*now()*/ date '2017-07-10',
interval '1 day')::date AS cal_date
) AS calendar
-- The table with the actual data
LEFT JOIN
(
SELECT
date, gender, count(gender) AS gender_count
FROM
client_report
JOIN profile ON profile.uuid = client_report.profile_id
WHERE
-- conditions to actually select data
customer_id = 25
AND sms_status = 'SUCCESS'
GROUP BY
date, gender
) AS counts
ON counts.date = calendar.cal_date AND counts.gender = genders.gender
ORDER BY
calendar.cal_date, genders.gender ;
You'll get an answer like:
date | gender_count | gender :--------- | -----------: | :----- 2017-07-03 | 1 | FEMALE 2017-07-03 | 2 | MALE 2017-07-04 | 2 | FEMALE 2017-07-04 | 0 | MALE 2017-07-05 | 0 | FEMALE 2017-07-05 | 0 | MALE 2017-07-06 | 0 | FEMALE 2017-07-06 | 2 | MALE 2017-07-07 | 2 | FEMALE 2017-07-07 | 0 | MALE 2017-07-08 | 0 | FEMALE 2017-07-08 | 1 | MALE 2017-07-09 | 0 | FEMALE 2017-07-09 | 0 | MALE 2017-07-10 | 0 | FEMALE 2017-07-10 | 0 | MALE
Note that I haven't taken a full year, but just 8 days.
You can see the full example with sample data at dbfiddle here
NOTE: I have ignored the insert_date, and concentrated only on the date, because I don't know which exact meaning each has. I think your point was how to use a cross join, the rest is not that relevant to the scenario.