Say I had the following table
month region revenue
------ -------- ----------
jan north 100
feb north 150
mar north 250
How would I be able to query the above table to get the following results?:
month region revenue
------ -------- ----------
jan north 100
feb north 150
mar north 250
apr north 0
may north 0
jun north 0
0's can be null's or vice versa. Essentially trying to add empty/null fields into my query (in this case the apr, may, jun rows). Any help would be much appreciated
Thanks
Below is for BigQuery Legacy SQL, but please be advised - it is strongly recommended bu BigQuery Team to migrate to BigQuery Standard SQL
Below example should give you an idea
#legacySQL
SELECT
months.month_abr AS month_abr,
regions.region AS region,
COALESCE(revenues.revenue, 0) revenue
FROM months
CROSS JOIN (
SELECT region FROM revenues
) regions
LEFT JOIN revenues
ON months.month_abr = revenues.month_abr
AND regions.region = revenues.region
-- ORDER BY regions.region, months.month_number
where revenues
is your original table with revenues data, month
is a table (or you can use subquery as in below example) with list of month
You can test / play with above using below example with dummy data from your question
#legacySQL
SELECT
months.month_abr AS month_abr,
regions.region AS region,
COALESCE(revenues.revenue, 0) revenue
FROM (
SELECT month_number, month_abr FROM
(SELECT 1 month_number, 'jan' month_abr),
(SELECT 2 month_number, 'feb' month_abr),
(SELECT 3 month_number, 'mar' month_abr),
(SELECT 4 month_number, 'apr' month_abr),
(SELECT 5 month_number, 'may' month_abr),
(SELECT 6 month_number, 'jun' month_abr)
) AS months
CROSS JOIN (
SELECT region FROM (
SELECT region FROM
(SELECT 'jan' month_abr, 'north' region, 100 revenue),
(SELECT 'feb' month_abr, 'north' region, 150 revenue),
(SELECT 'mar' month_abr, 'north' region, 250 revenue)
) GROUP BY region
) regions
LEFT JOIN (
SELECT month_abr, region, revenue FROM
(SELECT 'jan' month_abr, 'north' region, 100 revenue),
(SELECT 'feb' month_abr, 'north' region, 150 revenue),
(SELECT 'mar' month_abr, 'north' region, 250 revenue)
) AS revenues
ON months.month_abr = revenues.month_abr
AND regions.region = revenues.region
ORDER BY regions.region, months.month_number
with result as below
Row month_abr region revenue
1 jan north 100
2 feb north 150
3 mar north 250
4 apr north 0
5 may north 0
6 jun north 0
And finally - below is how same can look like for BigQuery Standard SQL
#standardSQL
WITH regions AS (
SELECT DISTINCT region FROM revenues
), months AS (
SELECT EXTRACT(MONTH FROM month) month_number,
LOWER(FORMAT_DATE('%b', month)) month_abr
FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2010-01-01', DATE '2010-12-01', INTERVAL 1 MONTH)) month
)
SELECT month_abr, region, COALESCE(revenues.revenue, 0) revenue
FROM months
CROSS JOIN regions
LEFT JOIN revenues
USING(month_abr, region)
ORDER BY region, month_number
You can be test, play with this using dummy data from your questions
#standardSQL
WITH revenues AS (
SELECT 'jan' month_abr, 'north' region, 100 revenue UNION ALL
SELECT 'feb', 'north', 150 UNION ALL
SELECT 'mar', 'north', 250
), regions AS (
SELECT DISTINCT region FROM revenues
), months AS (
SELECT EXTRACT(MONTH FROM month) month_number,
LOWER(FORMAT_DATE('%b', month)) month_abr
FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2010-01-01', DATE '2010-12-01', INTERVAL 1 MONTH)) month
)
SELECT month_abr, region, COALESCE(revenues.revenue, 0) revenue
FROM months
CROSS JOIN regions
LEFT JOIN revenues
USING(month_abr, region)
ORDER BY region, month_number
You should be able to apply above to your real use case