Search code examples
sqlsql-servert-sqlsql-server-2016

Interpolate missing values in a query by date


Given the following query

SELECT
    DATEADD(DAY, DATEDIFF(DAY, 0, [Created]), 0) [Date], 
    [Type], COUNT(*) as [Total]
FROM 
    Submissions 
WHERE 
    [Offer] = 'template1'
GROUP BY 
    DATEADD(DAY, DATEDIFF(DAY, 0, [Created]), 0), 
    [Type]
ORDER BY 1;

I get the following output:

Date                    Type                 Total
----------------------- -------------------- -----------
2021-04-30 00:00:00.000 Online               1
2021-05-01 00:00:00.000 Mail                 1
2021-05-01 00:00:00.000 Online               2
2021-05-10 00:00:00.000 Mail                 1

My goal is to ensure that for each date, both types are summarized. In the event that no rows for a given type exist, I'd like to show 0 instead of missing the row entirely. How can I reform the query so that, for example, 2 rows exist for 2021-04-30, one with type Online as shown, and one with type Mail with a total of 0?

I got it working using something like below, but this seems like a pretty brute force way of going about it.

SELECT [Date], [Type], [Total] FROM
(
    SELECT
        DATEADD(DAY, DATEDIFF(DAY, 0, [Created]), 0) [Date],
        [Type]
    FROM 
        Submissions
    WHERE [Offer] = 'template1'
 ) t1
PIVOT (
    COUNT([Type])
    FOR [Type] in ([Mail],[Online])
) p
UNPIVOT 
(
    [Total] FOR [Type] in ([Mail],[Online])
) p2

This results in what I am looking for:

Date                    Type                Total
----------------------- ------------------- -----------
2021-04-30 00:00:00.000 Mail                0
2021-04-30 00:00:00.000 Online              1
2021-05-01 00:00:00.000 Mail                1
2021-05-01 00:00:00.000 Online              2
2021-05-10 00:00:00.000 Mail                1
2021-05-10 00:00:00.000 Online              0

Solution

  • Even your brute force approach doesn't work if the submission table has no rows for a particular date.

    The standard approach is to use dimension tables to create a template of all the rows you desire, then left join your fact table on to it.

    SELECT
      calendar.date,
      type.label,
      COUNT(fact.id)
    FROM
      calendar
    CROSS JOIN
      type
    LEFT JOIN
      submissions AS fact
        ON  fact.created >= calendar.date
        AND fact.created <  calendar.date + 1
        AND fact.type     = type.label
        AND fact.offer    = 'template1'
    WHERE
        calendar.date BETWEEN ? AND ?
        AND type.label IN ('Mail', 'Online')
    GROUP BY
      calendar.date,
      type.label