Search code examples
sqlclickhouse

How to count for every day based on date ranges in ClickHouse?


Imagine I have this table:

item state date
A blue 2022-12-27
A red 2022-12-31
B green 2022-12-28
B yellow 2022-12-31
C blue 2022-12-29
D red 2022-12-26

The date column represents the date range of when the item existed. For example, "A" existed from 2022-12-27 to 2022-12-30 in the state "blue" and on the 2022-12-31 in the state "red". Now, I want to have the count of the items per date if they are in the range. The result table should look like this:

date count_item comment
2022-12-26 1 (D)
2022-12-27 1 (A)
2022-12-28 2 (A & B)
2022-12-29 3 (A & B & C)
2022-12-30 2 (A & B)
2022-12-31 2 (A & B)

How can I achieve this in ClickHouse?

I tried to use this code, but it returns the wrong results.

SELECT
    toDate(date) AS d,
    COUNT(item) AS i
FROM my_table
GROUP BY d
ORDER BY d ASC WITH FILL STEP toIntervalDay(1)
INTERPOLATE ( i AS i );

Solution

  • WITH 
      MYTAB AS
    (
      SELECT c1 as item, c2 as state, c3 as date
      FROM VALUES
      (
        ('A', 'blue',   toDate ('2022-12-27'))
      , ('A', 'red',    toDate ('2022-12-31'))
      , ('B', 'green',  toDate ('2022-12-28'))
      , ('B', 'yellow', toDate ('2022-12-31'))
      , ('C', 'blue',   toDate ('2022-12-29'))
      , ('D', 'red',    toDate ('2022-12-26'))
      )
    )
    , D AS
    (
      SELECT toDate (arrayJoin (range (toUInt32 (min (date)), toUInt32 (max (date)) + 1))) AS dt
      FROM MYTAB 
    )
    , G AS 
    (
      SELECT item, MIN (date) AS date_min, MAX (date) AS date_max
      FROM MYTAB
      GROUP BY item
    )
    SELECT 
      D.dt AS date
    , count (1) AS count_item
    , groupArray (G.item) AS comment
    FROM D, G 
    WHERE D.dt BETWEEN G.date_min AND G.date_max
    GROUP BY D.dt
    ORDER BY D.dt
    FORMAT PrettyCompact
    ;
    

    The result is:

    +-------date-+-count_item-+-comment-------+
    | 2022-12-26 |          1 | ['D']         |
    | 2022-12-27 |          1 | ['A']         |
    | 2022-12-28 |          2 | ['B','A']     |
    | 2022-12-29 |          3 | ['B','C','A'] |
    | 2022-12-30 |          2 | ['B','A']     |
    | 2022-12-31 |          2 | ['B','A']     |
    +------------+------------+---------------+
    

    fiddle