I have a tasks in BigQuery with created date and last modified date. I would like to be able to report the number of task open and task close events by date in the same table if possible.
view: tasks {
derived_table: {
sql:
SELECT *
FROM UNNEST(ARRAY<STRUCT<CREATED_DATE DATE, LAST_MODIFIED DATE, ID INT64, STATE STRING>>[
('2020-12-01', '2020-12-01', 1, "OPEN"),
('2020-12-01', '2020-12-03', 2, "CLOSED"),
('2020-12-02', '2020-12-03', 3, "CLOSED"),
('2020-12-03', '2020-12-05', 4, "OPEN"),
('2020-12-05', '2020-12-05', 5, "CLOSED")])
;;
}
dimension_group: created {
type: time
datatype: date
sql: ${TABLE}.created_date ;;
}
dimension_group: last_modified {
type: time
datatype: date
sql: ${TABLE}.last_modified ;;
}
dimension: id {
type: number
}
dimension: state {
type: string
}
measure: number_of_tasks {
type: count_distinct
sql: ${id} ;;
}
measure: number_of_open_tasks {
type: count_distinct
sql: ${id} ;;
filters: {
field: "state"
value: "OPEN"
}
}
measure: number_of_closed_tasks {
type: count_distinct
sql: ${id} ;;
filters: {
field: "state"
value: "CLOSED"
}
}
}
explore: tasks {}
I can get the number of opened tasks using the created date.
I can get the number of tasks closed by counting tasks, where the last modified date is in the aggregating period and status is closed, with a filtered measure.
However, if I try to combine these in a single table I get a row for each combination of dates.
How can I count task state changes by date?
Date | Number of Opened Tasks | Number of Closed Tasks |
---|---|---|
2020-12-01 | 2 | 0 |
2020-12-02 | 1 | 0 |
2020-12-03 | 1 | 2 |
2020-12-04 | 0 | 0 |
2020-12-05 | 1 | 1 |
A colleague has suggested a solution. Stacking the tasks table on itself creates (up to) two rows per task.
view: tasks {
derived_table: {
sql:
WITH tab AS (
SELECT *
FROM UNNEST(ARRAY<STRUCT<CREATED_DATE DATE, LAST_MODIFIED DATE, ID INT64, STATE STRING>>[
('2020-12-01', '2020-12-01', 1, "OPEN"),
('2020-12-01', '2020-12-03', 2, "CLOSED"),
('2020-12-02', '2020-12-03', 3, "CLOSED"),
('2020-12-03', '2020-12-05', 4, "OPEN"),
('2020-12-05', '2020-12-05', 5, "CLOSED")])
)
SELECT *, 1 open_count, 0 closed_count, created_date AS action_date
FROM tab
UNION DISTINCT
SELECT *, 0 open_count, 1 closed_count, last_modified AS action_date
FROM tab
WHERE state = "CLOSED"
;;
}
dimension_group: created {
type: time
datatype: date
sql: ${TABLE}.created_date ;;
}
dimension_group: last_modified {
type: time
datatype: date
sql: ${TABLE}.last_modified ;;
}
dimension_group: action {
type: time
datatype: date
sql: ${TABLE}.action_date ;;
}
dimension: id {
type: number
}
dimension: state {
type: string
}
dimension: open_count {
type: number
hidden: yes
}
dimension: closed_count {
type: number
hidden: yes
}
measure: number_opened{
type: sum
sql: ${open_count} ;;
}
measure: number_closed {
type: sum
sql: ${closed_count} ;;
}
}
explore: tasks {}
The opened and closed tags can then be counted.