I wish to count events that occurred within a custom time interval : it could be within 24h, or within a week or 2-months span.
I am using google sheets: I can create a pivot table and group by month, however I'd like to explore insights using custom intervals (I'm looking for pattern in epilepsy).
As final result, I wanna have a table that, for each day, it is reported the number of frequencies within that interval.
Particularly, I wanna focus on the interval of 24h to count the number of events of epilepsy (known as cluster seizures).
And then, on custom days intervals to explore periodicity or trends - like each 48 hours, or each 15 or 30 days.
See a mockup of Google Sheet here:
I tried this attempts:
= IFERROR(
QUERY(
A:E,
"SELECT COUNT(A)
WHERE
A IS NOT NULL AND
E = FALSE AND
A >= date '" &
TEXT(
A2-30,
"yyyy-MM-dd"
) &"' AND
A <= date '" &
TEXT(
A2,
"yyyy-MM-dd"
) &"'
LABEL COUNT(A) '' "), "N/A")
Then, dragging the cell, I get the column "# events in the prior 30 days".
It works but seems a bit messy - especially for updating the intervals.
=query(B:E, "select B, count(E), -1+count(E) where E = FALSE group by B label B 'Date with Clusters', count(E) 'Cluster seizures '")
That produces the last table.
I like this approach better, but here I am just grouping by the same date, without possibility to have a custom interval.
As an example, I will have that two events will be counted within the same day, not withing the same 24h interval.
Could you tell a better approach to handle datetime differences, so to create binning and group by with custom intervals ?
Below an example: on the left table, data in input; on the middle column, result of first approach; on the right table, results of second approach.
given the table:
in order to group stuff with QUERY we need to "fix" the A column in order to get a custom period. lets say we need to group events every 3 weeks (21 days). we take the lowest and highest date and create a sequence with all the dates in between.
=INDEX(ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A))))
then we use running total on it to get every date which is 21 days apart from the previous/next one. we could use simple SEQUENCE (for min>max) to create this array but with SEQUENCE we cant go "back in time" (for max>min) so we use MMULT and negative number
therefore, to start from a frame of the first date and create 3 weeks group by windows (eg. min>max) we use:
=ARRAYFORMULA({MIN(A2:A); MIN(A2:A)+MMULT(TRANSPOSE((
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))<=TRANSPOSE(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))*21); SiGN(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))})
and to get a reverse of it and start from frame of end date and create 3 weeks windows backwards (eg. max>min) we use:
=ARRAYFORMULA({MAX(A2:A); MAX(A2:A)+MMULT(TRANSPOSE((
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))<=TRANSPOSE(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))*-21); SiGN(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))})
at this stage, we can start fixing the A column via VLOOKUP and 4th argument set to 1
- approximate mode (instead of 0
- exact match mode) so forward in time will be:
=ARRAYFORMULA(IFNA(VLOOKUP(A2:A; SORT({MIN(A2:A); MIN(A2:A)+MMULT(TRANSPOSE((
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))<=TRANSPOSE(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))*21); SIGN(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))}); 1; 1)))
and backward in time shall be:
=ARRAYFORMULA(IFNA(VLOOKUP(A2:A; SORT({MAX(A2:A); MAX(A2:A)+MMULT(TRANSPOSE((
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))<=TRANSPOSE(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))*-21); SIGN(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))}); 1; 1)))
and now we just create a virtual array {}
and pair fixed column A with column C and input it as range into QUERY
side note:
to put columns next to each other in english spreadsheets we use
,
to put columns next to each other in non-english spreadsheets we use
\
=ARRAYFORMULA(QUERY({IFNA(VLOOKUP(A2:A; SORT({MIN(A2:A); MIN(A2:A)+MMULT(TRANSPOSE((
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))<=TRANSPOSE(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))*21); SIGN(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))}); 1; 1))\ C2:C};
"select Col1,count(Col1)
where Col2 = FALSE
group by Col1
order by count(Col1) desc
label count(Col1)''"))
and backwards in time:
=ARRAYFORMULA(QUERY({IFNA(VLOOKUP(A2:A; SORT({MAX(A2:A); MAX(A2:A)+MMULT(TRANSPOSE((
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))<=TRANSPOSE(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))*-21); SIGN(
ROW(INDIRECT(MIN(A2:A)&":"&MAX(A2:A)))))}); 1; 1))\ C2:C};
"select Col1,count(Col1)
where Col2 = FALSE
group by Col1
order by count(Col1) desc
label count(Col1)''"))