Search code examples
sqlcountsumlooker

How to measure an average count from a set of days each with their own data points, in SQL/LookerML


I have the following table:

id | decided_at        | reviewer
1    2020-08-10 13:00   john
2    2020-08-10 14:00   john
3    2020-08-10 16:00   john
4    2020-08-12 14:00   jane
5    2020-08-12 17:00   jane
6    2020-08-12 17:50   jane
7    2020-08-12 19:00   jane

What I would like to do is get the difference between the min and max for each day and get the total count from the id's that are the min, the range between min and max, and the max. Currently, I'm only able to get this data for the past day.

Desired output:

Date       | Time(h) | Count | reviewer
2020-08-10   3         3       john
2020-08-12   5         4       jane

From this, I would like to get the average show this data over the past x number of days.

Example: If today was the 13th, filter on the past 2 days (48 hours) Output:

reviewer | reviews/hour
jane        5/4 = 1.25

Example 2: If today was the 13th, filter on the past 3 days (48 hours)

reviewer | reviews/hour
john       3/3 = 1
jane       5/4 = 1.25

Ideally, if this is possible in LookML without the use of a derived table, it would be nicest to have that. Otherwise, a solution in SQL would be great and I can try to convert to LookerML.

Thanks!


Solution

  • In SQL, one solution is to use two levels of aggregation:

    select reviewer, sum(cnt) / sum(diff_h) review_per_hour
    from (
        select
            reviewer,
            date(decided_at) decided_date,
            count(*) cnt,
            timestampdiff(hour, min(decided_at), max(decided_at)) time_h
        from mytable
        where decided_at >= current_date - interval 2 day
        group by reviewer, date(decided_at)
    ) t
    group by reviewer
    

    The subquery filters on the date range, aggregates by reviewer and day, and computes the number of records and the difference between the minimum and the maximum date, as hours. Then, the outer query aggregates by reviewer and does the final computation.

    The actual function to compute the date difference varies across databases; timestampdiff() is supported in MySQL - other engines all have alternatives.