Search code examples
sqlgoogle-bigquerydata-analysismetabase

Calculate Time spent by users on the platform


I'm trying to calculate the time spent by users on my platform on a weekly basis. I have the following api_request_logs table (*Note: each row is duplicate)

time                            user_id       url 
------------------------        -------       ----
January 6,2020, 08:56 PM           14           api\v3\user1
January 6,2020, 08:58 PM           14           api\v3\session 
January 6,2020, 08:59 PM           14           api\v3\notification 
January 6,2020, 09:01 PM           14           api\v3\announcements
January 6,2020, 09:02 PM           14           api\v3\user2 
January 6,2020, 09:28 PM           14           api\v3\session

[enter image description here][1]

I tried this query, however, it doesn't seems it is outputting correct calculation .

select  SUM(timespent) timespent, a.user_id as user
from (
    select  timestamp_diff(max(time),min(time),minute) timespent,  DATE_TRUNC(EXTRACT(date from time), day) as Time, a.user_id as user
    from (
    select * , (countif (start) over (order by time)) as  grp from (
    select *, ifnull (lag(url) over(order by time), '') !=url as start from logs.api_request_logs a
    INNER JOIN logs.cohort_members on a.user_id = cohort_members.user_id
    INNER JOIN `logs.cohorts` on cohort_members.cohort_id = `logs.cohorts`.cohort_id
    INNER Join logs.users on cohort_members.user_id = users.user_id
    WHERE a.user_id = 14  and time > timestamp ('2020-01-01 00:00:00') 
    )) group by user, grp)
    group by 2
    order by 2 asc 

Current output:

timespent     users        Time 
---------     -------      -----
32 min         14       January 6,2020, 12:00 AM

Desired output :

Timespent    users       Time 
----------  --------     -----
  6 min      14          January 6,2020, 12:00 AM 
    

What I'm doing wrong or do you have a better idea to do it? Thanks


Solution

  • I think you just want to group by your url as well ... try the following :-)

    #standardsql :

    SELECT
      SUM(timespent)   timespent,
      a.user_id AS user
    FROM (
      SELECT
        TIMESTAMP_DIFF(MAX(time),MIN(time),minute) timespent,
        DATE_TRUNC(EXTRACT(date  FROM  time), day) AS Time,
        a.user_id AS user
        ,url
      FROM (
        SELECT
          *,
          (countif (start) OVER (ORDER BY time)) AS grp
        FROM (
          SELECT
            *,
            ifnull (LAG(url) OVER(ORDER BY time),
              '') !=url AS start
          FROM
            logs.api_request_logs a
          INNER JOIN
            logs.cohort_members
          ON
            a.user_id = cohort_members.user_id
          INNER JOIN
            `logs.cohorts`
          ON
            cohort_members.cohort_id = `logs.cohorts`.cohort_id
          INNER JOIN
            logs.users
          ON
            cohort_members.user_id = users.user_id
          WHERE
            a.user_id = 14
            AND time > timestamp ('2020-01-01 00:00:00') ))
      GROUP BY
        user,
        gr,url)
    GROUP BY
      2
    ORDER BY
      2 ASC