Search code examples
sqlhive

SQL/Hive: Finding total number of users in a room when a user first joins a room


Data preview:

user room start_time leave_time 
  a1    A     07:44      08:02   
  b2    A     07:45      07:50
  c3    A     07:49      08:05 
  d4    A     08:03      08:05

Is there a way to add a column called 'total_num_user' counting total number of users in a room when each person first joined the room?

Desired output:

user room start_time leave_time total_num_user 
  a1    A     07:44      08:02         1
  b2    A     07:45      07:50         2
  c3    A     07:49      08:05         3
  d4    A     08:03      08:05         2 

I tried writing a pseudocode that count when start_time of other user < start_time of user and end_time of other user > end_time of user group by room, but stuck on how to actually implement this in SQL.


Solution

  • This is a little more than you wanted

    create table timeTable (username varchar(10), room varchar(10),start_time time, leave_time time, chek_total_num_user int);
    insert into timeTable values
     ('a1','A','07:44','08:02',1)
    ,('b2','A','07:45','07:50',2)
    ,('c3','A','07:49','08:05',3)
    ,('d4','A','08:03','08:05',2)
    ;
    with t1 as(
    select  username,room,start_time as event_time, 1 InOut
    from timeTable
    union all
    select  username,room,leave_time as event_time, -1 InOut
    from timeTable
    )
    select  * 
      ,sum(InOut)over(partition by room order by event_time) total_num_user_at_event
    from t1
    order by event_time
    
    username room event_time InOut total_num_user_at_event
    a1 A 07:44:00.0000000 1 1
    b2 A 07:45:00.0000000 1 2
    c3 A 07:49:00.0000000 1 3
    b2 A 07:50:00.0000000 -1 2
    a1 A 08:02:00.0000000 -1 1
    d4 A 08:03:00.0000000 1 2
    c3 A 08:05:00.0000000 -1 0
    d4 A 08:05:00.0000000 -1 0