Search code examples
google-bigqueryconditional-statementswhere-clause

BigQuery WHERE statement filter out pairs of values


I have a table with multiple rows and the following fields:

Visitor Host Event Time

The table lists visitors to a building. Visitor is the name of a visitor, Host is the company they are visiting, and Event is either 'IN' or 'OUT' indicating that the visitor has entered or exited the building at Time

Visitor Host Event Time
Alan GraphicsMan IN 12:34
Alan GraphicsMan OUT 13:44
Barry CodeMan IN 14:31
Carla CodeMan IN 15:10
Danielle SecureMan IN 15:13
Carla CodeMan OUT 15:31
Carla SecureMan IN 15:35

I want my query to return all visitors who are currently visiting and who they are visiting. In other words, I want all visitor-host pairs that have an 'IN' event but do not have an 'OUT' event. For the example table above, the query should return:

Visitor Host Event Time
Barry CodeMan IN 14:31
Danielle SecureMan IN 15:13
Carla SecureMan IN 15:35

The closest I've gotten is the following:

with entrances as (SELECT visitor FROM table where event = 'IN'), 
exits as (SELECT visitor FROM table WHERE event = 'OUT'),
SELECT * from table
WHERE visitor IN entrances AND visitor NOT IN exits

which returns

Visitor Host Event Time
Barry CodeMan IN 14:31
Danielle SecureMan IN 15:13

Carla will be excluded because she went in and out of the building before going back in to visit a different host. I'm trying to think of a way to check pairs of values (i.e. a visitor-host pair) in the where clause.


Solution

  • Assuming OUT is always timestamped after IN, COUNT can be used to find cases where IN does not have a corresponding OUT.

    SELECT visitor, host, time
    FROM (
        SELECT
            visitor, host,
            COUNTIF(event='IN') ins, COUNTIF(event='OUT') outs,
            MAX(time) time
        FROM table
        GROUP BY visitor, host
        HAVING ins > outs)
    

    It should return:

    Row visitor host time
    1 Barry CodeMan 14:31:00
    2 Carla SecureMan 15:35:00
    3 Danielle SecureMan 15:13:00