Search code examples
sqllocationsnowflake-cloud-data-platformlatitude-longitude

SQL: Determine time spent at locations (Snowflake)


I want to determine time spent at locations when an object is at a location that repeats. Any ideas how to derive that from this data sample using three ObjectID over two days? Thx

ObjectID DateTime Lat Lon
23 5/2/2021 12:00 40.11 -30.34
23 5/2/2021 16:00 40.11 -30.34
23 5/2/2021 23:00 40.11 -30.34
23 5/3/2021 12:00 40.11 -30.34
23 5/3/2021 16:00 39.88 -29.00
23 5/3/2021 23:00 40.11 -30.34
24 5/2/2021 12:00 40.11 -30.34
24 5/2/2021 16:00 40.11 -30.34
24 5/2/2021 23:00 39.88 -29.00
24 5/3/2021 12:00 39.88 -29.00
24 5/3/2021 16:00 40.11 -30.34
24 5/3/2021 23:00 39.88 -29.00
25 5/2/2021 12:00 40.11 -30.34
25 5/2/2021 16:00 39.88 -29.00
25 5/2/2021 23:00 40.11 -30.34
25 5/3/2021 12:00 39.88 -29.00
25 5/3/2021 16:00 40.11 -30.34
25 5/3/2021 23:00 40.11 -30.34

Desired Output shoule be the ObjectID, the earliest DateTime in the location, and the Time in that location:

ObjectID StartingDateTime Lat Lon TimeInLocation
23 5/2/2021 12:00 40.11 -30.34 24:00:00
24 5/2/2021 12:00 40.11 -30.34 04:00:00
24 5/2/2021 23:00 39.88 -29.00 13:00:00
25 5/3/2021 16:00 40.11 -30.34 07:00:00

ObjectID 23 was in the same location from 5/2 12:00-5/3 12:00, so one row is returned showing 24 hours

ObjectID 24 was in the same location twice:

  • 5/2 12:00-5/2 16:00, so one row is returned showing 4 hours

  • 5/2 23:00-5/3 12:00, so another row is returned showing 13 hours

ObjectID 25 was in the same location from 5/3 16:00-5/3 23:00, so one row is returned showing 7 hours


Solution

  • This is a type of gaps-and-islands problem, if I follow correctly. You can get the positions with more than one row using a difference of row numbers:

    select objectid, lat, lon,
           timestampdiff(second, max(datetime), min(datetime))
    from (select t.*,
                 row_number() over (partition by objectid, lat, lon order by datetime) as seqnum_2,
                 row_number() over (partition by objectid order by datetime) as seqnum
          from t
         ) t
    group by objectid, (seqnum - seqnum_2), lat, lon
    having count(*) > 1;