sql, postgresql, gps, aggregate-functions

How to get the travel mode when the geographical coordinates are in the raw GPS table


I have the transportation information of users in a table called labels, which contains the start and end time of each trip together with the travel mode used:

CREATE TABLE labels
(
    user_id integer NOT NULL,
    session_id int,
    start_timestamp timestamp with time zone NOT NULL,
    end_timestamp timestamp with time zone NOT NULL,
    travelmode text,
    PRIMARY KEY (user_id, start_timestamp, end_timestamp)
);

INSERT INTO labels (user_id, session_id, start_timestamp, end_timestamp, travelmode)
VALUES (11,0,'2007-06-26 11:32:29+01','2007-06-26 11:40:29+01','bus'),
       (11,0,'2008-03-28 14:52:54+00','2008-03-28 15:59:59+00','train'),
       (11,0,'2008-03-28 16:00:00+00','2008-03-28 22:02:00+00','train'),
       (11,0,'2008-03-29 01:27:50+00','2008-03-29 15:59:59+00','train'),
       (11,0,'2008-03-29 16:00:00+00','2008-03-30 15:59:59+01','train'),
       (11,0,'2008-03-30 16:00:00+01','2008-03-31 03:13:11+01','train'),
       (11,0,'2008-03-31 04:17:59+01','2008-03-31 15:31:06+01','train'),
       (11,0,'2008-03-31 16:00:08+01','2008-03-31 16:09:01+01','taxi'),
       (11,0,'2008-03-31 17:26:04+01','2008-04-01 00:35:26+01','train');

The other table, trajectories, holds the raw GPS samples for each user (sampled roughly every 1-5 seconds):

CREATE TABLE trajectories
(
    user_id int,
    session_id int NOT NULL,
    "timestamp" timestamp with time zone NOT NULL,
    lat double precision NOT NULL,
    lon double precision NOT NULL,
    alt double precision,
    PRIMARY KEY (session_id, "timestamp")
);

INSERT INTO trajectories (user_id, session_id, "timestamp", lat, lon, alt)
VALUES (11,1002008,'2008-03-30 16:00:39+01',41.147205,95.457762,-777),
       (11,1002008,'2008-03-30 16:01:38+01',41.153458,95.444897,-777),
       (11,1002008,'2008-03-30 16:02:37+01',41.154867,95.429467,-777),
       (11,1002008,'2008-03-30 16:03:36+01',41.154075,95.413863,-777),
       (11,1002008,'2008-03-30 16:04:35+01',41.152223,95.398515,-777),
       (11,1002008,'2008-03-31 02:52:11+01',43.697033,87.57619,-777),
       (11,1002008,'2008-03-31 02:51:12+01',43.69425,87.579275,-777),
       (11,1002008,'2008-03-31 02:50:13+01',43.689312,87.587815,-777),
       (11,1002008,'2008-03-31 02:43:56+01',43.656445,87.634753,-777),
       (11,1002008,'2008-03-31 02:42:56+01',43.649275,87.638028,-777),
       (11,1002008,'2008-03-31 02:42:04+01',43.64454,87.63572,-777);

I am only interested in travel-mode statistics (the count of each travel mode) within latitude 41.00 to 42.00 and longitude 87.5 to 95.30. So I have to join with the trajectories table, which contains the lat/lon information, matching each GPS row whose timestamp falls within the start_timestamp/end_timestamp range of a labels row for the same user.

How do I do this? I added this DB<>fiddle.


Solution

  • This is the query that you describe:

    select l.travelmode, count(*)
    from trajectories t join
         labels l
         on t.user_id = l.user_id and
            t.timestamp between l.start_timestamp and l.end_timestamp
    where t.lat between 41.00 and 42.00 and
          t.lon between 87.5 and 95.30 
    group by l.travelmode;
    

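    On the sample data as posted, though, this does not return any rows. Every trajectory timestamp falls inside a 'train' interval, but none of the points lies inside the requested bounding box: the first five have a longitude above 95.30 and the remaining six have a latitude above 42.00.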
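
To see where rows drop out of a query like this, it can help to count the time-matched points before and after the bounding-box filter. The following is a minimal sketch, not part of the original answer. It assumes an in-memory SQLite database driven from Python as a stand-in for PostgreSQL, stores timestamps as epoch seconds (the `epoch` helper and the `start_ts`, `end_ts` and `ts` column names are made up for this illustration), and writes the offsets as `+01:00` rather than `+01` so that `datetime.fromisoformat` parses them.

```python
import sqlite3
from datetime import datetime


def epoch(ts: str) -> int:
    """Convert an ISO timestamp with an explicit UTC offset to epoch seconds."""
    return int(datetime.fromisoformat(ts).timestamp())


# Rows copied from the question (user_id, start, end, travelmode).
labels = [
    (11, "2007-06-26 11:32:29+01:00", "2007-06-26 11:40:29+01:00", "bus"),
    (11, "2008-03-28 14:52:54+00:00", "2008-03-28 15:59:59+00:00", "train"),
    (11, "2008-03-28 16:00:00+00:00", "2008-03-28 22:02:00+00:00", "train"),
    (11, "2008-03-29 01:27:50+00:00", "2008-03-29 15:59:59+00:00", "train"),
    (11, "2008-03-29 16:00:00+00:00", "2008-03-30 15:59:59+01:00", "train"),
    (11, "2008-03-30 16:00:00+01:00", "2008-03-31 03:13:11+01:00", "train"),
    (11, "2008-03-31 04:17:59+01:00", "2008-03-31 15:31:06+01:00", "train"),
    (11, "2008-03-31 16:00:08+01:00", "2008-03-31 16:09:01+01:00", "taxi"),
    (11, "2008-03-31 17:26:04+01:00", "2008-04-01 00:35:26+01:00", "train"),
]

# Rows copied from the question (user_id, timestamp, lat, lon).
points = [
    (11, "2008-03-30 16:00:39+01:00", 41.147205, 95.457762),
    (11, "2008-03-30 16:01:38+01:00", 41.153458, 95.444897),
    (11, "2008-03-30 16:02:37+01:00", 41.154867, 95.429467),
    (11, "2008-03-30 16:03:36+01:00", 41.154075, 95.413863),
    (11, "2008-03-30 16:04:35+01:00", 41.152223, 95.398515),
    (11, "2008-03-31 02:52:11+01:00", 43.697033, 87.57619),
    (11, "2008-03-31 02:51:12+01:00", 43.69425, 87.579275),
    (11, "2008-03-31 02:50:13+01:00", 43.689312, 87.587815),
    (11, "2008-03-31 02:43:56+01:00", 43.656445, 87.634753),
    (11, "2008-03-31 02:42:56+01:00", 43.649275, 87.638028),
    (11, "2008-03-31 02:42:04+01:00", 43.64454, 87.63572),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE labels (user_id INT, start_ts INT, end_ts INT, travelmode TEXT)"
)
conn.execute("CREATE TABLE trajectories (user_id INT, ts INT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO labels VALUES (?, ?, ?, ?)",
    [(u, epoch(s), epoch(e), mode) for u, s, e, mode in labels],
)
conn.executemany(
    "INSERT INTO trajectories VALUES (?, ?, ?, ?)",
    [(u, epoch(t), lat, lon) for u, t, lat, lon in points],
)

# Same join as in the answer, but the bounding box is counted in a separate
# column instead of being applied in WHERE, so both numbers are visible.
rows = conn.execute(
    """
    SELECT l.travelmode,
           COUNT(*) AS matched,
           SUM(CASE WHEN t.lat BETWEEN 41.00 AND 42.00
                     AND t.lon BETWEEN 87.5 AND 95.30
                    THEN 1 ELSE 0 END) AS in_box
    FROM trajectories t
    JOIN labels l
      ON t.user_id = l.user_id
     AND t.ts BETWEEN l.start_ts AND l.end_ts
    GROUP BY l.travelmode
    """
).fetchall()

print(rows)  # [('train', 11, 0)]
```

Here all 11 points match a 'train' interval by time and none survives the bounding-box condition. In PostgreSQL the same diagnostic could be written directly against the original tables, for example with a conditional aggregate in place of the `WHERE` clause.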