I have users' transportation information in a table called labels,
which stores the start and end time of each trip together with the travel mode used:
CREATE TABLE labels
(
user_id integer NOT NULL,
session_id int,
start_timestamp timestamp with time zone NOT NULL,
end_timestamp timestamp with time zone NOT NULL,
travelmode text,
PRIMARY KEY (user_id, start_timestamp, end_timestamp)
);
INSERT INTO labels (user_id,session_id,start_timestamp,end_timestamp,travelmode)
VALUES (11,0,'2007-06-26 11:32:29+01','2007-06-26 11:40:29+01','bus'),
(11,0,'2008-03-28 14:52:54+00','2008-03-28 15:59:59+00','train'),
(11,0,'2008-03-28 16:00:00+00','2008-03-28 22:02:00+00','train'),
(11,0,'2008-03-29 01:27:50+00','2008-03-29 15:59:59+00','train'),
(11,0,'2008-03-29 16:00:00+00','2008-03-30 15:59:59+01','train'),
(11,0,'2008-03-30 16:00:00+01','2008-03-31 03:13:11+01','train'),
(11,0,'2008-03-31 04:17:59+01','2008-03-31 15:31:06+01','train'),
(11,0,'2008-03-31 16:00:08+01','2008-03-31 16:09:01+01','taxi'),
(11,0,'2008-03-31 17:26:04+01','2008-04-01 00:35:26+01','train');
A second table, trajectories, holds the raw GPS samples for each user (one sample roughly every 1-5 seconds):
CREATE TABLE trajectories
(
user_id int,
session_id int NOT NULL,
"timestamp" timestamp with time zone NOT NULL,
lat double precision NOT NULL,
lon double precision NOT NULL,
alt double precision,
PRIMARY KEY (session_id, "timestamp")
);
INSERT INTO trajectories (user_id,session_id,timestamp,lat,lon,alt)
VALUES (11,1002008,'2008-03-30 16:00:39+01',41.147205,95.457762,-777),
(11,1002008,'2008-03-30 16:01:38+01',41.153458,95.444897,-777),
(11,1002008,'2008-03-30 16:02:37+01',41.154867,95.429467,-777),
(11,1002008,'2008-03-30 16:03:36+01',41.154075,95.413863,-777),
(11,1002008,'2008-03-30 16:04:35+01',41.152223,95.398515,-777),
(11,1002008,'2008-03-31 02:52:11+01',43.697033,87.57619,-777),
(11,1002008,'2008-03-31 02:51:12+01',43.69425,87.579275,-777),
(11,1002008,'2008-03-31 02:50:13+01',43.689312,87.587815,-777),
(11,1002008,'2008-03-31 02:43:56+01',43.656445,87.634753,-777),
(11,1002008,'2008-03-31 02:42:56+01',43.649275,87.638028,-777),
(11,1002008,'2008-03-31 02:42:04+01',43.64454,87.63572,-777);
Because I am only interested in travel mode statistics (the count of each travel mode) within latitude 41.00 - 42.00 and longitude 87.5 - 95.30, I have to join with the trajectories table, which contains the lat and lon information, matching each row whose timestamp falls between the start_timestamp and end_timestamp of a labels row for the same user.
How do I do this? I added this DB<>fiddle.
This is the query that you describe:
select l.travelmode, count(*)
from trajectories t join
labels l
on t.user_id = l.user_id and
t.timestamp between l.start_timestamp and l.end_timestamp
where t.lat between 41.00 and 42.00 and
t.lon between 87.5 and 95.30
group by l.travelmode;
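On your sample data, though, this does not return any rows. The timestamps do match (every sample falls inside the train label running from 2008-03-30 16:00:00+01 to 2008-03-31 03:13:11+01), but no point lies inside the bounding box: the first five samples have lon between 95.39 and 95.46, above the 95.30 limit, and the remaining six have lat around 43.6 to 43.7, outside 41.00 - 42.00.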
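If a query like this comes back empty, it can help to check the time join and the bounding box separately. The following is only a diagnostic sketch, assuming PostgreSQL 9.4 or later for the FILTER clause; the column aliases (matched_points, points_in_box and the min/max names) are made up for illustration:

```sql
-- Count the points that match a label by time, and how many of those
-- also fall inside the bounding box, without filtering any rows out.
select l.travelmode,
       count(*) as matched_points,
       count(*) filter (where t.lat between 41.00 and 42.00
                          and t.lon between 87.5 and 95.30) as points_in_box,
       -- observed coordinate ranges, to compare against the box limits
       min(t.lat) as min_lat, max(t.lat) as max_lat,
       min(t.lon) as min_lon, max(t.lon) as max_lon
from trajectories t join
     labels l
     on t.user_id = l.user_id and
        t."timestamp" between l.start_timestamp and l.end_timestamp
group by l.travelmode;
```

If this returns no rows at all, the timestamps (or user_id values) are not lining up between the two tables. If matched_points is positive but points_in_box is 0, the join is fine and the coordinates simply lie outside the requested latitude/longitude range.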