I have the following table:
CREATE TABLE trajectory(
user_id int,
session_id int,
timestamp timestamp with time zone,
lat double precision,
lon double precision
);
INSERT INTO trajectory(user_id, session_id, timestamp, lat, lon) VALUES
(1, 25304,'2008-10-23 02:53:04+01', 39.984702, 116.318417),
(1, 25304, '2008-10-23 02:53:10+01', 39.984683, 116.31845),
(1, 25304, '2008-10-23 02:53:15+01', 39.984686, 116.318417),
(1, 25304, '2008-10-23 02:53:20+01', 39.984688, 116.318385),
(1, 20959,'2008-10-24 02:09:59+01', 40.008304, 116.319876),
(1, 20959,'2008-10-24 02:10:04+01', 40.008413, 116.319962),
(1, 20959,'2008-10-24 02:10:14+01', 40.007171, 116.319458),
(2, 55305, '2008-10-23 05:53:05+01', 39.984094, 116.319236),
(2, 55305, '2008-10-23 05:53:11+01', 39.984198, 116.319322),
(2, 55305, '2008-10-23 05:53:21+01', 39.984224, 116.319402),
(2, 34104, '2008-10-23 23:41:04+01', 40.013867, 116.306473),
(2, 34104, '2008-10-23 23:41:16+01', 40.013907, 116.306488);
Question:
I want to add a column trip_id
to this table, based on the user_id
and session_id
columns, so that when the user's session id changes, I know the user is on a new trip so I add that id to the new trip
column.
Required output:
user_id |session_id |timestamp | lat | lon | trip_id
--------|------------|-----------------------|--------------|-----------|-----------
1 | 25304 |2008-10-23 02:53:04+01 | 39.984702 |116.318417 | 1
1 | 25304 |2008-10-23 02:53:10+01 | 39.984683 |116.31845 | 1
1 | 25304 |2008-10-23 02:53:15+01 | 39.984686 |116.318417 | 1
1 | 25304 |2008-10-23 02:53:20+01 | 39.984688 |116.318385 | 1
1 | 20959 |2008-10-24 02:09:59+01 |40.008304 |116.319876 | 2
1 | 20959 |2008-10-24 02:10:04+01 |40.008413 |116.319962 | 2
1 | 20959 |2008-10-24 02:10:14+01 |40.007171 |116.319458 | 2
2 | 55305 |2008-10-23 05:53:05+01 |39.984094 |116.319236 | 1
2 | 55305 |2008-10-23 05:53:11+01 |39.984198 |116.319322 | 1
2 | 55305 |2008-10-23 05:53:21+01 |39.984224 |116.319402 | 1
2 | 34104 |2008-10-23 23:41:04+01 |40.013867 |116.306473 | 2
2 | 34104 |2008-10-23 23:41:16+01 |40.013907 |116.306488 | 2
How can I do this?
EDIT
Thanks for these great answers, but all the answers received are kind of retrieving table values, they do not modify the table. Plus, I added the timestamp
column that I initially omitted thinking that would simplify my question.
This can be accomplished by using the window function lag()
to retrieve the previous row and determine whether or not it has changed, followed by the window function sum()
to retrieve the trip_id
:
with cte as (
select *, case when
session_id - lag(session_id, 1, session_id)
over (partition by user_id order by timestamp) = 0
then 0 else 1 end as diff
from trajectory
)
select user_id, session_id, timestamp, lat, lon, 1 + sum(diff) over (partition by user_id order by timestamp) as trip_id
from cte
order by user_id, timestamp
Assuming the user_id/timestamp is unique, the update can be :
ALTER TABLE trajectory ADD COLUMN trip_id int;
with cte as (
select *, case when session_id - lag(session_id, 1, session_id) over (partition by user_id order by timestamp) = 0 then 0 else 1 end as diff
from trajectory
),
cte2 as (
select user_id, session_id, timestamp, lat, lon, 1 + sum(diff) over (partition by user_id order by timestamp) as trip_id
from cte
order by user_id, timestamp
)
UPDATE trajectory
SET trip_id = cte2.trip_id
FROM cte2
WHERE trajectory.timestamp = cte2.timestamp and trajectory.user_id = cte2.user_id
Result :
user_id | session_id | timestamp | lat | lon | trip_id |
---|---|---|---|---|---|
1 | 25304 | 2008-10-23 02:53:04+01 | 39.984702 | 116.318417 | 1 |
1 | 25304 | 2008-10-23 02:53:10+01 | 39.984683 | 116.31845 | 1 |
1 | 25304 | 2008-10-23 02:53:15+01 | 39.984686 | 116.318417 | 1 |
1 | 25304 | 2008-10-23 02:53:20+01 | 39.984688 | 116.318385 | 1 |
1 | 20959 | 2008-10-24 02:09:59+01 | 40.008304 | 116.319876 | 2 |
1 | 20959 | 2008-10-24 02:10:04+01 | 40.008413 | 116.319962 | 2 |
1 | 20959 | 2008-10-24 02:10:14+01 | 40.007171 | 116.319458 | 2 |
2 | 55305 | 2008-10-23 05:53:05+01 | 39.984094 | 116.319236 | 1 |
2 | 55305 | 2008-10-23 05:53:11+01 | 39.984198 | 116.319322 | 1 |
2 | 55305 | 2008-10-23 05:53:21+01 | 39.984224 | 116.319402 | 1 |
2 | 34104 | 2008-10-23 23:41:04+01 | 40.013867 | 116.306473 | 2 |
2 | 34104 | 2008-10-23 23:41:16+01 | 40.013907 | 116.306488 | 2 |