Search code examples
sqlpostgresqlwindow-functions

Postgres: update table column with subquery


I have this trip GPS sampling table:

SELECT * FROM trajecttories_splitted;
user_id |   session_id   |       timestamp        |    lat    |    lon     | alt 
---------+----------------+------------------------+-----------+------------+-----
       1 | 20081023025304 | 2008-10-23 02:53:04+01 | 39.984702 | 116.318417 | 492
       1 | 20081023025304 | 2008-10-23 02:53:10+01 | 39.984683 |  116.31845 | 492
       1 | 20081023025304 | 2008-10-23 02:53:15+01 | 39.984686 | 116.318417 | 492
       1 | 20081023025304 | 2008-10-23 02:53:20+01 | 39.984688 | 116.318385 | 492
       1 | 20081023025304 | 2008-10-23 02:53:25+01 | 39.984655 | 116.318263 | 492
       1 | 20081023025304 | 2008-10-23 02:53:30+01 | 39.984611 | 116.318026 | 493
       1 | 20081023025304 | 2008-10-23 02:53:35+01 | 39.984608 | 116.317761 | 493
       1 | 20081023025304 | 2008-10-23 02:53:40+01 | 39.984563 | 116.317517 | 496
       1 | 20081023025304 | 2008-10-23 02:53:45+01 | 39.984539 | 116.317294 | 500
       1 | 20081023025304 | 2008-10-23 02:53:50+01 | 39.984606 | 116.317065 | 505

Then for ease of analyses, I added a column sampling_rate to keep track of the GPS sampling intervals between consecutive rows.

Now I want to set the values of the new column to the difference: kind of row[sampling_rate]=timestamp -LAG(timestamp)

So I use:

    UPDATE trajectories_splitted 
    SET sampling_rate=timestamp -LAG(timestamp) OVER (
        PARTITION BY user_id
        ORDER BY session_id
    )
    
    ERROR:  window functions are not allowed in UPDATE
 LINE 2:     SET sampling_rate=timestamp -LAG(timestamp) OVER (

Expected results:

 user_id |   session_id   |       timestamp        |    lat    |    lon     | alt | sampling_rate 
---------+----------------+------------------------+-----------+------------+-----+---------------
       1 | 20081023025304 | 2008-10-23 02:53:04+01 | 39.984702 | 116.318417 | 492 |              
       1 | 20081023025304 | 2008-10-23 02:53:10+01 | 39.984683 |  116.31845 | 492 | 6             
       1 | 20081023025304 | 2008-10-23 02:53:15+01 | 39.984686 | 116.318417 | 492 | 5           
       1 | 20081023025304 | 2008-10-23 02:53:20+01 | 39.984688 | 116.318385 | 492 | 10             
       1 | 20081023025304 | 2008-10-23 02:53:25+01 | 39.984655 | 116.318263 | 492 | 5            
       1 | 20081023025304 | 2008-10-23 02:53:30+01 | 39.984611 | 116.318026 | 493 | 5             
       1 | 20081023025304 | 2008-10-23 02:53:35+01 | 39.984608 | 116.317761 | 493 | 5            
       1 | 20081023025304 | 2008-10-23 02:53:40+01 | 39.984563 | 116.317517 | 496 | 5             
       1 | 20081023025304 | 2008-10-23 02:53:45+01 | 39.984539 | 116.317294 | 500 | 5             
       1 | 20081023025304 | 2008-10-23 02:53:50+01 | 39.984606 | 116.317065 | 505 | 5

EDIT

Based on the first answer below, postgres reports the error:

ERROR:  column "sampling_rate" is of type timestamp with time zone but expression is of type interval
LINE 2:     set sampling_rate = ts.timestamp - tts.prev_timestamp

If it might help, this is the table structure:

\d trajectories_splitted
                    Table "postgres.trajectories_splitted"
    Column     |           Type           | Collation | Nullable | Default 
---------------+--------------------------+-----------+----------+---------
 user_id       | integer                  |           |          | 
 session_id    | bigint                   |           | not null | 
 timestamp     | timestamp with time zone |           | not null | 
 lat           | double precision         |           | not null | 
 lon           | double precision         |           | not null | 
 alt           | double precision         |           |          | 
 sampling_rate | timestamp with time zone |           |          | 
Indexes:
    "trajectories_splitted_pkey" PRIMARY KEY, btree (session_id, "timestamp")
    "traj_splitted" btree (user_id, session_id, "timestamp")

Solution

  • You need to calculate the value in a subquery and join:

    update trajectories_splitted ts
        set sampling_rate = ts.timestamp - tts.prev_timestamp
        from (select ts.*,
                     lag(timestamp) over (partition by user_id, session_id order by timestamp) as prev_timestamp
        
              from trajectories_splitted ts
             ) tts
        where tts.user_id = ts.user_id and tts.session_id = ts.session_id and
              tts.timestamp = ts.timestamp