Tags: postgresql, indexing, database-performance

Simple update query taking too long - Postgres


I have a table with 28 million rows that I want to update. It has around 60 columns and an ID column (primary key) with an index on it. I created four new columns and want to populate them with data from four columns of another table, which also has an ID column with an index on it. Both tables have the same number of rows and only the primary key and the index on the IDENTI column. The query has been running for 15 hours, and since this is high-priority work we are starting to get nervous and don't have much time to experiment with other queries. We have never updated a table this big (7 GB), so we are not sure whether this amount of time is normal.

This is the query:

UPDATE consolidated
SET IDEDUP2=uni.IDEDUP2,
    USE21=uni.USE21,
    USE22=uni.USE22,
    PESOXX2=uni.PESOXX2
FROM uni_group uni, consolidated con
WHERE con.IDENTI=uni.IDENTI

How can I make it faster? Is it possible? If not, is there a way to check how much longer it is going to take (without killing the process)?

Just as additional information: we have previously run much more complex queries on 3-million-row tables (PostGIS), and they also took about 15 hours.


Solution

  • You should not repeat the target table in the FROM clause. Your statement creates a Cartesian join of the consolidated table with itself, which is not what you want.

    You should use the following:

    UPDATE consolidated con
       SET IDEDUP2 = uni.IDEDUP2,
           USE21   = uni.USE21,
           USE22   = uni.USE22,
           PESOXX2 = uni.PESOXX2
    FROM uni_group uni
    WHERE con.IDENTI = uni.IDENTI
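
  • Before re-running the statement against 28 million rows, it may be worth looking at the plan with a plain EXPLAIN (it only plans the statement, nothing is executed). With IDENTI indexed on both tables and the tables being the same size, you would generally expect a hash or merge join rather than a nested loop. A minimal sketch, assuming the corrected statement above:

    -- plan only; no rows are modified
    EXPLAIN
    UPDATE consolidated con
       SET IDEDUP2 = uni.IDEDUP2,
           USE21   = uni.USE21,
           USE22   = uni.USE22,
           PESOXX2 = uni.PESOXX2
    FROM uni_group uni
    WHERE con.IDENTI = uni.IDENTI;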
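
  • As for the follow-up question: PostgreSQL does not report progress for a plain UPDATE, so there is no built-in way to see how much longer it will take. From a second session you can at least confirm the statement is still active and not waiting on a lock by querying pg_stat_activity. A sketch (the ILIKE filter is only an assumption about how the query text starts):

    -- run from another session; read-only, does not interfere with the UPDATE
    SELECT pid,
           now() - query_start AS running_for,
           state,
           wait_event_type,   -- these two columns exist from PostgreSQL 9.6 on;
           wait_event,        -- older versions have a boolean "waiting" column instead
           query
    FROM pg_stat_activity
    WHERE query ILIKE 'UPDATE consolidated%'
      AND state <> 'idle';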