I am using the Python code below to update a Postgres DB column value based on id. This loop has to run for thousands of records and it is taking a long time. Is there a way to pass an array of DataFrame values instead of looping over each row?
for i in range(len(df)):
    QUERY = """UPDATE "Table" SET "value" = '%s' WHERE "Table"."id" = '%s'""" % (df['value'][i], df['id'][i])
    cur.execute(QUERY)
    conn.commit()
It depends on the library you use to communicate with PostgreSQL, but bulk inserts are usually much faster via the COPY FROM command.
If you use psycopg2, it is as simple as the following:
cursor.copy_from(io.StringIO(string_variable), "destination_table", columns=('id', 'value'))
where string_variable is a tab- and newline-delimited dataset like 1\tvalue1\n2\tvalue2\n.
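For example, here is a minimal sketch of building that string from a pandas DataFrame (assuming df has id and value columns, conn is an open psycopg2 connection, and destination_table is a placeholder table name):

import io

# Serialize the DataFrame as tab-delimited, newline-terminated text.
buf = io.StringIO()
df[['id', 'value']].to_csv(buf, sep='\t', header=False, index=False)
buf.seek(0)  # rewind so copy_from reads from the start

with conn.cursor() as cur:
    cur.copy_from(buf, 'destination_table', columns=('id', 'value'))
conn.commit()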
To achieve a performant bulk update, I would do the following (a full sketch follows the steps):
Create a temporary table with the same structure as the target: CREATE TEMPORARY TABLE tmp_table (LIKE destination_table);
Insert records with copy_from;
Update the destination table in one statement: UPDATE destination_table SET value = t.value FROM tmp_table t WHERE destination_table.id = t.id, or any other preferred syntax.
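Putting the three steps together, a minimal sketch (again assuming a DataFrame df with id and value columns, an open psycopg2 connection conn, and destination_table as a placeholder name):

import io

def bulk_update(conn, df):
    # Serialize the DataFrame as tab-delimited text for COPY.
    buf = io.StringIO()
    df[['id', 'value']].to_csv(buf, sep='\t', header=False, index=False)
    buf.seek(0)

    with conn.cursor() as cur:
        # 1. Temporary table with the same structure as the target;
        #    it disappears automatically at the end of the session.
        cur.execute('CREATE TEMPORARY TABLE tmp_table (LIKE destination_table);')
        # 2. Bulk-load the new values in a single COPY.
        cur.copy_from(buf, 'tmp_table', columns=('id', 'value'))
        # 3. One set-based UPDATE instead of thousands of round trips.
        cur.execute("""
            UPDATE destination_table AS d
            SET value = t.value
            FROM tmp_table t
            WHERE d.id = t.id;
        """)
    conn.commit()

The whole operation becomes one COPY plus one UPDATE, so the per-row network round trips and commits that make the original loop slow disappear.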