I have a table tbl
in Postgres with 50 million rows. The table
has an index on column_1,
and it receives a lot of queries like
select * from tbl
where column_1 = 'value'
Each query returns 0-30 rows, 10 on average.
Once a day I completely replace the data in the table. The query looks like this:
delete from tbl;
insert into tbl
select * from tbl_2;
commit;
The challenge I face is that this query runs too long: about 2-3 hours, probably because of the index. Is there a way to speed up the data update while still allowing users to query tbl
while it is being refreshed?
In case it matters: the update process runs from Python (Airflow), and the queries come from a Python web app.
Build a copy of the table:
CREATE TABLE tbl_new (LIKE tbl);
INSERT INTO tbl_new
SELECT * FROM tbl_2;
Then, create all required indexes on the new table. Once you are done, switch the tables:
BEGIN;
DROP TABLE tbl;
ALTER TABLE tbl_new RENAME TO tbl;
COMMIT;
That will be fast.
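For the "create all required indexes" step, a minimal sketch based on the question's schema might look like this (assuming column_1 is the only indexed column; the index name here is illustrative):

```sql
-- Rebuild the lookup index on the freshly loaded copy. No readers or
-- writers touch tbl_new yet, so a plain CREATE INDEX (not CONCURRENTLY)
-- is fine and faster here.
CREATE INDEX tbl_new_column_1_idx ON tbl_new (column_1);

-- Optionally refresh planner statistics before the switch, so the first
-- queries after the rename get good plans.
ANALYZE tbl_new;
```

Building the index once, after the bulk INSERT, is much cheaper than maintaining it row by row during the load, which is a large part of why this approach beats the DELETE-then-INSERT pattern.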