I have two tables in a postgres database, posts
and users
. posts
has a user_id
foreign key that references the users.id
primary key column. Both tables are very large.
I have just deleted a random set of users (about 80% of the total users) and I want to delete all posts that reference deleted users, effectively an anti-join and delete. What is the most efficient way of doing this?
Currently I have this:
DELETE FROM posts l
WHERE NOT EXISTS
(
SELECT NULL
FROM users r
WHERE r.id = l.user_id
)
Is there a more efficient way of doing this?
If you want to delete 80% of users, then the fastest way is probably:
create table temp_posts as
select p.*
from posts p
where exists (select 1 from users u where u.id = p.user_id);
truncate table posts;
insert into posts
select *
from temp_posts;
Batch inserts are much less work than updating most of the rows in the table. Of course, you should test this carefully. Truncating the table is a fast way to remove all the rows from it.