I have a table like this:
Id Name ProductId
1 Apple 1
2 Apple null
3 Apple 2
4 Orange 1
5 Orange 2
6 Pear null
7 Lemon 1
8 Lemon null
I want to delete a row if it's ProductId is null
and if it's Name
is occurs more than once.
At this example if I run a proper delete query, it should delete these rows:
2 Apple null
8 Lemon null
Which kind of delete query can work for me?
I would recommend using aggregation or something similar before joining:
delete t from test t join
(select t.name, count(*) as cnt
from test t
group by t.name
) tt
on t.name = tt.name
where tt.cnt > 1 and t.product_id is null;
This is much better than a self join without aggregation. Why? Because each row is identified exactly once. In your sample data, a self-join without aggregation attempts to delete row id = 2 twice (once for the match to 1 and once for the match to 3). That is unnecessary. And it can become highly inefficient if a name
has many rows.
I also think that you don't simply want a cnt of 2 but you want a non-NULL
product id. That is:
delete t from test t join
(select t.name, count(*) as cnt
from test t
where product_id is not null
group by t.name
) tt
on t.name = tt.name
where tt.cnt >= 1 and t.product_id is null;
Here is a db<>fiddle.