I have a huge table with millions of purchase orders. I try to find a better way to achieve the goal from performance perspective as well.
here is some sample data, each order has several positions, position 0 is header and position 1-2 are some items. there are situations that buyer places an order with several positions and later he can adjust this positions,for example deletes all of them and in that case all positions get the flaf 'y' and position 0 as well(in our case order2 in June). In some cases he doesn't delete all of them just one as in our simple dataset (order1 in June).
create table orders (
po varchar2(6),
pos number,
flag char(1),
datum date
)
insert into orders values ('order1',0,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order1',1,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order1',2,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order1',0,'n',to_date('01-06-2020','dd-MM-yyyy'));
insert into orders values ('order1',1,'y',to_date('01-06-2020','dd-MM-yyyy'));
insert into orders values ('order1',2,'n',to_date('01-06-2020','dd-MM-yyyy'));
insert into orders values ('order2',0,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order2',1,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order2',2,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order2',0,'y',to_date('01-06-2020','dd-MM-yyyy'));
insert into orders values ('order2',1,'y',to_date('01-06-2020','dd-MM-yyyy'));
insert into orders values ('order2',2,'y',to_date('01-06-2020','dd-MM-yyyy'));
the goal is to find just the order name with at least one deleted position (except position 0 this will be set autimatically by deleting all positions 1-2 )
In our case the result is just order1
this one will not work of course
select distinct po from orders where pos <> 0 and flag = 'y';
the solution is subquery or correlated subquery
select distinct po from orders
where pos <> 0 and flag = 'y'
and po not in (
select distinct po from orders
where pos = 0 and flag = 'y');
but I wonder if there is a better solution from performance perspective because the table is queried twice and with huge amount of POs it is too slow of course
any ideas?
You probably need conditional aggregation:
select po
from tab
group by po
having max(case when pos <> 0 then flag end) = 'y' -- at least one deleted position
and max(case when pos = 0 then flag end) = 'n' -- but no deleted header