Search code examples
sqloracleperformancesubquerycorrelated-subquery

find deleted positiones effectively in sql oracle


I have a huge table with millions of purchase orders. I try to find a better way to achieve the goal from performance perspective as well.

here is some sample data, each order has several positions, position 0 is header and position 1-2 are some items. there are situations that buyer places an order with several positions and later he can adjust this positions,for example deletes all of them and in that case all positions get the flaf 'y' and position 0 as well(in our case order2 in June). In some cases he doesn't delete all of them just one as in our simple dataset (order1 in June).

create table orders (
po varchar2(6),
pos number,
flag char(1),
datum date
)

insert into orders values ('order1',0,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order1',1,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order1',2,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order1',0,'n',to_date('01-06-2020','dd-MM-yyyy'));
insert into orders values ('order1',1,'y',to_date('01-06-2020','dd-MM-yyyy'));
insert into orders values ('order1',2,'n',to_date('01-06-2020','dd-MM-yyyy'));

insert into orders values ('order2',0,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order2',1,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order2',2,'n',to_date('01-01-2020','dd-MM-yyyy'));
insert into orders values ('order2',0,'y',to_date('01-06-2020','dd-MM-yyyy'));
insert into orders values ('order2',1,'y',to_date('01-06-2020','dd-MM-yyyy'));
insert into orders values ('order2',2,'y',to_date('01-06-2020','dd-MM-yyyy'));

the goal is to find just the order name with at least one deleted position (except position 0 this will be set autimatically by deleting all positions 1-2 )

In our case the result is just order1

this one will not work of course

select distinct po from orders where pos <> 0 and flag = 'y';

the solution is subquery or correlated subquery

select distinct po from orders
where pos <> 0 and flag = 'y'
and po not in (
select distinct po from orders
where pos = 0 and flag = 'y');

but I wonder if there is a better solution from performance perspective because the table is queried twice and with huge amount of POs it is too slow of course

any ideas?


Solution

  • You probably need conditional aggregation:

    select po
    from tab
    group by po
    having max(case when pos <> 0 then flag end) = 'y' -- at least one deleted position
       and max(case when pos =  0 then flag end) = 'n' -- but no deleted header