Here is my case:
I would like to calculate quantity and price for a given item on any given date.
Prices are calculated using total item quantity and unit price so price changes with respect to item's quantity.
warehouse_1
states that item was shipped from that warehouse, warehouse_2
states that item was sent to that warehouse.
Here is my logic:
Fetch deliveries for each item and sum their quantities. (1st CTE)
Find the sum of quantities in both warehouses separately. (2nd CTE)
Calculate final quantity and multiply it by unit price.
Show result which consists of item id, quantity and price.
I wrote a query which does the calculations correctly BUT it gets exponentially slower when data count gets bigger. (Takes 5 seconds on my DB with 6k rows, almost locks DB on my coworker's DB with 21k rows)
How can I optimize this query? I am doing cumulative calculations on 2nd CTE for each row coming from 1st CTE and that needs a rework I believe.
Can I use
LAG()
function for this use case? I tried that with something like
LAG(a.deliveryTotal) over(order by a.updated desc rows between unbounded preceding and current row)
instead of the CASE block in 2nd CTE but I can't seem to figure out how to use filter()
or put a condition inside LAG() statement.
Here is my query:
`
with deliveriesCTE as (
select
row_number() over(partition by it.id
order by
dd.updated asc) as rn,
sum(dd.quantity) as deliveryTotal,
dd.updated as updated,
it.id as item_id,
d.warehouse_1 as outWH,
d.warehouse_2 as inWH,
d.company_code as company
from
deliveries d
join deliveries_detail dd on
dd.deliveries_id = d.id
join items it on
it.id = dd.item_id
where
...
group by
dd.updated,
it.id,
d.warehouse_1,
d.warehouse_2,
d.company_code
order by
dd.updated asc),
cumulativeTotalsByUnit as (
select
distinct on
(a.item_id) a.rn,
a.deliveryTotal,
a.updated,
a.item_id,
a.outWH,
a.inWH,
a.company,
case
when a.rn = 1
and a.outWH is not null then coalesce(a.deliveryTotal,
0)
else (
select
coalesce(sum(b.deliveryTotal) filter(
where b.outWH is not null),
0)
from
deliveriesCTE b
where
a.item_id = b.item_id
and b.rn <= a.rn)
end as outWHTotal,
case
when a.rn = 1
and a.inWH is not null then coalesce(a.deliveryTotal,
0)
else (
select
coalesce(sum(b.deliveryTotal) filter(
where b.inWH is not null),
0)
from
deliveriesCTE b
where
a.item_id = b.item_id
and b.rn <= a.rn)
end as inWHTotal
from
deliveriesCTE a
order by
a.item_id,
a.updated desc)
select
resultView.item_id,
resultView.quantity,
resultView.price
from
(
select
cumTotals.item_id,
cumTotals.inWHTotal - cumTotals.outWHTotal as quantity,
p.price * (cumTotals.inWHTotal - cumTotals.outWHTotal) as price
from
prices p
join cumulativeTotalsByUnit cumTotals on
cumTotals.item_id = p.item_id ) resultView
where
resultView.rn = 1;
`
It's hard to say for use without a MCV, but my guess on what you are trying to do is do a Windowed SUM() calculation as opposed to LAG(). There is documentation Here.
The query cumulativeTotalsByUnit
shouldn't be necessary and is likely quadratic to do the complex self-referential join.
Your delivery CTE should look like:
select
sum(dd.quantity) over (partition by it.id ORDER BY dd.updated asc) as deliveryTotal,
dd.updated as updated,
it.id as item_id,
d.warehouse_1 as outWH,
d.warehouse_2 as inWH,
d.company_code as company
from
deliveries d
join deliveries_detail dd on
dd.deliveries_id = d.id
join items it on
it.id = dd.item_id
where
...
group by
dd.updated,
it.id,
d.warehouse_1,
d.warehouse_2,
d.company_code
order by
dd.updated asc