Part of my query is:
select a.*, sum(case isneg when 2 then -paymentamount else amount end) over(order by coalesce(b.paymentdate, a.paymentdate)) as running_sum,
row_number() over(order by coalesce(b.paymentdate, a.paymentdate)) as rn
from Cheques a
left join b on b.code = a.code
where customercode = @customercode
If I replace the coalesce(b.paymentdate, a.paymentdate)
with just b.paymentdate then my Estimated Subtree Cost in the execution plan decreases dramatically. Is this because coalesce is a function call and is there a more performant way to do this?
Table b is a select with a max and a group by.
Since you didn't provide the query plan, I am assuming that the difference in the estimated subtree cost between using coalesce(b.paymentdate, a.paymentdate)
and b.paymentdate
in your query is likely due to the optimizer's estimation based on the complexity of the expression.
COALESCE
is a function that evaluates multiple expressions and returns the first non-null value. In your case, it will check if b.paymentdate
is null and if so, it will use a.paymentdate
instead. To go deeper and simulate how it is doing, the function call itself incurs some overhead in terms of CPU cycles.
If are sure that b.paymentdate
will always have a non null value whenever it is joined with Cheques
, you can just replace coalesce(b.paymentdate, a.paymentdate)
with just b.paymentdate
. This will make it simpler and potentially improve the performance of your query.
However, if b.paymentdate
can be null in some cases, and you want to fallback to a.paymentdate
when that happens, then in this case using COALESCE
is mandatory.
As for optimizing , you mentioned that table b
is derived from a select with a MAX
and GROUP BY
. In this case, make sure that you have the appropriate indexes on the columns used in the join (Cheques.code
and b.code
). Same thing for the grouping and filtering columns (b.paymentdate
, a.paymentdate
, and customercode
).