I have a function to get the balance for a customer.
CREATE OR REPLACE FUNCTION default1.get_balance (par_customer_id DECIMAL(31, 0))
RETURNS DECIMAL(31,15)
LANGUAGE SQL
DETERMINISTIC
NO EXTERNAL ACTION
READS SQL DATA
BEGIN
DECLARE var_balance DECIMAL(31,15);
SELECT SUM(amount)
INTO var_balance
FROM default1.accounting accounting
WHERE accounting.customer_id = par_customer_id
AND (YEAR(accounting.accounting_date) >= YEAR(SYSDATE)-3 or accounting.accounting_date IS NULL)
AND paid_date IS NULL
AND accounting_type_id <> 2
AND NOT EXISTS (
SELECT 1
FROM default1.accounting_detail detail
WHERE accounting.id = detail.accounting_id
AND detail.paid_date IS NOT NULL);
RETURN var_balance;
END
The performance for getting the balance of one customer is good, but using the function in a query to get the balance for multiple customers at once it gets really slow.
SELECT default1.get_balance(customer.id), customer.*
FROM default1.customer customer
WHERE customer.id < 1000
This query takes over 2 minutes to execute.
When I replace the function in the query with a subselect it is much faster.
SELECT
(SELECT SUM(amount)
FROM default1.accounting accounting
WHERE accounting.customer_id = customer.id
AND (YEAR(accounting.accounting_date) >= YEAR(SYSDATE)-3 or accounting.accounting_date IS NULL)
AND paid_date IS NULL
AND accounting_type_id <> 2
AND NOT EXISTS (
SELECT 1
FROM default1.accounting_detail detail
WHERE accounting.id = detail.accounting_id
AND detail.paid_date IS NOT NULL)),
customer.*
FROM
default1.customer customer
WHERE customer.id < 1000
This query takes about 8 seconds.
I did execute the both queries multiple times in different orders without any significant change in the runtime. So I don't think that it is a caching issue.
Why does the query with the function takes about 15 times longer than the query with the subselect?
Is there anything I can change in the function to make it faster?
I'm assuming DB2 for LUW.
Performance of your function can suffer because it uses a compiled compound statement as its body (BEGIN ... END
). Try using an inlined compound statement: BEGIN ATOMIC ... END
. Even better, you can probably simply use the RETURN
statement only:
CREATE OR REPLACE FUNCTION default1.get_balance (par_customer_id DECIMAL(31, 0))
RETURNS DECIMAL(31,15)
LANGUAGE SQL
NOT DETERMINISTIC
NO EXTERNAL ACTION
READS SQL DATA
RETURN SELECT SUM(amount)
INTO var_balance
FROM default1.accounting accounting
WHERE accounting.customer_id = par_customer_id
AND (YEAR(accounting.accounting_date) >= YEAR(SYSDATE)-3 or accounting.accounting_date IS NULL)
AND paid_date IS NULL
AND accounting_type_id <> 2
AND NOT EXISTS (
SELECT 1
FROM default1.accounting_detail detail
WHERE accounting.id = detail.accounting_id
AND detail.paid_date IS NOT NULL);
When using a compiled compound statement, each invocation of the function causes a context switch from the SQL data access engine to the PSM execution engine and back, while the inlined statement becomes a part of the query plan itself.
Note that you shouldn't declare this function as DETERMINISTIC
, because it's not; mis-declaring a non-deterministic function could cause unexpected results.