I have a function that deletes rows from many tables, as much as 130+ tables, in a loop. The problem isn't the DELETE
statements themselves, they seem to complete fine in a timely fashion, but for some reason, the LOOP
take much more time to exit.
Here's the function:
CREATE OR REPLACE FUNCTION myschema.delete_dependents(p_id smallint)
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE
v_query_delete TEXT;
i record;
BEGIN
DROP TABLE IF EXISTS tmp_dependents_table;
CREATE TEMP TABLE tmp_dependents_table (
schema_name varchar(255),
table_oid oid,
table_name varchar(255),
level smallint
);
INSERT INTO tmp_dependents_table
SELECT DISTINCT
ot.schema_name
,ot.table_oid
,ot.table_name
,ot.level
FROM related_tables_recursive(current_schema()) AS ot
INNER JOIN pg_attribute AS pga
ON pga.attrelid = ot.table_oid
WHERE pga.attname = 'col_id'
AND ot.level > 1
ORDER BY ot.level DESC NULLS LAST;
FOR i IN
SELECT * FROM tmp_dependents_table
LOOP
v_query_delete := '';
v_query_delete := 'DELETE FROM ' || i.schema_name || '.' || i.table_name
|| ' WHERE col_id = ' || p_id::TEXT || ';';
RAISE NOTICE 'EXECUTING --> %', v_query_delete;
EXECUTE v_query_delete;
RAISE NOTICE 'EXECUTED DELETE.';
END LOOP;
END;
$function$
;
If I execute the SELECT
statement alone that feeds the temp. table, I do get the relevant data, which is all the tables with a column called col_id
The logs (RAISE NOTICE
) show the function is going through all the tables and performing the deletes fast, no problem, but the loop takes forever to exit.
The last logged EXECUTED DELETE
is the last record in tmp_dependents_table.
I've looked for triggers, and indeed one of these tables (which holds 22k records when running the function on the test data I have) had one defined on DELETE. I've deleted that trigger.
I've looked for rules. The only ones I found for my schema are rules on SELECTs.
Is there anything else that could be triggered automatically making my code take much more time than needed to complete?
Some FK violations seem to have caused the problem. One possible solution: Define your FK constraints with ON DELETE CASCADE
- if you want dependent rows to be deleted also.
See:
While being at it, you can largely simplify your function, and make it cheaper and safer:
CREATE OR REPLACE FUNCTION myschema.delete_dependents(p_id smallint)
RETURNS void
LANGUAGE plpgsql AS
$func$
DECLARE
_sql text;
_tbl regclass;
BEGIN
FOR _tbl IN
SELECT ot.table_oid -- schema_name & table_name are redundant noise
FROM related_tables_recursive(current_schema()) ot -- function can probably simplified, too
WHERE ot.level > 1 -- excludes null
AND EXISTS (
SELECT FROM pg_attribute pga
WHERE pga.attrelid = ot.table_oid
AND pga.attname = 'col_id'
AND NOT pga.attisdropped -- must be ruled out
)
GROUP BY ot.table_oid -- why would you expect duplicates?
ORDER BY max(ot.level) DESC, ot.table_oid
-- nulls have been excluded, so no point in "NULLS LAST"
-- make sort order unambiguous to avoid deadlocks with concurrent transactions
LOOP
-- _sql := ''; -- noise
_sql := format('DELETE FROM %s WHERE col_id = $1', _tbl);
RAISE NOTICE 'EXECUTING --> %', _sql;
EXECUTE _sql USING p_id; -- pass value as value
RAISE NOTICE 'EXECUTED DELETE.';
END LOOP;
END
$func$;
You don't need a temporary table for this.
Your original is vulnerable to SQL injection. Identifiers have to be treated as possibly unsafe user input and must be double-quoted where necessary to be sure. About that, and the role of regclass
: