Search code examples
sqlpostgresqlplpgsqlsql-deletedynamic-sql

Loop with many DELETEs not finishing even though the statements complete OK


I have a function that deletes rows from many tables, as much as 130+ tables, in a loop. The problem isn't the DELETE statements themselves, they seem to complete fine in a timely fashion, but for some reason, the LOOP take much more time to exit.

Here's the function:

CREATE OR REPLACE FUNCTION myschema.delete_dependents(p_id smallint)
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE
    v_query_delete          TEXT;
    i                       record;
BEGIN

DROP TABLE IF EXISTS tmp_dependents_table;
CREATE TEMP TABLE tmp_dependents_table (
    schema_name     varchar(255),
    table_oid       oid,
    table_name      varchar(255),
    level           smallint    
);

INSERT INTO tmp_dependents_table
SELECT DISTINCT
    ot.schema_name
    ,ot.table_oid
    ,ot.table_name
    ,ot.level
FROM related_tables_recursive(current_schema()) AS ot
INNER JOIN pg_attribute AS pga
ON pga.attrelid = ot.table_oid
WHERE   pga.attname = 'col_id'
    AND ot.level > 1
ORDER BY ot.level DESC NULLS LAST;

FOR i IN 
    SELECT * FROM tmp_dependents_table
LOOP
            
    v_query_delete := '';
    
    v_query_delete := 'DELETE FROM ' || i.schema_name || '.' || i.table_name
        || ' WHERE col_id = ' || p_id::TEXT || ';';
    
    RAISE NOTICE 'EXECUTING --> %', v_query_delete;
    EXECUTE v_query_delete;
    RAISE NOTICE 'EXECUTED DELETE.';
    
END LOOP;
    
END;
$function$
;

If I execute the SELECT statement alone that feeds the temp. table, I do get the relevant data, which is all the tables with a column called col_id

The logs (RAISE NOTICE) show the function is going through all the tables and performing the deletes fast, no problem, but the loop takes forever to exit.

The last logged EXECUTED DELETE is the last record in tmp_dependents_table.

I've looked for triggers, and indeed one of these tables (which holds 22k records when running the function on the test data I have) had one defined on DELETE. I've deleted that trigger.

I've looked for rules. The only ones I found for my schema are rules on SELECTs.

Is there anything else that could be triggered automatically making my code take much more time than needed to complete?


Solution

  • Some FK violations seem to have caused the problem. One possible solution: Define your FK constraints with ON DELETE CASCADE - if you want dependent rows to be deleted also.
    See:

    While being at it, you can largely simplify your function, and make it cheaper and safer:

    CREATE OR REPLACE FUNCTION myschema.delete_dependents(p_id smallint)
      RETURNS void
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       _sql text;
       _tbl regclass;
    BEGIN
       FOR _tbl IN 
          SELECT ot.table_oid  -- schema_name & table_name are redundant noise
          FROM   related_tables_recursive(current_schema()) ot  -- function can probably simplified, too
          WHERE  ot.level > 1  -- excludes null
          AND    EXISTS (
             SELECT FROM pg_attribute pga
             WHERE  pga.attrelid = ot.table_oid
             AND    pga.attname = 'col_id'
             AND    NOT pga.attisdropped  -- must be ruled out
             )
          GROUP BY ot.table_oid  -- why would you expect duplicates?
          ORDER BY max(ot.level) DESC, ot.table_oid
          -- nulls have been excluded, so no point in "NULLS LAST"
          -- make sort order unambiguous to avoid deadlocks with concurrent transactions
       LOOP
           -- _sql := '';  -- noise
           _sql := format('DELETE FROM %s WHERE col_id = $1', _tbl);
    
           RAISE NOTICE 'EXECUTING --> %', _sql;
           EXECUTE _sql USING p_id;  -- pass value as value
           RAISE NOTICE 'EXECUTED DELETE.';
       END LOOP;
    END
    $func$;
    

    You don't need a temporary table for this.

    Your original is vulnerable to SQL injection. Identifiers have to be treated as possibly unsafe user input and must be double-quoted where necessary to be sure. About that, and the role of regclass: