Search code examples
sqlquery-optimizationfirebirdsql-deletefirebird-3.0

How to optimize SQL delete query with subselect for Firebird?


The following query is extremely slow. It seems the subselect is executed for each row in the table?!

delete from HISTORY
where ID in (
  select ID from (
    select ID, ROW_NUMBER() over(partition by SOURCE order by ID desc) as NUM from HISTORY
  ) where NUM > 100
);

This is a cleanup query. It should delete everything but the 100 most recent records per SOURCE.

The time required seems to depend only on the number of records in the table and not on how many records are to be deleted. Even with only 10,000 records it takes several minutes. However, if I only execute the sub-select, it is fast.

Of course there is a PK on ID and a FK and index on SOURCE (both are Integer columns).


Solution

  • Firebird 3 added DELETE option into MERGE clause. It was first mentioned in Release Notes. It is now properly documented in Firebird 3 SQL Reference.

    Modelling by the examples there the cleanup query would look like that:

    merge into HISTORY HDel
    using ( select ID, SOURCE, ROW_NUMBER() over
                    (partition by SOURCE order by ID desc) as NUM 
            from HISTORY ) HVal
       on (HVal.NUM > 100) and (HVal.ID = HDel.ID) and (HVal.Source = HDel.Source)
    WHEN MATCHED THEN DELETE
    

    In your specific database (HVal.Source = HDel.Source) filtering seems redundant, but i still decided to add it to make the query as generic as possibe, for future readers. Better safe than sorry :-)


    Firebird 2.x did not provide for that feature, and with FB3's MERGE/DELETE and Window Functions features missing one can fall back to explicit imperative programming and write good old loops. It would take writing and executing a small PSQL program (either a persistent named Stored Procedure or ad hoc EXECUTE BLOCK statement) with making explicit loop over SOURCE values.

    Something like (i did not syntax-check it, just scratching from memory):

    execute block as
    declare variable SRC_VAL integer;
    declare variable ID_VAL integer;
    begin
      for select distinct SOURCE from HISTORY into :SRC_VAL do begin
         :ID_VAL = NULL;
         select first(1) skip(100) ID from HISTORY
           where SOURCE = :SRC_VAL
           order by ID desc
           into :ID_VAL;
         if (:ID_VAL IS NOT NULL) then
           delete from HISTORY 
             where SOURCE = :SRC_VAL 
               and ID <= :ID_VAL;
      end
    end