Search code examples
sqlpostgresqlstored-proceduressqlperformancepostgresql-performance

PSQL replacement for stored procedure, too slow


I have property, each property has contracts, each contract has integer field rental_area.

Previously I had to get the rental_area of all contracts summed by property and this worked.

 SELECT 
        Sum(cr.rental_area)         total_property_rental_area,
        -- bunch of other cr fields
 FROM  appdata.contract_rental cr
        INNER JOIN appdata.domain_building b1 
                ON ( b1.building_id = cr.building_id ) 
        INNER JOIN appdata.domain_immovable im1 
                ON ( im1.immovable_id = b1.immovable_id ) 
 GROUP  BY im1.property_id

Now the logic changed, and contract has a list of periods, and one of those periods contain the rental_area of that contract. And finding that correct period needed some special logic.

I tried to join the logic to the query, but could not find a place, where to put it so I had to create stored procedure.

SELECT Sum(p.rental_area) total_property_rental_area 
       -- bunch of other cr fields
FROM   appdata.contract_rental cr 
       JOIN appdata.rental_period p 
         ON p.id = Get_current_period_id(cr.contract_rental_id, 
                   cr.end_date_actual) 
       INNER JOIN appdata.domain_building b1 
               ON ( b1.building_id = cr.building_id ) 
       INNER JOIN appdata.domain_immovable im1 
               ON ( im1.immovable_id = b1.immovable_id ) 
GROUP  BY im1.property_id   

Procedure:

CREATE OR REPLACE FUNCTION appdata.get_current_period_id(in contract_id_in bigint, in end_date_actual_in Date)
  RETURNS bigint AS
$BODY$
DECLARE
    period_id bigint;
BEGIN
    -- find the period that matches with end date or current date
    select id into period_id
       from rental_period
       where contract_id = contract_id_in
         and Coalesce(end_date_actual_in, Now()) >= start_date
       order by start_date desc  limit 1;
    -- if there was no period, just take the first one
    IF period_id is null THEN
        select id into period_id
           from rental_period
           where contract_id = contract_id_in
           order by start_date asc
           limit 1;
    END IF;

    return period_id;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

But now it is too slow, is there a way to put the period finding logic into the sql to make it faster, without using stored procedure? The point is that, for reach contract, it has to only get single period based on the logic.


Solution

  • Roll out your stored procedure back into the main SQL, the primary hint being coalescing your 2 queries (if one is null then use the other)

    Example: Instead of

    p.id = Get_current_period_id(cr.contract_rental_id, 
                   cr.end_date_actual)
    

    Use:

    p.pid = coalesce(
        (select rpx.id 
         from rental_period rpx
         where contract_id = cr.contract_rental_id
         and Coalesce(cr.end_date, Now()) >= start_date
         order by start_date desc  limit 1;
        ),
      ( select rpy.id 
           from rental_period rpy
           where contract_id = cr.contract_rental_id
           order by start_date asc
           limit 1;
      )
    )
    

    As per the comment below, the following index may also help:

     create index on rental_period (contract_id, start_date asc) 
    

    be sure to analyze the table afterward to update the statistics.