Search code examples
sqlpostgresqlmetabase

SQL-Query get best incidents by solution_time for each month


I have the following SQL-Query in Metabase:

SELECT
  date_trunc('month', "resolvedAt") AS "resolvedAt",
  (CAST(avg("solution_time") AS float) / CASE WHEN 3600.0 = 0 THEN NULL ELSE 3600.0 END) AS "Average Resolution Time",
  12 AS "Minium",
  8 AS "Expected"
FROM "custom_Incident"
RIGHT JOIN "ims_Incident" ON "incident" = "ims_Incident"."id"
WHERE {{resolveAt}}
  AND {{assigned_group_name}}
  AND {{service}}
  AND {{incident_type}}
  AND "statusDetail" = 'Closed'
  AND "ims_Incident"."uid" NOT IN
    (SELECT "uid"
     FROM "ignore_Incident")
  AND "ims_Incident"."uid" IN
    (SELECT "incident_id"
     FROM "custom_Incident"
     WHERE "custom_Incident"."incident_id" in
         (SELECT "uid"
          FROM "ims_Incident"
          WHERE {{resolveAt}}
            AND {{assigned_group_name}}
            AND {{service}}
            AND {{incident_type}}
            AND "statusDetail" = 'Closed'
            AND "ims_Incident"."uid" NOT IN
              (SELECT "uid"
               FROM "ignore_Incident"))
     ORDER BY "solution_time"
     LIMIT {{take_into_account}})
GROUP BY date_trunc('month', "ims_Incident"."resolvedAt")

And I want to get the best ... let`s say 100 ... Incidents orderd by "solution_time" PER MONTH. So now I get the best with {{take_into_account}} (100) for the hole timeperiod of {{resolveAt}} but I want to split the {{resolveAt}} timeperiod (maybe 3 month) and for each of the month I want to get the best 100 Incidents orderd by "solution_time". So with 3 month there would be 300 incidents with {{take_into_account}} = 100 there would be 100 for each month.

How can I do that?


Solution

  • Introduce ROW_NUMBER() OVER(PARTITION BY date_trunc('month', "resolvedAt") ORDER BY "solution_time") into your query, this will assign a set of integers starting at 1 for each month (that's what the "partition" does) and the row numbers increase with solution_time, so the shortest solution time in each month will have a row number of 1. Then it is a matter of selecting by those row numbers:

    SELECT
      date_trunc('month', "resolvedAt") AS "resolvedAt",
      (CAST(avg("solution_time") AS float) / CASE WHEN 3600.0 = 0 THEN NULL ELSE 3600.0 END) AS "Average Resolution Time",
      12 AS "Minium",
      8 AS "Expected"
    
    FROM "ims_Incident"
    LEFT JOIN "custom_Incident" ON "ims_Incident"."incident" = "ims_Incident"."id"
    
    WHERE {{resolveAt}}
      AND {{assigned_group_name}}
      AND {{service}}
      AND {{incident_type}}
      AND "statusDetail" = 'Closed'
      AND "ims_Incident"."uid" NOT IN
      
        (SELECT "uid"
         FROM "ignore_Incident"
         WHERE "uid" IS NOT NULL)
         
      AND "ims_Incident"."uid" IN
        (SELECT "incident_id"
         FROM (
            SELECT 
              "incident_id", 
              ROW_NUMBER() OVER(PARTITION BY date_trunc('month', "resolvedAt") ORDER BY "solution_time") as rn
            FROM "custom_Incident"
            WHERE "custom_Incident"."incident_id" in
                (SELECT "uid"
                 FROM "ims_Incident"
                 WHERE {{resolveAt}}
                   AND {{assigned_group_name}}
                   AND {{service}}
                   AND {{incident_type}}
                   AND "statusDetail" = 'Closed'
                   AND "ims_Incident"."uid" NOT IN
                     (SELECT "uid"
                      FROM "ignore_Incident"))
          ) t
          WHERE t.rn <= {{take_into_account}})
    GROUP BY date_trunc('month', "ims_Incident"."resolvedAt")
    

    As mentioned in comments above using "right join" is unnecessary, and generally there is a preference to left join optional data instead. Also be wary of using "not in" against any subquery that might return a NULL value, because NOT IN with NULL produces very unexpected results.