Search code examples
mysqljoinsubqueryquery-optimizationstored-functions

MySQL Query (Sub Queries + Composed Functions + JOIN operations) takes too long to run


How can I revise the following query with subqueries composed of functions and join queries in functions. I want to append extra values to my main query that relies on main tables and two primary joins (risks, users) on several occasions.

Creating a MCVRE (Minimal Complete Verifiable Reproduceable Example) proved to be somewhat challenging because of request sent to SQL Fiddle has too many rows (too many text characters) After removing nearly all rows on main two tables ( users, risks ) I ended up with a running query.

The Fiddle (http://www.sqlfiddle.com/#!9/1d52a0/17) create functions and insertion of data commands have reduced rows from actual example on my local pc due to character count of 8000 being exceeded for request payload for SQLFiddle to understand.

Actual table has about 100 rows for risks, and 20 or so rows for users and takes about 3 seconds to run

What can I do to speed up query, via staving desired function results in table, or by revision, index insertion, movement of joins to outer main query, or even using stored procedure, or rewriting query structure, to reduce execution time to possibly half the time or less optimistically. SQL fiddle does not take all rows needed so I pasted a very limited subset, Even SQLFiddle query (see total select query below) does not run, due to Stack Overflow (pun partially intended).

http://www.sqlfiddle.com/#!9/1d52a0/17

Base Queries that do run on the fiddle (see fiddle)

select * from users;
select * from risks;
select * from riskevents;
select * from riskmatrixthresholds;
select * from risklevels;

#significantly minimized result set but still query does not run due to stack overflow issue on sql fiddle - see fiddle result (on bottom most portion of fiddle query output)

SELECT  r.RiskID,   
    r.CreatorID,
    r.OwnerID,
    r.ApproverID,
    r.RiskTitle,
    r.RiskStatement,
    r.ClosureCriteria,
    r.RiskState,
    r.Context                                                                               AS 'Context',
    GetRiskUserLastOrFirstName(GetRiskUserID('Creator', r.RiskID,0),r.RiskID, 'Last','')    AS 'creator.lastname',
    GetRiskUserLastOrFirstName(GetRiskUserID('Creator', r.RiskID,0),r.RiskID, 'First','')   AS 'creator.firstname',
    GetRiskUserLastOrFirstName(GetRiskUserID('Owner',   r.RiskID,0),r.RiskID, 'Last','')    AS 'owner.lastname', 
    GetRiskUserLastOrFirstName(GetRiskUserID('Owner',   r.RiskID,0),r.RiskID, 'First','')   AS 'owner.firstname',
    GetRiskUserLastOrFirstName(GetRiskUserID('Approver',r.RiskID,0),r.RiskID, 'Last','')    AS 'approver.lastname',
    GetRiskUserLastOrFirstName(GetRiskUserID('Approver',r.RiskID,0),r.RiskID, 'First','')   AS 'approver.firstname',
    r.Likelihood                                                                            AS 'OriginalLikelihood',
    r.Technical                                                                             AS 'OriginalTechnical',
    r.Schedule                                                                              AS 'OriginalSchedule',
    r.Cost                                                                                  AS 'OriginalCost',
    GREATEST(r.Technical, r.Schedule, r.Cost)                                               AS 'OriginalConsequence',
    RiskValue(r.Likelihood, GREATEST(r.Technical, r.Schedule, r.Cost),0)                    AS 'OriginalValue',
    RiskLevel(RiskValue(r.Likelihood, GREATEST(r.Technical, r.Schedule, r.Cost),0),'')      AS 'OriginalLevel',
    LatestEventDate(r.RiskID, r.AssessmentDate,'')                                          AS 'LatestEventDate',
    r.AssessmentDate                                                                        AS 'AssessmentDate',
    (SELECT CurrentLikelihood(r.RiskID,0))                                                  AS 'CurrentLikelihood',
    (SELECT CurrentConsequence(r.RiskID,0))                                                 AS 'CurrentConsequence',
    (SELECT CurrentRiskValue(r.RiskID,0))                                                   AS 'CurrentValue',
    (SELECT RiskLevel(CurrentRiskValue(r.RiskID,0),''))                                     AS 'CurrentLevel'
FROM  risks r;

Create Function Script

   CREATE TABLE `riskevents` (
  `ID` int NOT NULL AUTO_INCREMENT,
  `EventID` int ,
  `RiskID` int ,
  `EventTitle` text,
  `EventStatus` varchar(10) ,
  `EventOwnerID` int ,
  `ActualDate` date ,
  `ScheduleDate` date ,
  `BaselineDate` date ,
  `ActualLikelihood` int ,
  `ActualTechnical` int ,
  `ActualSchedule` int ,
  `ActualCost` int ,
  `ScheduledLikelihood` int ,
  `ScheduledTechnical` int ,
  `ScheduledSchedule` int ,
  `ScheduledCost` int ,
  `BaselineLikelihood` int ,
  `BaselineTechnical` int ,
  `BaselineSchedule` int ,
  `BaselineCost` int ,
  PRIMARY KEY (`ID`)
)
CREATE TABLE `risklevels` (
  `ID` int NOT NULL AUTO_INCREMENT,
  `RiskLevelID` int ,
  `RiskMaximum` float ,
  `RiskHigh` float ,
  `RiskMedium` float ,
  `RiskMinimum` float ,
  PRIMARY KEY (`ID`)
) 

CREATE TABLE `riskmatrixthresholds` (
  `ID` int NOT NULL AUTO_INCREMENT,
  `CellID` int ,
  `Likelihood` int ,
  `Consequence` int ,
  `Level` decimal(2,2) ,
  PRIMARY KEY (`ID`)
) 

CREATE TABLE `risks` (
  `ID` int NOT NULL AUTO_INCREMENT,
  `RiskState` varchar(10) ,
  `RiskID` int ,
  `RiskTitle` text CHARACTER SET latin1,
  `RiskStatement` text CHARACTER SET latin1,
  `ApproverID` int ,
  `OwnerID` int ,
  `CreatorID` int ,
  `Likelihood` int ,
  `Technical` int ,
  `Schedule` int ,
  `Cost` int ,
  `ClosureCriteria` text CHARACTER SET latin1,
  `CategoryID` int ,
  `AssessmentDate` date ,
  `CompletionDate` date ,
  `ClosureDate` date ,
  `Context` text,
  PRIMARY KEY (`ID`),
  UNIQUE KEY `risk_index` (`RiskID`)
) 

CREATE TABLE `users` (
  `ID` int NOT NULL AUTO_INCREMENT,
  `UserID` int NOT NULL,
  `LastName` char(25) ,
  `FirstName` char(15) ,
  `Title` char(20) ,
  `Email` varchar(30) ,
  `Phone` char(12) ,
  `Extension` char(4) ,
  `Department` char(25) ,
  PRIMARY KEY (`ID`),
  UNIQUE KEY `user_index` (`UserID`),
  KEY `SURROGATE` (`UserID`)
)

insert  into `riskevents`(`ID`,`EventID`,`RiskID`,`EventTitle`,`EventStatus`,`EventOwnerID`,`ActualDate`,`ScheduleDate`,`BaselineDate`,`ActualLikelihood`,`ActualTechnical`,`ActualSchedule`,`ActualCost`,`ScheduledLikelihood`,`ScheduledTechnical`,`ScheduledSchedule`,`ScheduledCost`,`BaselineLikelihood`,`BaselineTechnical`,`BaselineSchedule`,`BaselineCost`) values 
(171,0,1,'Risk','Complete',5,'2019-06-14',NULL,'2019-06-14',5,2,2,5,NULL,NULL,NULL,NULL,5,2,2,5),
(184,0,10,'Risk','Complete',21,'2019-10-07',NULL,'2019-10-07',5,4,5,4,NULL,NULL,NULL,NULL,5,4,5,4));

insert  into `risklevels`(`ID`,`RiskLevelID`,`RiskMaximum`,`RiskHigh`,`RiskMedium`,`RiskMinimum`) values 
(1,1,1,0.55,0.3,0);

insert  into `riskmatrixthresholds`(`ID`,`CellID`,`Likelihood`,`Consequence`,`Level`) values 
(1,1,1,1,0.09),
(2,2,1,2,0.12),
(3,3,1,3,0.16),
(4,4,1,4,0.19),
(5,5,1,5,0.23),
(6,6,2,1,0.12),
(7,7,2,2,0.19),
(8,8,2,3,0.27),
(9,9,2,4,0.34),
(10,10,2,5,0.41),
(11,11,3,1,0.16),
(12,12,3,2,0.27),
(13,13,3,3,0.37),
(14,14,3,4,0.48),
(15,15,3,5,0.59),
(16,16,4,1,0.19),
(17,17,4,2,0.34),
(18,18,4,3,0.48),
(19,19,4,4,0.63),
(20,20,4,5,0.77),
(21,21,5,1,0.23),
(22,22,5,2,0.41),
(23,23,5,3,0.59),
(24,24,5,4,0.77),
(25,25,5,5,0.95);

insert  into `risks`(`ID`,`RiskState`,`RiskID`,`RiskTitle`,`RiskStatement`,`ApproverID`,`OwnerID`,`CreatorID`,`Likelihood`,`Technical`,`Schedule`,`Cost`,`ClosureCriteria`,`CategoryID`,`AssessmentDate`,`CompletionDate`,`ClosureDate`,`Context`) values 
(1,'Completed',1,'t','t',1,5,1,5,2,2,5,'t',NULL,'2019-06-14','2020-09-26',NULL,'t'),
(2,'Completed',2,'t','t',2,1,1,5,3,4,2,'test',NULL,'2019-05-14',NULL,NULL,'t'),

insert  into `users`(`ID`,`UserID`,`LastName`,`FirstName`,`Title`,`Email`,`Phone`,`Extension`,`Department`) values 
(1,1,'Admin','','Admin','[email protected]','17890','1234',''),
(2,2,'Last','First','Engineer','[email protected]','123890','1234','Supplier');

CREATE FUNCTION Consequence(technical int, sched int, cost int, consequence int) RETURNS int
BEGIN
    select GREATEST(technical, sched, cost) into consequence;
return consequence;
END;

CREATE FUNCTION CurrentRiskEventID(riskidentifier int, eid int) RETURNS int
BEGIN
select MAX(e.EventID) into eid
FROM riskevents e
WHERE e.eventstatus not in('Open')
AND e.riskid = riskidentifier;
return riskeventid;
END;

CREATE FUNCTION CurrentConsequence(riskidentifier int, currentconsequence int) RETURNS int
BEGIN
SELECT coalesce(
       (SELECT GREATEST(actualtechnical, actualschedule, actualcost)
FROM   riskevents 
WHERE  id = CurrentRiskEventID(riskidentifier, 0) 
and    actualtechnical is not null
ANDactualschedule is not null
andactualschedule is not null),
       (SELECT greatest(technical, schedule, cost)
       from risks 
       Where riskid = riskidentifier)
) into currentconsequence;
return currentconsequence;
END;

CREATE FUNCTION CurrentLikelihood(riskidentifier int, currentlikelihood int) RETURNS int
BEGIN
SELECT coalesce( 
(SELECT actuallikelihood
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)),
(SELECT r.likelihood
FROM risks r
WHERE r.riskid = riskidentifier)) into currentlikelihood;
return currentlikelihood;
END;

CREATE FUNCTION CurrentRiskLevel(riskidentifier int, currentrisklevel int) RETURNS int
BEGIN
select RiskLevel(CurrentRiskValue(riskidentifier, 0), '') into currentrisklevel;
return currentrisklevel;
END;

CREATE FUNCTION CurrentRiskValue(riskidentifier int, currentriskvalue int) RETURNS int
BEGIN
SELECT RiskValue(CurrentLikelihood(riskidentifier, 0), CurrentConsequence(riskidentifier, 0), 0) into currentriskvalue;
return currentriskvalue;
END;

CREATE FUNCTION GetRiskUserID(riskusertype VARCHAR(25), riskidentifier int, riskuserid int) RETURNS int
BEGIN
SELECT COALESCE(userres.userid, 0) into riskuserid FROM
(
SELECT r.creatorid, r.ownerid, r.approverid, u.userid
FROM risks r, users u
WHERE r.riskid = (select riskidentifier) and
      (
((select riskusertype) = 'Creator' AND u.userid = r.creatorid) OR
((select riskusertype) = 'Approver' AND u.userid = r.approverid) OR
((select riskusertype) = 'Owner' AND u.userid = r.ownerid)
)
) userres;
RETURN riskuserid;
END;
CREATE FUNCTION GetRiskUserLastOrFirstName(riskuserid int, riskid int, whichname char(25), firstorlastname char(25)) RETURNS char(25) CHARSET utf8 COLLATE utf8_unicode_ci
BEGIN
SELECT (case
  when whichname = 'Last'  then u.LastName
  WHEN whichname = 'First' THEN u.FirstName
end)
into firstorlastname
FROM users u,risks r 
WHERE u.UserID = riskuserid
AND r.RiskID = riskid;
return firstorlastname;
END;

CREATE FUNCTION LatestEventDate(riskidentifier int, riskassessmentdate date, latestdate date) RETURNS date
BEGIN
SELECT COALESCE(
(SELECT ActualDate FROM riskevents evt WHERE evt.eventid = CurrentRiskEventID(riskidentifier, 0) and evt.riskid = riskidentifier), 
(SELECT riskassessmentdate)) into latestdate;
return latestdate;
END;

CREATE FUNCTION RiskLevel(riskvalue int, risklevel varchar(4)) RETURNS varchar(4)
begin
SELECT
       CASE 
    WHEN riskvalue >= levels.riskhigh*100 THEN 'High'
    WHEN riskvalue >= levels.riskmedium*100 THEN 'Med'
    ELSE 'Low'
       ENd as cat into risklevel
FROM risklevels levels;
return risklevel;
END;

CREATE FUNCTION RiskValue(likelihood int, consequence int, riskvalue int) RETURNS int
BEGIN
SELECT m.level*100 INTO riskvalue FROM riskmatrixthresholds m WHERE m.likelihood = likelihood AND m.consequence = consequence;
  RETURN riskvalue;   
END;

http://www.sqlfiddle.com/#!9/1d52a0/17


Solution

  • Application of the following changes in CurrentLikelihood() and CurrentConsequence() reduced total query execution time to exec 0.070 sec, total 0.082 sec.

    Old Current Likelihood Query (producing slow and incorrect output)

    SELECT coalesce( 
    (SELECT actuallikelihood
    FROM riskevents
    WHERE id = CurrentRiskEventID(riskidentifier, 0)),
    (SELECT r.likelihood
    FROM risks r
    WHERE r.riskid = riskidentifier)) into currentlikelihood;
    return currentlikelihood;
    

    Working CurrentLikelihood Query

    SELECT actuallikelihood INTO currentlikelihood 
    FROM riskevents
    WHERE eventid = CurrentRiskEventID(riskidentifier)
    AND riskid = riskidentifier;
    

    Old CurrentConsequence Query (producing slow and incorrect output)

    SELECT coalesce(
           (SELECT GREATEST(actualtechnical, actualschedule, actualcost)
    FROM   riskevents 
    WHERE  id = CurrentRiskEventID(riskidentifier, 0) 
    and    actualtechnical is not null
    and actualschedule is not null),
           (SELECT greatest(technical, schedule, cost)
           from risks 
           Where riskid = riskidentifier)
    ) into currentconsequence;
    

    Working CurrentConsequence Query

    SELECT GREATEST(actualtechnical, actualschedule, actualcost) INTO currentconsequence
    FROM    riskevents
    WHERE   eventid = CurrentRiskEventID(riskidentifier)
    AND riskid = riskidentifier;
    

    Old CurrentRiskEventID() Query

    select MAX(e.EventID) into currentriskeventid
    FROM riskevents e
    WHERE e.eventstatus not in('Open')
    AND e.riskid = riskidentifier;
    

    Modified GetRiskEventID() function

    SELECT MAX(e.EventID) INTO currentriskeventid
        FROM riskevents e
        WHERE e.riskid = riskidentifier AND 
            (e.eventstatus != 'Open'
            OR
            (e.EventID = 0 AND e.eventstatus = 'Open'));