Search code examples
sqlsql-serverperformance

Counting Eligible Employee that can Participate in a Course Plan


Suppose we have tables like below:

employee

employee_id position_name Name
E1 Staff Bob
E2 Staff Bib
E3 Manager Bub
E4 Manager Bab

courses

course_id Name
C1 Basic A
C2 Advance A
C3 Expert A
C4 General A

employee_certificate

employee_certificate_id course_id employee_id
EC1 C1 E2
EC2 C2 E2
EC3 C1 E1

course_plan

course_plan_id course_id position_name
CP1 C2 Staff
CP2 C3 Manager
CP3 C4 Staff

course_plan_requirement

course_plan_requirement_id course_id certificate_course_id
CPR1 C3 C1
CPR2 C3 C2
CPR3 C2 C1

Each employee can complete 0 to many courses and get certified which recorded in employee_certificate

Each course_plan can 0 to many course_plan requirement which recorded in course_plan_requirement

An Employee can participate in a course_plan if the employee meets the course_plan_requirement and have the corresponding position, for example: To participate in course_plan with course C3 an employee have to be a Manager and certified in course C1, and C1,

To participate in course_plan with course C4 an employee have to be a Staff and can participate even without having certified in any course,

as we can see from course_plan_requirement.

The Objective is to calculate how many employees can participate in each course_plan.

I have come up with this query

with
CPR as(
    select cp.course_id, cp.position_name, count(cpr.certificate_course_id) as RCount
    from course_plan cp
    left join course_plan_requirement cpr on cp.course_id = cpr.course_id
    group by cp.course_id, cp.position_name
),
ECR as(
    select e.employee_id, cp.course_id, e.position_name, COUNT(cpr.certificate_course_id) as RCount
    from course_plan cp 
    left join employee e on cp.position_name = e.position_name
    left join employee_certificate ec on e.employee_id = ec.employee_id
    left join course_plan_requirement cpr on ec.course_id = cpr.certificate_course_id
    group by e.employee_id, cp.course_id, e.position_name
),
CEC as(
    select e.employee_id, e.position_name, cpr.course_id, ecr.RCount as AC, cpr.RCount BC,
    case 
        when ecr.RCount is null and cpr.RCount is null then 1
        when ecr.RCount = cpr.RCount then 1 else 0
    end as Eligibility
    from CPR cpr
    left join employee e on cpr.position_name = e.position_name
    left join ECR ecr on e.employee_id = ecr.employee_id
),
EligibleEmployee as (
    select cec.position_name, cec.course_id, sum(cec.Eligibility) as EligibleMP
    from CEC cec
    group by cec.position_name, cec.course_id
)
select * 
from course_plan cp
left join EligibleEmployee ee on cp.course_id = ee.course_id and cp.position_name = ee.position_name

But this has performance issues in my real database it took a while to query. I wonder if I'm missing something obvious that can simplify the query process.

I hope that I could get some help from SQL Server experts here.


Solution

  • So as per my understanding:

    • Each course as one mandatory position
    • Each course can have zero or more certificate requirement

    You can take advantage of OUTER JOIN and GROUP BY with HAVING as follows:

    SELECT
        CP.COURSE_PLAN_ID,
        E.EMPLOYEE_ID
      FROM
             COURSES C
          JOIN COURSE_PLAN             CP ON CP.COURSE_ID = C.COURSE_ID
          JOIN EMPLOYEE                E ON E.EPOSITION_NAME = CP.POSITION_NAME
          LEFT JOIN COURSE_PLAN_REQUIREMENT CPR ON CP.COURSE_ID = CPR.COURSE_ID
          LEFT JOIN EMPLOYEE_CERTIFICATE    EC ON EC.COURSE_ID = CPR.CERTIFICATION_COURSE_ID
           AND EC.EMPLOYEE_ID = E.EMPLOYEE_ID
     GROUP BY
        CP.COURSE_PLAN_ID,
        E.EMPLOYEE_ID
    HAVING
        COUNT(DISTINCT CPR.CERTIFICATE_COURSE_ID) = COUNT(DISTINCT EC.COURDSE_ID);