Search code examples
sqlsql-serveradventureworks

COUNT and MAX/MIN and (HAVING?) in one query/ AdventureWorks2017 task


I have a training task in AdventureWorks2017 DB. The task is the following: I need a list of the Jobtitle in which the fewest and most women work in proportions. Only consider those in which at least 4 people work.

So far my code looks like this:

select a.JobTitle, AVG(ertek) as avg_women, COUNT(*) as sum_women
from
(select BusinessEntityID, JobTitle, Gender, CASE WHEN Gender = 'F' THEN 1.0 ELSE 0.0
END as ertek
from HumanResources.Employee) a
group by a.JobTitle

This gives me back all the JobTitles, the avarage of women in those titles and the sum of the women in the jobtitles. The problem is that I cant use MAX and MIN in the query (Maybe I need to create another subquery?). And I need to get the last part of the task "Only consider those in which at least 4 people work." Thanks for the help!

ER diagram: https://homel.vsb.cz/~dan11/ddj/AdventureWorks2008_db_diagram.pdf


Solution

  • Take these kind of requirements one at a time and build your solution gradually.

    Run your query after each additional step to confirm the expected result!

    Steps

    1. Only job titles with more than 4 people:
      group by e.JobTitle having count(1) >= 4
    2. Count women proportionally (I interpret this as "the relative number of women within one job title"). This requires the overall count and the women only count:
      count(1) as JobTitleCount and
      count(case when e.Gender = 'F' then 1 end) as FemaleCount
      The relative (proportional) number or percentage then becomes:
      count(case when e.Gender = 'F' then 1 end)*100.0/count(1) as FemalePercentage
    3. "Fewest" and "most" means ranking functions with an order by clause:
      dense_rank() over(order by <percentage from step 2 comes here> ) as RankLeast and
      dense_rank() over(order by <percentage from step 2 comes here> desc) as RankMost
    4. Filter on the job titles that are in first place for either ranking:
      where jfpr.RankLeast = 1 or jfpr.RankMost = 1

    Intermediate result

    Steps 1. to 3.

    select  e.JobTitle,
            count(1) as JobTitleCount,
            count(case when e.Gender = 'F' then 1 end) as FemaleCount,
            count(case when e.Gender = 'F' then 1 end)*100.0/count(1) as FemalePercentage,
            dense_rank() over(order by count(case when e.Gender = 'F' then 1 end)*100.0/count(1)     ) as FemalePercentageRankLeast,
            dense_rank() over(order by count(case when e.Gender = 'F' then 1 end)*100.0/count(1) desc) as FemalePercentageRankMost
    from HumanResources.Employee e
    group by e.JobTitle
    having count(1) >= 4
    order by FemalePercentage;
    

    Full solution

    Moving the entire previous query in a subquery jfpr and leaving out the columns we no longer need for the final result (JobTitleCount and FemaleCount). The order by moves to outside the subquery.

    select  jfpr.JobTitle,
            jfpr.FemalePercentage
    from (  select  e.JobTitle,
                    count(case when e.Gender = 'F' then 1 end)*100.0/count(1) as FemalePercentage,
                    dense_rank() over(order by count(case when e.Gender = 'F' then 1 end)*100.0/count(1)     ) as FemalePercentageRankLeast,
                    dense_rank() over(order by count(case when e.Gender = 'F' then 1 end)*100.0/count(1) desc) as FemalePercentageRankMost
            from HumanResources.Employee e
            group by e.JobTitle
            having count(1) >= 4 ) jfpr -- job female percentage rank
    where jfpr.FemalePercentageRankLeast = 1
       or jfpr.FemalePercentageRankMost = 1
    order by jfpr.FemalePercentage;
    

    Result

    On my copy of AdventureWorks (no idea what version I have right now) this produces:

    JobTitle                      FemalePercentage
    ----------------------------  ----------------
    Quality Assurance Technician   0.000000000000
    Scheduling Assistant           0.000000000000
    Janitor                       50.000000000000
    Application Specialist        50.000000000000