Search code examples
sqlaggregate-functionswindow-functions

How to write calculate columns in SQL


This is a sample of my database: enter image description here

I'm trying to write a query that returns the following:

worker_id,
avg salary for 2020 (basic_salary+bonus),
basic_salary for 2021,
ever been in level C (yes/no),
average basic_salary \>6000 (yes/no),
how many salaries the employee had until today.

here's a query i wrote:

// select worker_id, AVG(CASE WHEN year(month)=2020
                THEN BASIC_SALARY+bonus
                ELSE NULL END) AS avg_2020,
avg(case when year(month)=2021 
then BASIC_SALARY
else null end) as avg_2021,
case when t1.level='C' then 'YES' 
else 'NO'
end as 'Level C',
CASE WHEN AVG(BASIC_SALARY) >6000 
            THEN 'yes'
            ELSE 'no' 
       END AS 'above_6000',
count(BASIC_SALARY) 'HowManySalaries'
from [workers_table] t1
join data t2
on t1.level=t2.level
group by worker_id,t1.level
order by WORKER_ID
//

and the results:

worker_id   avg_2020    avg_2021    Level C above_6000  HowManySalaries
2011-11-11  NULL    3500    NO  no  4
2011-11-11  NULL    5000    NO  no  2
2011-11-11  NULL    4000    YES yes 3
2011-11-12  8666    4500    NO  yes 4
2011-11-12  9666    NULL    NO  yes 3
2011-11-12  7000    6000    YES no  2
2011-11-12  9000    NULL    NO  no  1
2011-11-13  NULL    10000   NO  yes 3
2011-11-13  9000    4500    NO  no  3
2011-11-13  10000   18000   YES yes 5
2011-11-13  12500   6750    NO  yes 5
2011-11-14  NULL    3600    NO  no  1
2011-11-14  NULL    7200    NO  yes 2
2011-11-14  NULL    4800    YES no  1

maybe there is a better way to write the query? i need to get 1 line for each worker_id.


Solution

  • Here's my solution. I just briefly checked through the result/values, so it may not be 100% correct, but seems correct. Double check for correctness before using this. :)

    select 
        worker_id,
        AVG(CASE WHEN year(w.month)=2020 THEN w.BASIC_SALARY+b.bonus ELSE NULL END) AS avg_2020,
        AVG(CASE WHEN year(w.month)=2021 THEN w.BASIC_SALARY ELSE NULL END) AS avg_2021,
        CASE WHEN MAX(CASE WHEN w.level='C' THEN 1 ELSE 0 END)=1 THEN 'YES' ELSE 'NO' END AS 'Level C',
        CASE WHEN AVG(w.BASIC_SALARY) > 6000 THEN 'yes' ELSE 'no' END AS 'above_6000',
        COUNT(w.BASIC_SALARY) 'HowManySalaries'
    FROM workers_table w
    JOIN data b on b.level=w.level
    GROUP BY w.worker_id
    

    In your query you're grouping by both worker id and the bonus level. So your rows end up with all possible combinations between those two values. Limiting it to just the worker_id results in one row per worker. Though this will cause your "in level C" column to not work properly since a basic CASE WHEN isn't paying attention to all rows given for a specific worker_id. Expanding the CASE WHEN with a MAX-aggregation will fix this though.