Good afternoon! I am trying to write a query using SQL in Teradata to show me the Z - score for each year in my data. I thought this should be a pretty simple problem, but I keep getting unexpected outputs or errors with each variation I try, and I'm not sure exactly what I'm doing wrong. I found a few similar posts on here, but I wasn't able to get them working for my specific situation. I'm still learning SQL so please bear with me and let me know if you need any more information. Thanks in advance for any assistance!
Formula for Zscore:
z = (x-μ)/σ, where x is the raw score, μ is the population mean, and σ is the population standard deviation
My data is structured like this: data structure But has a few more columns with customer info in them but for the purposes of this report I won't be using any of those and only care about the order year, not any of the other date info.
Order_Year | Order_Count | Zscore |
---|---|---|
2023 | 1443553 | -2.175361497 |
2019 | 11298753 | 0.313237462 |
2022 | 13058147 | 0.757513182 |
2020 | 10673440 | 0.155335716 |
2021 | 12912660 | 0.720775338 |
2018 | 10963180 | 0.228499799 |
select
OrderInfo.Order_Year
,sum(OrderInfo.Order_Count) as Order_Cnt
,(sum(OrderInfo.Order_Count) - AVG(sum(OrderInfo.Order_Count)) OVER (PARTITION BY OrderInfo.Order_Year)) /
STDDEV_samp(sum(OrderInfo.Order_Count)) OVER (PARTITION BY OrderInfo.Order_Year) as zscore
from ENTERPRISE_STATS.OrderInfo
group by OrderInfo.Order_Year
order by OrderInfo.Order_Year
;
Order_Year | Order_Cnt | zscore |
---|---|---|
2018 | 10963180 | |
2019 | 11298753 | |
2020 | 10673440 | |
2021 | 12912660 | |
2022 | 13058147 | |
2023 | 1443553 |
I tried a couple variations of this and I think it has something to with the sums, but if I remove any of those, it gives me this error:
Executed as Single statement. Failed [3504 : HY000] Selected non-aggregate values must be part of the associated group. Elapsed time = 00:00:00.047
STATEMENT 1: Select Statement failed.
WITH Order_Cnt AS(
SELECT
OrderInfo.Order_Year
,SUM(OrderInfo.Order_Count) AS Order_Cnt
FROM ENTERPRISE_STATS.OrderInfo
GROUP BY OrderInfo.Order_Year
ORDER BY OrderInfo.Order_Year
)
SELECT
OrderInfo.Order_Year, Order_Cnt,
CAST((Order_Cnt - AVG(Order_Cnt) over())/
STDDEV(Order_Cnt) over () as decimal(10,5)) AS Zscore
FROM Order_Cnt