I have two queries in a Microsoft Access database. They are named Average
and Home_Runs
. They both share the same first three columns Name
, [Year]
and Month
.
Query: Average
Name Year Month Average
Cabrera 2013 5 .379
Fielder 2013 5 .245
Martinez 2013 5 .235
Cabrera 2013 6 .378
Fielder 2013 6 .278
Martinez 2013 6 .240
Query: Home_Runs
Name Year Month Home Runs
Cabrera 2013 5 12
Fielder 2013 5 2
Martinez 2013 5 2
Cabrera 2013 6 9
Fielder 2013 6 4
Martinez 2013 6 4
I need to offset the data before I begin the calculations. I need to determine how the Home Runs from one month relate the the Average from the previous month. So it is not a direct month-to-month comparison. I need to perform a month-to-previous-month comparison.
I need to do two calculate two things from these two queries.
First: With Average
being the X-axis and Home_Runs
being the Y-Axis. I need to find the correlation between these data points.
Second: With Average
being the X-axis and Home_Runs
being the Y-Axis. I need to find the equation of the best-fit-line between all of these data points. More specifically I need to find the value of the Y variable when the X variable equals certain values.
Additional Information:
In the end I need to return a table that looks like this:
Calculation Tier 1 Tier 2 Tier 3 Correlation
Avgerage to Home Runs .04 3.00 6.00 .80
What is the best way to accomplish these things?
Here is the SQL Fiddle example for you to play with and tweak to get it exactly right:
SELECT (Avg(A.Paverage * H.HomeRuns) - Avg(A.Paverage) * Avg(H.HomeRuns)) /
(StDevP(A.Paverage) * StDevP(H.HomeRuns)) AS Correlation,
(Sum(A.Paverage * H.HomeRuns) - (Sum(A.Paverage) * Sum(H.HomeRuns) /
Count(*))) / (Sum(A.Paverage * A.Paverage) - (Sum(A.Paverage) * Sum(A.Paverage) / Count(*))) AS LineBestFit
FROM Averages AS A
INNER JOIN Home_Runs AS H
ON (A.Pname = H.Pname)
AND (A.Pyear = H.Pyear)
AND ((A.Pmonth - 1) = H.Pmonth)