I am trying determine how far below/above a persons average a venue performs. For example the query would return the venue, date, and how far below the expected average it performed. From the data below you can see Pauls average was 100 not including Lakeview so he went exactly his average and Mel went 20 below her average at Lakeview so Lakeview performed -10 on average below peoples average distance.
The query return would look like this:
Venue Date Performance
-------------------------------------
Lakeview 12/08/2024 -10
Firstly I need to group by name to get their average. However I only wish to do include their last 5 distances.
I can do that with this query
SELECT name, AVG(distance) as avg_distance
FROM
(SELECT
database.*,
ROW_NUMBER() OVER (PARTITION BY name ORDER BY date DESC) AS seqnum
FROM database) s
WHERE seqnum <= 5
GROUP BY name
Then I need to group by venue and find the average distance of everyone that performed on this date and at this venue. Which I can do with this query
SELECT venue, AVG(distance)
FROM "database"
GROUP BY venue, date
Then I need to subtract the distance on this date vs their average to see how it performed. What is the easiest way to do this?
Sample Data
Name Venue Distance Date
-------------------------------------
Paul Holland 120 20/07/2024
Paul Venus 80 21/07/2024
Paul Lakeview 100 12/08/2024
Tom Lakeview 120 12/08/2024
Mel Lakeview 140 12/08/2024
Mel Hightail 160 05/05/2024
One option is to crate a cte (or subquery) with added columns with average distances per name and per venue/date. It could be done using Avg() Over() analytic function.
WITH -- S a m p l e D a t a :
tbl ( a_name, venue, distance, a_date ) AS
( Select 'Paul', 'Holland', 120, To_Date('20/07/2024', 'dd/mm/yyyy') Union All
Select 'Paul', 'Venus', 80, To_Date('21/07/2024', 'dd/mm/yyyy') Union All
Select 'Paul', 'Lakeview', 100, To_Date('12/08/2024', 'dd/mm/yyyy') Union All
Select 'Tom', 'Lakeview', 120, To_Date('12/08/2024', 'dd/mm/yyyy') Union All
Select 'Mel', 'Lakeview', 140, To_Date('12/08/2024', 'dd/mm/yyyy') Union All
Select 'Mel', 'Hightail', 160, To_Date('05/05/2024', 'dd/mm/yyyy')
),
... cte grid to get averages per name (last 5 dates) and per venue/date (all distances)
grid AS
( SELECT a.*,
Round(Avg(Case When rn_name <= 5 Then a.distance End) Over(Partition By a_name ), 2) as avg_name,
Round(Avg(a.distance) Over(Partition By venue, a_date ), 2) as avg_venue_date
FROM ( Select Row_Number() Over(Partition By a_name Order By a_name, a_date Desc) as rn_name,
a_name, venue, distance, a_date
From tbl
) a
)
-- M a i n S Q L :
Select venue, a_date, Sum(avg_venue_date - avg_name) as performance
From grid
Group By a_date, venue
Order By a_date, venue
/* R e s u l t :
venue a_date performance
----------- ---------- -----------
Hightail 2024-05-05 10.00
Holland 2024-07-20 20.00
Venus 2024-07-21 -20.00
Lakeview 2024-08-12 -10.00 */
See the fiddle here.
Addition:
If your table name is database (which is bad but anyway) and if column names /types in that table are name /varchar, venue /varchar, distance /integer and date /date (date is also very, very bad name of a column 'cause it is datatype too) then try to use the complete code like below. There is another (first) cte that should adjust your table to the rest of the code:
WITH
tbl AS
( Select t.name as a_name, t.venue, t.distance, t.date as a_date
From database t
),
grid AS
( SELECT a.*,
Round(Avg(Case When rn_name <= 5 Then a.distance End) Over(Partition By a_name ), 2) as avg_name,
Round(Avg(a.distance) Over(Partition By venue, a_date ), 2) as avg_venue_date
FROM ( Select Row_Number() Over(Partition By a_name Order By a_name, a_date Desc) as rn_name,
a_name, venue, distance, a_date
From tbl
) a
)
Select venue, a_date, Sum(avg_venue_date - avg_name) as performance
From grid
Group By a_date, venue
Order By a_date, venue