I've got a table with 11 columns for track and field and I've created an SQL query that sorts the table by a specific event, age group and gender, takes out all the duplicates and keeps the one with the best result. For some events it also differentiates between the results that had wind readings and those that didn't as well as if the wind was over a certain speed.
@export on;
@export set filename="/home/filename.csv" CsvColumnDelimiter=",";
select * from Results2015;
SELECT resu.Result,
resu.Wind,
resu.`Full Name`,
resu.Province,
resu.BirthDate,
resu.Position,
resu.Location,
resu.Date
FROM Results2015 resu
JOIN (
SELECT MIN(Result) BestResult,
Wind, `Full Name`, Gender,
Event, Agegroup
FROM Results2015
GROUP BY `Full Name`, Gender, Event, Agegroup
) best
ON resu.Result = best.BestResult
AND resu.Wind = best.Wind
AND resu.`Full Name` = best.`Full Name`
AND resu.Gender = best.Gender
AND resu.Event = best.Event
AND resu.Agegroup = best.Agegroup
WHERE resu.Event = '100m'
AND resu.Gender = 'F'
AND resu.Agegroup = 'Junior'
AND resu.Wind <> ''
AND resu.Wind <= 2
ORDER BY resu.Result asc;
It works quite well but I've noticed that it's been missing a lot of the results that includes wind readings and I have no idea why. Here's a sample of the table I use
Result Wind Full Name Province BirthDate Position Location Date Event Gender Agegroup
12.78 -3.6 Name 4 WPA D.o.B 6 Bellville 3-Feb 100m F Junior
12.87 -3.6 Name 2 WPA D.o.B 7 Bellville 8-Feb 100m M Youth
12.64 -0.8 Name 3 WPA D.o.B 2 Bellville 8-Feb 100m F Junior
12.02 -0.8 Name 4 WPA D.o.B 1 Bellville 8-Feb 100m F Junior
12.84 -0.8 Name 5 WPA D.o.B 3 Bellville 8-Feb 100m F Junior
13.07 -0.8 Name 6 WPA D.o.B 4 Bellville 8-Feb 100m F Junior
13.23 -0.8 Name 7 WPA D.o.B 5 Bellville 8-Feb 100m F Junior
13.71 -4.3 Name 8 WPA D.o.B 1 Bellville 8-Feb 100m F Junior
13.85 -4.3 Name 9 WPA D.o.B 2 Bellville 8-Feb 100m F Junior
14.33 -4.3 Name 10 WPA D.o.B 3 Bellville 8-Feb 100m F Junior
14.69 Name 11 WPA D.o.B 4 Bellville 2-Feb 100m F Junior
13.11 -2.9 Name 12 WPA D.o.B 1 Bellville 8-Feb 100m F Sub-Youth
13.43 -2.9 Name 13 WPA D.o.B 2 Bellville 8-Feb 100m F Sub-Youth
13.53 -2.9 Name 12 WPA D.o.B 3 Bellville 14-Feb 100m F Sub-Youth
13.60 -1.5 Name 15 WPA D.o.B 1 Bellville 14-Feb 100m F Sub-Youth
For some reason it completely skips Name 4 in the output. The data for Name 4 is exactly the same as the other entries and it'll display those, but it completely excludes Name 4 as well as other entries, but only those that have windreadings as far as I can tell.
If I add Wind to the GROUP BY part for
GROUP BY `Full Name`, Gender, Event, Agegroup, Wind
It does display all the correct results, but then there are a lot of duplicates that I want to avoid.
Any idea on what's happening?
I use DbVisualizer Pro for all my SQL queries
SQLFiddle Sample here http://www.sqlfiddle.com/#!2/f8958/1 The problem lies with Tamza Bay not showing in the output
That's because Name 4 is in AgeGroup 'Youth', not 'Junior' as your query specifies.
EDIT after proper data added and fiddler played with:
This is your query:
(Edit2 - query deleted for space/)
The difference is your GROUP BY on the inner query. You had not specified Wind as part of it. Most DBMSs will error on that but MySQL doesn't. It also does some odd things with queries with malformed GROUP BY's - i.e. queries with a group by which do not include all the un-aggregated columns in the select. In this case, it caused that GROUP BY to return the 2.4 wind speed which then gets excluded. It's a known 'feature' of MySQL.
EDIT 2: I changed te fiddle to include this query:
SELECT MIN(`Result`) `BestResult`,
`Full Name`, `Gender`,
`Event`, `Agegroup`
FROM `Results2015`
GROUP BY `Full Name` , `Gender`, `Event`, `Agegroup`;
...which is your inner query and used to select the correct result from the full table. This gave a best result of 11.64 for Tamza Bay, which is indeed the best result. However, the wind in that race was 2.4 and so the outer Where excludes it. That's why.
Possibly you need to include the outer where in the inner query:
SELECT MIN(`Result`) `BestResult`,
`Full Name`, `Gender`,
`Event`, `Agegroup`
FROM `Results2015`
WHERE `Event` = '100m'
AND `Gender` = 'F'
AND `Agegroup` = 'Junior'
AND `Wind` <> ''
AND `Wind` <= 2
GROUP BY `Full Name` , `Gender`, `Event`, `Agegroup`
Basically, make sure the inner query reyturns the results you want for the people, and then slot it into the main query.