The variables in the data set are: Event, EventType, FName, LName, Age, Gender, Score
.
I am trying to create a report that gives me the Lowest 5 scores per event/event type per each Age(18-65) per Gender.
For example, I want the 5 lowest scores for everyone who participated in EventA EventtypeB who were 18 year old females then I want all the 19 year old females and so on.. For each gender. A side note- Not all ages have 5 participants.. For example there may be no 20 year olds who participated and there may only be 2 21 year olds.
I originally tried to tackle this by making a bunch of separate data sets for each age, but I know there must be a better way to do it. I would appreciate any help thank you I am pretty new to SAS but I have introductory experience in all aspects.
Here is some sample input:
Mile Sprint John Smith 19 Male 15.31
Mile Sprint Alex Doe 19 Male 13.21
Mile Sprint Ian Sore 19 Male 23.51
Mile Sprint Sean Lae 19 Male 12.34
Mile Sprint Mike Rai 19 Male 17.27
Mile Sprint Connor Te 19 Male 11.23
Mile Sprint Simon Doe 19 Male 15.21
Mile Long Jane Joy 37 Female 35.12
Mile Long Victoria K 37 Female 27.31
Mile Long Chris Li 25 Male 23.43
For the Mile Sprint 19 Males
I would want it to return:
Mile Sprint Connor Te 19 Male 11.23
Mile Sprint Sean Lae 19 Male 12.34
Mile Sprint Alex Doe 19 Male 13.21
Mile Sprint Simon Doe 19 Male 15.21
Mile Sprint John Smith 19 Male 15.31
For the Mile Long 37 Female
I would want it to just return this due to there not being 5 participants:
Mile Long Victoria K 37 Female 27.31
Mile Long Jane Joy 37 Female 35.12
With the sample input shown I am trying to get the 5 lowest scores for Mile Sprint
for Males age 19. Then the same for age 20-65. Then the same for Mile Long
for all males. Vice versa for females. With the assumption that there may not be 5 participants in a race or there may be more than 5. Is there anyway to do all of this in one or two dataset outputs?
/* Creating Sample dataset */
data input_dataset;
infile datalines dlm=",";
input Event : $10.
EventType : $ 10.
FName : $10.
LName : $10.
Age : 8.
Gender : $10.
Score : 8.
;
datalines;
Mile,Sprint,John,Smith,19,Male,15.31
Mile,Sprint,Alex,Doe,19,Male,13.21
Mile,Sprint,Ian,Sore,19,Male,23.51
Mile,Sprint,Sean,Lae,19,Male,12.34
Mile,Sprint,Mike,Rai,19,Male,17.27
Mile,Sprint,Connor,Te,19,Male,11.23
Mile,Sprint,Simon,Doe,19,Male,15.21
Mile,Long,Jane,Joy,37,Female,35.12
Mile,Long,Victoria,K,37,Female,27.31
Mile,Long,Chris,Li,25,Male,23.43
;
run;
/* Sorting based on desired parameters - event EventType age Gender */
proc sort data = input_dataset;
by event EventType age Gender score;
run;
/*Picking the lowest five scores based on above parameters */
data input_dataset_1(drop=num);
set input_dataset;
retain num;
by event EventType age Gender score;
if first.gender then num=1 ; else num=num+1;
if num<=5;
run;