I need to do a frequency distribution of this one column data without using any proc freq; proc sql. I'm only allowed to use proc sort.
In excel I would use a simple countif, but I don't know how to do this in SAS given above contraint.
data sample_grades;
input grades $;
datalines;
C
A
A
B
B+
A-
W
A
A-
A
A-
A
B+
A-
A
B+
B+
A-
B+
;
run;
I came up with this but it stopped counting at A-
data new_dataset;
set Fall2016;
by grade;
retain grade frequency;
if grade = 'A' then frequency+1;
else if grade = 'A-' then frequency=0;
if grade = 'A-' then frequency+1;
else if grade = 'B' then frequency=0;
if grade = 'B' then frequency+1;
else if grade = 'B+' then frequency=0;
if grade = 'B+' then frequency+1;
else if grade = 'B-' then frequency=0;
if grade = 'B-' then frequency+1;
else if grade = 'C' then frequency=0;
if grade = 'C' then frequency+1;
else if grade = 'W' then frequency=0;
if grade = 'W' then frequency+1;
else frequency+0;
if last.grade then do;
frequency+0;
end;
run;
Ultimately I'm looking to a simple table like this: enter image description here
It helps to think of data steps as loops, that run through the input dataset and pick up values as they go. I was going to explain how your attempt worked in that respect, but it quickly became confusing. Here's my attempt at the problem:
data sample_grades;
input grades $;
datalines;
C
A
A
B
B+
A-
W
A
A-
A
A-
A
B+
A-
A
B+
B+
A-
B+
;
run;
Sort the data by grades first, so that BY-GROUP processing can take place:
proc sort data=sample_grades;
by grades;
run;
Now set up your data step as follows:
data new_dataset;
set sample_grades;
by grades;
/* If it's the first of the grades then set the frequency to zero */
if first.grades then frequency=0;
/* Increment the frequency value regardless of the value of grades */
frequency+1;
/* When the last of the grades values is found, output. This gives the total frequency for the grade in the output table */
if last.grades then output;
run;