i have my dataset which looks like below:
Customer Sales
1 15
2 14
3 13
4 11
5 12
6 18
7 21
I need to sort customers by % Sales and then assign them into "high", "medium", "low" buckets....
Customer Sales %Sales
7 21 20%
6 18 17%
1 15 14%
2 14 13%
3 13 13%
5 12 12%
4 11 11%
And the buckets need to be based on cumulative frequency:
Customer Sales %Sales CumFreq Bucket
7 21 20% 20% High
6 18 17% 38% Medium
1 15 14% 52% Medium
2 14 13% 65% Medium
3 13 13% 78% Low
5 12 12% 89% Low
4 11 11% 100% Low
So, as you can see anyone in top 33% of sales will be a high bucket, mid 33% will be a medium bucket and bottom 33% will be a low bucket
So you need to sum the column, sort the data set, then calculate the cumulative percent. Use a custom format to create your Bucket column.
data sales;
input Customer Sales;
datalines;
1 15
2 14
3 13
4 11
5 12
6 18
7 21
;
run;
proc sort data=sales;
by descending sales ;
run;
proc sql noprint;
select sum(sales) format=best32. into :s from sales;
quit;
proc format;
value pctSales
0-.33='High'
.33-.67='Medium'
.67-1='Low';
run;
data sales;
set sales;
retain total 0;
format pctSales percent8.2;
total = total + sales;
pctSales = total/&s;
bucket = put(pctSales,pctSales.);
drop total;
run;