Search code examples
sascumulative-frequency

cumulative frequency in SAS Dataset


i have my dataset which looks like below:

 Customer Sales
        1    15
        2    14
        3    13
        4    11
        5    12
        6    18
        7    21

I need to sort customers by % Sales and then assign them into "high", "medium", "low" buckets....

 Customer Sales  %Sales
        7    21     20%
        6    18     17%
        1    15     14%
        2    14     13%
        3    13     13%
        5    12     12%
        4    11     11%

And the buckets need to be based on cumulative frequency:

 Customer Sales %Sales CumFreq Bucket
        7    21    20%     20%   High
        6    18    17%     38% Medium
        1    15    14%     52% Medium
        2    14    13%     65% Medium
        3    13    13%     78%    Low
        5    12    12%     89%    Low
        4    11    11%    100%    Low

So, as you can see anyone in top 33% of sales will be a high bucket, mid 33% will be a medium bucket and bottom 33% will be a low bucket


Solution

  • So you need to sum the column, sort the data set, then calculate the cumulative percent. Use a custom format to create your Bucket column.

    data sales;
    input  Customer Sales;
    datalines;
            1    15
            2    14
            3    13
            4    11
            5    12
            6    18
            7    21
    ;
    run;
    
    proc sort data=sales;
    by descending sales ;
    run;
    
    proc sql noprint;
    select sum(sales) format=best32. into :s from sales;
    quit;
    
    proc format;
    value pctSales
        0-.33='High'
        .33-.67='Medium'
        .67-1='Low';
    run;
    
    data sales;
    set sales;
    retain total 0;
    format pctSales percent8.2;
    total = total + sales;
    pctSales = total/&s;
    bucket = put(pctSales,pctSales.);
    drop total;
    run;