Search code examples
sasrankproc

Is there a way to name proc rank groups based on values within the group?


So I have multiple continuous variables that I have used proc rank to divide into 10 groups, ie for each observation there is now a "GPA" and a "GRP_GPA" value, ditto for Hmwrk_Hrs and GRP_Hmwrk_Hrs. But for each of the new group columns the values are between 1 - 10. Is there a way to change that value so that rather than 1 for instance it would be 1.2-2.8 if those were the min and max values within the group? I know I can do it by hand using proc format or if then or case in sql but since I have something like 40 different columns that would be very time intensive.


Solution

  • It's not clear from your question if you want to store the min-max values or just format the rank columns with them. My solution below formats the rank column and utilises the ability of SAS to create formats from a dataset. I've obviously only used 1 variable to rank, for your data it will be a simple matter to wrap a macro around the code and run for each of your 40 or so variables. Hope this helps.

    /* create ranked dataset */
    proc rank data=sashelp.steel groups=10 out=want;
    var steel;
    ranks steel_rank;
    run;
    
    /* calculate minimum and maximum values per rank */
    proc summary data=want nway;
    class steel_rank;
    var steel;
    output out=want_min_max (drop=_:) min= max= / autoname;
    run;
    
    /* create dataset with formatted values */
    data steel_rank_fmt;
    set want_min_max (rename=(steel_rank=start));
    retain fmtname 'stl_fmt' type 'N';
    label=catx('-',steel_min,steel_max);
    run;
    
    /* create format from previous dataset */
    proc format cntlin=steel_rank_fmt;
    run;
    
    /* apply formatted value to rank column */
    proc datasets lib=work nodetails nolist;
    modify want;
    format steel_rank stl_fmt10.;
    quit;