Can you define a manual order of columns via PROC FREQ?


I am new to SAS, but I am preparing a lecture that covers some SAS basics. I want to show how to create a bar chart in SAS (version 9.4). For this purpose, I use the "heart" example data set (sashelp.heart).

If I create a bar chart via PROC FREQ for the variable "Smoking_Status", I get an ordering of the bars that is not useful.

proc freq data=sashelp.heart;
  tables Smoking_Status / plots = freqplot;
run;

Bar chart via FREQ

Is it possible to order the bars manually in PROC FREQ, maybe via the ORDER= option? I could not see how. I want the graph to keep readable labels; I do not want to prefix them with "1-" or "2-" just to force an alphabetical order, because that makes the resulting plot harder to read.

I can achieve the desired result with SGPLOT.

proc sgplot data=sashelp.heart;
  vbar Smoking_Status;  
  title "Distribution of smoking_status from sashelp.heart dataset";
  xaxis display=(nolabel) values=('Non-smoker' 'Light (1-5)' 'Moderate (6-15)' 
    'Heavy (16-25)' 'Very Heavy (> 25)');
run;

Bar chart via SGPLOT

My question is: Is there also a way to achieve this result via PROC FREQ? I want to be sure whether this is the case or not because I want to show the easiest and most straightforward solution to this problem.


Solution

  • Yes, you can do this via the ORDER= option of PROC FREQ. The default is ORDER=INTERNAL, which sorts by the unformatted values, so character values appear alphabetically. It is often more flexible to convert the grouping variable to numeric and apply a format that supplies the desired labels. Alternatively, you could use ORDER=DATA and add dummy records in the desired order at the top of the data set, for example with zero weight so that they do not affect the counts.

    Example SAS code for both approaches (you only need to do one):

    /* Formats for option 1, can also generate these from the data. */
    proc format;
       invalue smoke_num
          "Non-smoker" = 1
          "Light (1-5)" = 2
          "Moderate (6-15)" = 3
          "Heavy (16-25)" = 4
          "Very Heavy (> 25)" = 5;
    
       value smoke_label
          1 = "Non-smoker"
          2 = "Light (1-5)"
          3 = "Moderate (6-15)"
          4 = "Heavy (16-25)"
          5 = "Very Heavy (> 25)";
    run;
    
    
    
    data want;
       /* Option 2: add dummy records with zero weight for ORDER=DATA. */
       length Smoking_Status $17;
       if _N_ eq 1 then do;
          wt = 0;
          do smoking_status = "Non-smoker",
                              "Light (1-5)",
                              "Moderate (6-15)",
                              "Heavy (16-25)",
                              "Very Heavy (> 25)";
             output;
          end;
       end;
       wt = 1;
    
       set sashelp.heart;
       /* Option 1: set order through formatted numeric variable. */
       smokeN = input(smoking_status, smoke_num.);
       format smokeN smoke_label.;
    
       output; /* Needed for option 2: the explicit OUTPUT in the loop above turns off the implicit output at the end of the step. */
    run;
    
    /* Option 1. */
    proc freq data=want order=internal;
       tables smoken / plots=freqplot;
       format smoken smoke_label.; /* Already applied above as well. */
       weight wt; /* Only needed to exclude dummy records if they are present. */
    run;
    
    
    /* Option 2. */
    proc freq data=want order=data;
       tables smoking_status / plots=freqplot;
       weight wt;
    run;
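
    The comment in the formats block notes that the formats can also be generated from data. A minimal sketch of that idea, assuming PROC FORMAT's CNTLIN= option and a hypothetical control data set named smoke_cntlin (the names smoke_cntlin, level and level_no are illustrative and not part of the code above), might look like this. It builds the same smoke_num informat and smoke_label format from a single ordered list of levels, so the order is typed only once:

    ```sas
    /* Sketch: build the informat and format from one ordered list of levels. */
    /* smoke_cntlin, level and level_no are hypothetical names.               */
    data smoke_cntlin;
       length fmtname $32 type $1 start label level $17;
       keep fmtname type start label;
       do level = "Non-smoker",
                  "Light (1-5)",
                  "Moderate (6-15)",
                  "Heavy (16-25)",
                  "Very Heavy (> 25)";
          level_no + 1;                  /* Position in the list: 1, 2, ..., 5. */

          /* Numeric format: position -> label. */
          fmtname = "smoke_label";
          type    = "N";
          start   = put(level_no, 1.);
          label   = level;
          output;

          /* Numeric informat: label -> position. */
          fmtname = "smoke_num";
          type    = "I";
          start   = level;
          label   = put(level_no, 1.);
          output;
       end;
    run;

    /* CNTLIN= expects the rows of each format to be grouped together. */
    proc sort data=smoke_cntlin;
       by fmtname;
    run;

    proc format cntlin=smoke_cntlin;
    run;
    ```

    This would replace the PROC FORMAT step with the hard-coded VALUE and INVALUE statements; the DATA step and both PROC FREQ calls stay the same. The levels are still listed by hand in the DO loop here, but they could equally be read from a lookup table that stores the desired order.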