Search code examples
sassummary

How to use proc summary and keep all variables (without naming them)


I want to sum over a specific variable in my dataset, without loosing all the other columns. I have tried the following code:

proc summary data=work.test nway missing; 
class var_1 var_2 ; *groups;
var salary; 
id _character_ _numeric_; * keeps all variables;
output out=test2(drop=_:) sum= ;
run;

But it does not seem to sum properly, and for the "salary" column I'm just left with the value of the last value in each group (var_1 and var_2). If I remove

 id _character_ _numeric_; 

it works fine, but I loose all other columns.

Example:

data:

data salary;
   input name $ dept $ Salary Sex $;
   datalines;
John Sales 23 M
John Sales 43 M
Mary Acctng 21 F
; 

desired output:

John Sales 66 M

Mary Acctng 21 F


Solution

  • I think this does what you want. You still get warnings about name conflicts and variables being dropped but at least the ones you want are kept. The ID statement is depreciated in favor in the new and better IDGROUP output statement option.

    You could add the AUTONAME option to the output statement if you wanted PROC SUMMARY to automatically rename the conflicting variables.

    data salary;
       input name $ dept $ Salary Sex $;
       datalines;
    John Sales 23 M
    John Sales 43 M
    Mary Acctng 21 F
    ;;;;
       run; 
    proc print;
       run;
    proc summary nway missing; 
       class name dept; 
       var salary; 
       output out=test2(drop=_:) sum= idgroup(out(_all_)=);
       run;
    proc print;
       run;                                                                                                                   
    

    enter image description here