Search code examples
sqlsassumnew-operatorproc

How can I create a new variable which calculates sum of a specific variable (by ID) containing multiple observation in SAS?


For example, I want to create a new dataset (Data2) from Data1.

A new variable, cost in data2 is calculated as sum of multiple observation by ID in material of data1.

(Data1)

ID  material
1   4
1   4
1   4
2   2
2   4
2   4
3   2
3   6
3   6
4   5
4   5
4   5
4   5
5   2
5   4
5   4
5   8

(Data2)

ID  cost    
1   12  #4+4+4
2   10  #2+4+4
3   14  #2+6+6
4   20  #5+5+5+5
5   18  #2+4+4+8

I have used SAS EG version only for simple analysis, and recently I started to use proc sql procedure. As a beginner in SAS coding (proc sql), it was very hard to approach the answer, for myself. Thank you very much, in advance.


Solution

  • If you want to use PROC SQL, this is a straight forward use of GROUP BY

    proc sql;
      select id, sum(material) as sum from mydataset group by id;
      quit;
    
    

    You could manually compute this in a datastep also if you don't want to use PROC SQL

    proc sort data=mydataset;
      by id;
    run;
    
    data sums;
      set mydataset;
      by id;
    
      if first.id then sum = 0;
      sum + material;
      if last.id then output;
    
      keep id sum;
    
    run;
    
    proc print data=sums;
    run;