Search code examples
sas

Average over number of variables where number of variables is dictated by separate column


I would like to create a new column whose values equal the average of values in other columns. But the number of columns I am taking the average of is dictated by a variable. My data look like this, with 'length' dictating the number of columns x1-x5 that I want to average:

data have;
    input ID $ length x1 x2 x3 x4 x5;
    datalines;
A 5 8 234 79 36 78
B 4 8 26 589 3 54
C 3 19 892 764 89 43
D 5 72 48 65 4 9
;
run;

I would like to end up with the below where 'avg' is the average of the specified columns.

data want;
    input ID $ length avg
    datalines;
A 5 87
B 4 156.5
C 3 558.3
D 5 39.6
;
run;

Any suggestions? Thanks! Sorry about the awful title, I did my best.


Solution

  • You have to do a little more work since mean(of x[1]-x[length]) is not valid syntax. Instead, save the values to a temporary array and take the mean of it, then reset it at each row. For example:

    tmp1 tmp2 tmp3 tmp4 tmp5
    8    234  79   36   78
    8    26   589  3    .
    19   892  764  .    . 
    72   48   65   4    9
    
    data want;
        set have;
        array x[*] x:;
        array tmp[5] _temporary_;
        
        /* Reset the temp array */
        call missing(of tmp[*]);
    
        /* Save each value of x to the temp array */
        do i = 1 to length;
            tmp[i] = x[i];
        end;
    
        /* Get the average of the non-missing values in the temp array */
        avg = mean(of tmp[*]);
    
        drop i;
    run;