Search code examples
variablesgroup-bysasaggregate

SAS collapse dates


I have a dataset which looks like this:

cust date 1 2 3... 600
1    1    5 . . ... .
1    2    5 . . ... .
1    2    . 4 . ... .
1    2    . . 6 ... .
2    1    1 . . ... .
2    1    . 5 . ... .
2    2    . . . ... 10

I want to collapse variables 1 to 600 for each date by customer (cust), so that the dataset looks like this:

cust date 1 2 3... 600
1    1    5 . . ... .
1    2    5 4 6 ... .
2    1    1 5 . ... .
2    2    . . . ... 10

I started with the following code (maybe it's a bit complicated), and it doesn't work:

data want ;
set have;
array vars &list.; *stored array of variables 1-600;
retain count vars;
by cust date;
if first.date then do;
do _i=1 to dim(vars);
vars[_i]=.; 
end;
count=0;
end;
count=count+1;
vars[_1]=vars;
if last.date then do;
output;
end;
drop count;
run;

Do you have any idea? Another idea was to use proc expand, but it doesn't work either because the dates are duplicates.

Thanks so much for your help!!


Solution

  • There's a neat trick to achieve this using the UPDATE statement. The first reference to the existing table (with the obs=0) creates an empty table with the required structure, the second reference updates with the values. The BY statement ensures it only outputs one record per BY value. Hope this makes sense.

    data have;
    input cust date v1 v2 v3 v600;
    datalines;
    1    1    5 . . .
    1    2    5 . . .
    1    2    . 4 . .
    1    2    . . 6 .
    2    1    1 . . .
    2    1    . 5 . .
    2    2    . . . 10
    ;
    run;
    
    data want;
    update have (obs=0) have;
    by cust date;
    run;