Search code examples
sasrows

Create duplicate rows in SAS and change values of variables


I have been so confused on how to implement this in SAS. I am trying to create duplicate rows if the value of "2" occurs more than once between the variables (member1 -member4). For example, if a row has the value 2 in member2, member3, and member4, then I will create 2 duplicate rows since the initial row will serve for the first variable and the duplicate rows will be for member 3 and 4. On the duplicate row for member3 for example, member 2 and 4 will be missing if their values is equal to 2. Basically the value "2" can only occur once per row. let's assume sa1 to sa4 corresponds to other variables of member1 to member4 respectively. When we create a duplicate row for each member, the other variables should be missing if they have a value of "1". For example, if the duplicate row is for member 3, then values that equal "1" for sa1, sa2 and sa4 should be set to missing. There are other variables in the dataset that will have same values for all duplicate rows as initial rows. Duplicate rows will also have a suffix for the ID to indicate the parent rows.

This is an example of the data I have

id  member1 member2 member3 member4 sa1 sa2  sa3  sa4
 1     0      2       2       0      0   1    1    0
 2     2      2       0       5      .   1    0    0
 3     2      2       3       2      1   1    0    1

Then this is the output I am trying to achieve

id  member1 member2 member3 member4 sa1 sa2  sa3  sa4
 1     0      2       .       0      0   1    .    0
 1_1   0      .       2       0      0   .    1    0
 2     2      .       0       5      .   .    0    0
 2_1   .      2       0       5      .   1    0    0
 3     2      .       3       .      1   .    0    .
 3_1   .      2       3       .      .   1    0    .
 3_2   .     .        3       2      .   .    0     1

Will appreciate any help. Thank you!


Solution

  • You need to count the number of '2's. You also need to remember where they used to be. "I had the spots removed for good luck, but I remember where the spots formerly were."

    data have ;
      input id :$10. member1 member2 member3 member4 sa1 sa2  sa3  sa4 ;
    cards;
     1 0 2 2 0 0 1 1 0
     2 2 2 0 5 . 1 0 0
     3 2 2 3 2 1 1 0 1
     4 2 0 0 0 . . . .
     5 0 0 0 0 . . . . 
    ;
    
    data want ;
      set have ;
      array m member1-member4 ;
      array x [4] _temporary_;
      do index=1 to dim(m);
        x[index]=m[index]=2;
      end;
      n2 = sum(of x[*]);
      if n2<2 then output;
      else do counter=1 to n2;
        id=scan(id,1,'_');
        if counter > 1 then id=catx('_',id,counter-1);
        counter2=0;
        do index=1 to dim(m);
           if x[index] then do;
              counter2+1;
              if counter = counter2 then m[index]=2;
              else m[index]=.;
           end;
        end;
        output;
      end;
      drop index n2 counter counter2;
    run;
    

    Results

    Obs    id     member1    member2    member3    member4    sa1    sa2    sa3    sa4
    
     1     1         0          2          .          0        0      1      1      0
     2     1_1       0          .          2          0        0      1      1      0
     3     2         2          .          0          5        .      1      0      0
     4     2_1       .          2          0          5        .      1      0      0
     5     3         2          .          3          .        1      1      0      1
     6     3_1       .          2          3          .        1      1      0      1
     7     3_2       .          .          3          2        1      1      0      1
     8     4         2          0          0          0        .      .      .      .
     9     5         0          0          0          0        .      .      .      .