Search code examples
sassas-macro

Conditionally replace column values with column name in SAS dataset


I have a SAS dataset as follow :

Key    A    B    C    D    E
001    1    .    1    .    1
002    .    1    .    1    .

Other than keeping the existing varaibales, I want to replace variable value with the variable name if variable A has value 1 then new variable should have value A else blank.

Currently I am hardcoding the values, does anyone has a better solution?


Solution

  • The following should do the trick (the first dstep sets up the example):-

    data test_data;
      length key A B C D E 3;
      format key z3.;  **  Force leading zeroes for KEY;
      key=001; A=1; B=.; C=1; D=.; E=1; output;
      key=002; A=.; B=1; C=.; D=1; E=.; output;
    proc sort;
      by key;
    run;
    
    data results(drop = _: i);
      set test_data(rename=(A=_A B=_B C=_C D=_D E=_E));
    
      array from_vars[*] _:;
      array to_vars[*] $1 A B C D E;
    
      do i=1 to dim(from_vars);
        to_vars[i] = ifc( from_vars[i], substr(vname(from_vars[i]),2), '');
      end;
    run;
    

    It all looks a little awkward as we have to rename the original (assumed numeric) variables to then create same-named character variables that can hold values 'A', 'B', etc.

    If your 'real' data has many more variables, the renaming can be laborious so you might find a double proc transpose more useful:-

    proc transpose data = test_data out = test_data_tran;
      by key;
    proc transpose data = test_data_tran out = results2(drop = _:);
      by key;
      var _name_;
      id _name_;
      where col1;
    run;
    

    However, your variables will be in the wrong order on the output dataset and will be of length $8 rather than $1 which can be a waste of space. If either points are important (they rsldom are) and both can be remedied by following up with a length statement in a subsequent datastep:-

    option varlenchk = nowarn;
      data results2;
        length A B C D E $1;
        set results2;
      run;
    option varlenchk = warn;
    

    This organises the variables in the right order and minimises their length. Still, you're now hard-coding your variable names which means you might as well have just stuck with the original array approach.