I have a SAS dataset as follow :
Key A B C D E
001 1 . 1 . 1
002 . 1 . 1 .
Other than keeping the existing varaibales, I want to replace variable value with the variable name if variable A has value 1 then new variable should have value A else blank.
Currently I am hardcoding the values, does anyone has a better solution?
The following should do the trick (the first dstep sets up the example):-
data test_data;
length key A B C D E 3;
format key z3.; ** Force leading zeroes for KEY;
key=001; A=1; B=.; C=1; D=.; E=1; output;
key=002; A=.; B=1; C=.; D=1; E=.; output;
proc sort;
by key;
run;
data results(drop = _: i);
set test_data(rename=(A=_A B=_B C=_C D=_D E=_E));
array from_vars[*] _:;
array to_vars[*] $1 A B C D E;
do i=1 to dim(from_vars);
to_vars[i] = ifc( from_vars[i], substr(vname(from_vars[i]),2), '');
end;
run;
It all looks a little awkward as we have to rename the original (assumed numeric) variables to then create same-named character variables that can hold values 'A', 'B', etc.
If your 'real' data has many more variables, the renaming can be laborious so you might find a double proc transpose more useful:-
proc transpose data = test_data out = test_data_tran;
by key;
proc transpose data = test_data_tran out = results2(drop = _:);
by key;
var _name_;
id _name_;
where col1;
run;
However, your variables will be in the wrong order on the output dataset and will be of length $8 rather than $1 which can be a waste of space. If either points are important (they rsldom are) and both can be remedied by following up with a length
statement in a subsequent datastep:-
option varlenchk = nowarn;
data results2;
length A B C D E $1;
set results2;
run;
option varlenchk = warn;
This organises the variables in the right order and minimises their length. Still, you're now hard-coding your variable names which means you might as well have just stuck with the original array approach.