I would like to create a new variable 'Weight' and 'Height' using the variables A through D:
DATA:
A B C D
Jim Weight 180 Screen
Jim Weight 200 C1
Jim Height 60 Screen
Jim Height 61 C3
Tod Weight 190 Screen
Tod Weight 201 C1
Tod Height 70 Screen
Tod Height C1
The Weight variable would have the following criteria: Set Weight to Column C if Column B = Weight, Column D = C1 and Column C is not missing. Else if Column D is not C1 or Column C is missing, then use the Column C where Column D is Screen. So in plain terms, let's say Jim was weighed during screening and not for C1 then I would like to keep his screen weight. Or if he was screened at C1 but the wight is missing, I would like to keep his screen weight. Similarly for the Height variable.
My code which is incorrect is:
DATA MYTEST;
SET TEST.TEST;
if B = 'WEIGHT' and D = 'C1D1' and not missing(C) then NEW = C;
else if (missing(C) or D ~= 'C1') and B = 'WEIGHT' then WEIGHT = C where D = 'Screen';
if B = 'HEIGHT' and D = 'C1D1' and not missing(C) then NEW = C;
else if (missing(C) or D ~= 'C1') and B = 'HEIGHT' then WEIGHT = C where D = 'Screen';
else WEIGHT = 'NA';
RUN;
PROC PRINT DATA = MYTEST;
RUN;
Desired Outcome:
DATA:
A Weight Height
Jim 200 60
Tod 201 70
The result data can be created using an update
statement, applied to the data after transposition. UPDATE
is different than MERGE
in this way -- missing values in the update data set will never overwrite an existing value in PDV.
DATA have;
input
a $ b $ c d $; datalines;
Jim Weight 180 Screen
Jim Weight 200 C1
Jim Height 60 Screen
Jim Height 61 C3
Tod Weight 190 Screen
Tod Weight 201 C1
Tod Height 70 Screen
Tod Height . C1
run;
proc transpose data=have out=haveT;
by a d notsorted;
var c;
id b;
run;
data haveScreen / view=haveScreen;
set haveT;
where d='Screen';
by a;
if first.a;
run;
data want;
update
haveScreen
haveT (where=(d in ('Screen', 'C1')))
;
by a;
run;
Your posted code is not using WHERE correctly. Where clauses are not conditionally applied and also can not be changed dynamically while the data step is running. A where clause is applied at 'run initialization` time. The where clause is a non-executable/unconditional statement and the last occurring one in data step code is the one that will be applied when the step runs.
For example, in the following, the if 0
is never true yet the where
is applied regardless.
options msglevel=i;
data _null_;
set sashelp.class;
if name =: 'X' then where age > 12;
if 0 then where age > 1;
run;
----- LOG -----
4625 options msglevel=i;
4626 data _null_;
4627 set sashelp.class;
4628 if name =: 'X' then where age > 12;
4629 if 0 then where age > 1;
NOTE: WHERE clause has been replaced.
4630 run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
WHERE age>1;