I am trying to join two tables, weight 1 and weight2. I want to join these two tables on ID1=ID2 first. if there is a missing ID, I will join on DOB1=DOB2. I want the program to compare each row in the table. but sas saw the first ID is missing in weight2 table, then it completely switch to the else statement, and use DOB1=DOB2 to join.
My Ideal result is to have four rows joined. but now sas only gives me three rows.
data weight1;
input ID1 $ Week1 DOB1;
datalines;
1111 195 101
2222 220 102
3333 173 103
4444 135 104
;
proc print data=weight1;
run;
data weight2;
infile datalines missover;
input ID2 $ Week2 DOB2;
datalines;
195 101
2222 220 102
3333 173 103
4444 135 104
;
proc print data=weight2;
run;
options mlogic symbolgen mprint;
%macro test ;
proc sql;
create table final as
select a.ID1, a.DOB1, b.ID2,b.DOB2
from weight1 a
%if %sysfunc(exist(b.ID2)) %then
inner join weight2 b
on a.ID1 = b.ID2;
%else
/*%if %sysfunc(not exist(b.IDnumber))*/
inner join weight2 b
on a.DOB1 = b.DOB2
;
;
quit;
%mend test;
%test
Sas log:
LOGIC(TEST): Beginning execution.
MPRINT(TEST): proc sql;
MLOGIC(TEST): %IF condition %sysfunc(exist(b.ID2)) is FALSE
MPRINT(TEST): create table final as select a.ID1, a.DOB1, b.ID2,b.DOB2 from weight1 a inner join weight2 b on a.DOB1 = b.DOB2 ;
NOTE: Table WORK.FINAL created, with 3 rows and 4 columns.
this is my ideal result
ID1 DOB1 ID2 DOB2
1111 101 101
2222 102 2222 102
3333 103 3333 103
4444 104 4444 104
A case
expression can be used as join criteria.
Do not use macro. Your use of %sysfunc(exist
is just wrong, the exist
function checks for the existence of a data set, and %sysfunc
is performed pre source code compilation time, not at SQL execution time.
Use a case expression to evaluate the pre-conditions for your equality check.
Example:
data weight1;
input ID1 $ Week1 DOB1;
datalines;
1111 195 101
2222 220 102
3333 173 103
4444 135 104
;
data weight2;
infile datalines missover;
input ID2 $ Week2 DOB2;
datalines;
. 195 101
2222 220 102
3333 173 103
4444 135 104
;
proc sql;
create table want as
select a.ID1, a.DOB1, b.ID2,b.DOB2
from weight1 a
join weight2 b
on case
when (a.id1 is not missing and b.id2 is not missing) then a.id1=b.id2
when (a.dob1 is not missing and b.dob2 is not missing) then a.dob1=b.dob2
else 0
end
;
NOTE: The case of real data where you have more than one row in each table with missing id and the same date of birth will cause a multiplicative row count effect in the result table. Such as this sample data:
1111 195 101
2222 220 102
3333 173 103
4444 135 104
. 145 105 ***
. 175 105 ***
and
. 195 101
2222 220 102
3333 173 103
4444 135 104
. 155 105 ***
. 166 105 ***
4 = 2 x 2 result rows from the starred data.