Search code examples
sassas-macroproc-sql

PROC SQL MERGE MISMATCH


ATTACHED SCREENSHOT OF DESIRED OUTPUTthe required condition is "SUBJECT in A = SUBJECT in B and VISIT in A NE(not equal to) VISIT in B"

I would like to find the exact mismatch and missing VISIT from the below Tables A and B by using Proc SQL procedure, Can anyone help me please?

Table A

SUBJECT Test    VISIT
1001    ABCB    1
1001    ABCD    2
1001    ABCD    3
1001    ABCD    5

Table B

SUBJECT Test    VISIT1
1001    ABCD    2
1001    ABCD    1
1001    ABCD    4

Expected output:

SUBJECT Test    VISIT   VISIT1
1001    ABCD    3
1001    ABCD    5    
1001    ABCD            4

VISIT 3 AND 5 IS PRESENT IN DATASET A NOT IN B AND VISIT 4 IS PRESENT IN DATASET2 NOT IN DATASET A , LIKE WISE CODE FOR DATASET-

DATA A;
  LENGTH SUBJECT 8 Test $10 visit 8;
  INPUT SUBJECT Test $ visit ;
  DATALINES;
  1001 ABCD 1
  1001 ABCD 2
  1001 ABCD 3
  1001 ABCD 5
;
RUN;

DATA B;
  LENGTH SUBJECT 8 Test $10 visit1 8;
  INPUT SUBJECT Test $ visit1 ;
  DATALINES;
  1001 ABCD 2
  1001 ABCD 1
  1001 ABCD 4
;
RUN;

Thanks in advance!

the code i tried is below (but not working as expected)-

    ****************(VISIT ) in A and not in B****;
proc sql;
create table SS1 as
select distinct a.* FROM
A a where a.visit not in(select s.visit1 from B s WHERE A.SUBJECT = S.SUBJECT );

create  table   INRAVE as
select * from SS1 A
left join 
B B
on a.subject=b.SUBJECT  and a.VISIT NE b.VISIT1
where   b.SUBJECT is not  null  
;
quit;
****************VISIT in B and not in A****;
proc sql;
create table SS2 as
select distinct a.* from 
B a where a.VISIT1 not in(select S.VISIT from A s WHERE A.SUBJECT = S.SUBJECT );

create  table   INVENDOR as
select * from SS2 A
left join 
A B
on a.subject=b.SUBJECT  and a.VISIT1 NE b.VISIT 
where   b.SUBJECT is not  null  
;
quit;

data ALL;;
set inrave invendor;
where subject=subject ;
RUN;

Solution

  • Seems you know SQL very well, why not try union all, just like this:

    proc sql noprint;
        create table C as 
        select *, 'A' as Source from A
        where catx('@',SUBJECT,Test,visit) not in (
            select distinct catx('@',SUBJECT,Test,visit1) from B 
        )
        union all corr 
        select *, 'B' as Source from B(rename=VISIT1=VISIT)
        where catx('@',SUBJECT,Test,visit) not in (
            select distinct catx('@',SUBJECT,Test,visit) from A 
        )
        ;
    
        create table D(drop=TmpVISIT Source) as 
        select *,
            case when Source = 'B' then . else TmpVISIT end as VISIT,
            case when Source = 'B' then TmpVISIT else . end as VISIT1
        from C(rename=VISIT=TmpVISIT);
    quit;
    

    I get all obs from dataset A where not repeat in dataset B and do the oppsite with dataset B.
    Well, I also get another solution, which is shorter:

    proc sql noprint;
        select catx('@',SUBJECT,Test,visit) into :Ununique separated by '" "' from (
            select * from A union all select * from B(rename=visit1=visit)
        )
        group by SUBJECT, Test, visit
        having count(*) > 1;
    quit;
    
    data D;
        set A B;
        if catx('@',SUBJECT,Test,coalesce(visit1,visit)) in ("&Ununique") then delete;
    run;
    

    Whereas, this method is limited by the max lenth of macro variable.