Search code examples
sasproc-sql

PROC SQL: Warning variable already exists on multiple dataset join


I have this data check integrity code for an oncology study I'm working on. This is ostensibly to confirm TU,TR and RS are consistent.

 proc sql ;
 create table tu_tr_rs as 
 select tu.*,tr.*,rs.*
 from trans.tu  as tu
 left join trans.tr as tr on tu.usubjid=tr.usubjid and tu.TULNKID
 =tr.TRLNKID and tu.tudtc=tr.trdtc
 left join trans.rs as rs on tr.usubjid=rs.usubjid and tr.trdtc=
 rs.rsdtc
 ;
 quit;

However, when I run this code I get the warning

"Variable XXXX already exists on file WORK.TU_TR_RS."

When I add the feedback option to PROC SQL to get a more granular look I get this

enter image description here

So I know if it's one variable that brings this warning up you can use a rename/DROP combination to work around it but for this case is it just the case that I have to explicitly state the variables for each dataset in the select statement or is there something fundamentally wrong with the code?


Solution

  • Yes, if you want to select columns with the same name from 2 (or more) data sets, you simply need to select them explicitly and give them distinct names. Something like this:

    create table tu_tr_rs as 
     select
        tu.ColA as tu_ColA
        ,tu.ColB as tu_ColB
        /* etc */
        ,tr.ColA as tr_ColA
        ,tr.ColB as tr_ColB
        /* etc */
        ,rs.ColA as rs_ColA
        ,rs.ColB as rs_ColB
        /* etc */
     from trans.tu  as tu
    /* etc */