Search code examples
sasproc-sqldatastep

replicating a sql function in sas datastep


Hi another quick question

in proc sql we have on which is used for conditional join is there something similar for sas data step

for example

proc sql;
....
data1 left join data2
on first<value<last
quit;

can we replicate this in sas datastep

like

data work.combined
  set data1(in=a) data2(in=b)

   if a then output;
run;

Solution

  • Yes there is a simple (but subtle) way in just 7 lines of code.

    What you intend to achieve is intrinsically a conditional Cartesian join which can be done by a do-looped set statement. The following code use the test dataset from Dmitry and a modified version of the code in the appendix of SUGI Paper 249-30

    data data1;
        input first last;
    datalines;
    1 3
    4 7
    6 9
    ;
    run;
    
    data data2;
        input value;
    datalines;
    2
    5
    6
    7
    ;
    run;
    
    /***** by data step looped SET *****/
    DATA CART_data; 
        SET data1; 
        DO i=1 TO NN; /*NN can be referenced before set*/
            SET data2 point=i nobs=NN; /*point=i - random access*/
            if first<value<last then OUTPUT; /*conditional output*/
        END; 
    RUN;
    
    /***** by SQL *****/
    proc sql;
        create table cart_SQL as 
        select * from data1 
        left join data2
            on first<value<last;
    quit;
    

    One can easily see that the results coincide.

    Also note that from SAS 9.2 documentation: "At compilation time, SAS reads the descriptor portion of each data set and assigns the value of the NOBS= variable automatically. Thus, you CAN refer to the NOBS= variable BEFORE the SET statement. The variable is available in the DATA step but is not added to any output data set."