Search code examples
sqlsassas-macro

Using macros in SAS SQL


I am wondering of I can consolidate some code into one PROC SQL statement instead of several back-to-back SQL statements.

I am using macro to calculate the age of a participant at date of consent for a study, similar to what is provided here and here. I tried to troubleshoot the functionality of macros, and I calculated the total weight of a baby at birth in ounces using the %sum macro (which works great...). However, when trying to calculate age, the macros don't work.

However, if I use the macros in a new SQL statement, it works fine.

The code below works:

%macro months(somedate,birth);
intck('month',&birth,&somedate) 
         - (day(&somedate) < day(&birth))
%mend months;

%macro days(somedate,birth);
intck('day',&birth,&somedate) 
         - (day(&somedate) < day(&birth))
%mend days;

%macro sum(part1, part2);
&part1*16 + &part2
%mend sum;


**********  bringing in data from outside tables  ;
proc sql;
  create table demos as
    select      x.*,    infcondt2 as c_dt, 
                y.*,    datepart(visitdt)   as v_dt format date9. ,
                        datepart(birthdt)   as b_dt format date9. ,
                        birthweightlbs as lbs,
                        birthweightoz as oz,

                        lbs*16 + oz as tot_oz,
                        %sum(lbs,oz) as tot_oz_m

    from    enrolled as x, 
            demographics as y

    where x.center = y.center and x.id = y.id  ;
    quit;

**********  calculating age in months and in days  ;
proc sql;
create table demos2 as
select  * ,
        %months(c_dt, b_dt) as age_m ,
        %days(c_dt, b_dt) as age_d 
from demos;
quit;


**********  creating age groupings by months: 0-3 and 3-6  ;
proc sql;
create table demos3 as
select  * ,
        case
          when age_m le 3 then 1
          when age_m le 6 and age_m gt 3 then 2
          else 3
          end as age_interval
from demos2;
quit;

Is there any way to consolidate it into a single statement? Something like:

proc sql;
  create table demos as
    select      x.*,    infcondt2 as c_dt, 
                y.*,    datepart(visitdt)   as v_dt format date9. ,
                        datepart(birthdt)   as b_dt format date9. ,
                        birthweightlbs as lbs,
                        birthweightoz as oz,

                        lbs*16 + oz as tot_oz,
                        %sum(lbs,oz) as tot_oz_m,
                        %months(c_dt, b_dt) as age_m,
                        %days(c_dt, b_dt) as age_d, 
                        case
                            when age_m le 3 then 1
                            when age_m le 6 and age_m gt 3 then 2
                            else 3
                            end as age_interval

    from    enrolled as x, 
            demographics as y

    where x.center = y.center and x.id = y.id  ;
    quit;

Solution

  • If you want to use a field that was previously created in the same SQL statement, you need to use the calculated keyword. IE:

    proc sql;
    select age*2 as double_Age, calculated double_age/2 as normal_age from sashelp.class;
    quit;
    

    Calculated is only needed when there is an actual calculation - ie, for b_Dt. c_dt is only a rename of infcondt2, so you can use c_dt (or infcondt2) interchangeably, and cannot use CALCUALTED.

    data test;
    input dt1 :date9. dt2 :date9.;
    datalines;
    01JAN2010 01FEB2011
    01DEC2011 03FEB2012
    ;;;;
    run;
    
    %macro months(somedate,birth);
    intck('month',&birth,&somedate) - (day(&somedate) < day(&birth))
    %mend months;
    
    proc sql;
    create table test2 as
    select dt1 -1 as b_dt, dt2 as c_dt, %months(calculated b_dt, c_dt) as third
    from test;
    quit;
    

    That said, you no longer need to adjust for day-of-month if you have 9.2 or sooner - look at the documnentation for INTCK. There is an optional argument (in 9.3 it's called METHOD, I think 9.2 calls it something different) that allows you to force it to use a continuous month-concept rather than a discrete concept (counting first-of-the-months is discrete, for example, as the default used to be).

    Also, I don't understand the point of the DAY macro - not only are DAYs integers (so you can just subtract the two numbers using normal subtraction), but why are you subtracting the day/day like in %month? That's to correct for the part-of-month, as I just discussed, and is not needed for days (it would yield a wrong answer where somedate < birth).

    Example of correct code:

    %macro months(somedate,birth);
    intck('month',&birth,&somedate) 
             - (day(&somedate) < day(&birth))
    %mend months;
    
    %macro days(somedate,birth);
    intck('day',&birth,&somedate) 
             - (day(&somedate) < day(&birth))
    %mend days;
    
    %macro sum(part1, part2);
    &part1*16 + &part2
    %mend sum;
    
    data enrolled;
    input
    id infcondt2 :date9.
    ;
    datalines;
    1 01JAN2011
    2 02JAN2011
    3 03MAR2011
    ;;;;
    run;
    data demographics;
    input
    id
    birthweightlbs
    birthweightoz
    birthdt :datetime17.
    ;
    datalines;
    1 8 14 04MAR2011:15:13:14
    2 7 13 05MAR2011:15:13:14
    3 6 15 06MAR2011:15:13:14
    ;;;;
    run;
    
    
    proc sql;
      create table demos as
        select      x.*,    infcondt2 as c_dt, 
                    y.*,    datepart(birthdt)   as b_dt format date9. ,
                            birthweightlbs as lbs,
                            birthweightoz as oz,
                            lbs*16 + oz as tot_oz,
                            %sum(lbs,oz) as tot_oz_m,
                            %months(c_dt, calculated b_dt) as age_m,
                            c_dt - calculated b_dt as age_d, /* %days is almost certainly wrong here */
                            case
                                when calculated age_m le 3 then 1
                                when 3 le calculated age_m le 6 then 2
                                else 3
                                end as age_interval
    
        from    enrolled as x, 
                demographics as y
    
        where x.id = y.id  ;
        quit;