Search code examples
if-statementsasretain

SAS retain statement


Suppose I have a dataset with three variables:
ID   Year   Status
1   2017   Y
1   2017   N
1   2018   N
1   2018   Y
2   2017   Y
2   2017      
2   2018   N
2   2018   N

I would like to create a fourth column called NEW which has three possible values ('Yonly' 'Nonly' and 'yesno'). In the example above the output will be:
ID     Year     Status   NEW
1     2017     Y          
1     2017     N         yesno
1     2018     N      
1     2018     Y        yesno
2     2017     Y      
2     2017               yesonly
2     2018     N      
2     2018     N       noonly

Note: could have missing data. My solution so far is wrong:

retain tmp '';
by ID Year;
if Status='Y' then tmp='Yonly';
if Status='N' then tmp='Nonly';
if tmp='Yonly' and Status='N' then tmp='yesno';
  if tmp='Nonly' and Status='Y' then tmp='yesno';
if last.Year=1 then NEW=tmp;

Please help? Any method will do, you don't have to use RETAIN.


Solution

  • Make sure to define the length of TMP. Your current code will set the length of TMP to 1 since the first usage is the initial value listed in the RETAIN statement.

    You are missing an initialization step for when starting a new group.

    if first.year then tmp=' ';
    

    Your method can only set the result on the last record for each group. If you want all observations in a group to have the same value then I would suggest using a double DOW loop. The first loop can be used to see if there are any 'Y' or 'N' status. Then you can calculate your NEW variable. Then a second loop will read in the data for the group again and write the values out. Because all observations for a group are processed in a single data step iteration there is no need to use RETAIN.

    data want ;
      do until (last.year) ;
        set have ;
        by id year ;
        y = y or (status='Y');
        n = n or (status='N');
      end;
      length new $8;
      if Y and N then new='yesno';
      else if Y then new='yesonly';
      else if N then new='noonly';
      else new='none';
      drop y n ;
      do until (last.year) ;
        set have ;
        by id year ;
        output ;
      end;
    run;