Search code examples
sasdelete-rowretain

Drop observations once condition is met by multiple variables


I have the following data and used one of the existing answered questions to solve my data problem but could not get what I want. Here is what I have in my data

  • Amt1 is populated when the Evt_type is Fee
  • Amt2 is populated when the Evt_type is REF1/REF2
  • I don't want to display any observations after the last Flag='Y'
  • If there is no Flag='Y' then I want all the observations for that id (e.g. id=102)
  • I want to display if the next row for that id is a Fee followed by REF1/REF2 after flag='Y' (e.g. id=101) However I don't want if there is no REF1/REF2 (e.g.id=103)

Have:

       id   Date        Evt_Type   Flag   Amt1   Amt2
      101  2/2/2019      Fee              5
      101  2/3/2019      REF1      Y             5
      101  2/4/2019      Fee              10
      101  2/6/2019      REF2      Y             10
      101  2/7/2019      Fee               4
      101  2/8/2019      REF1
      102  2/2/2019      Fee              25
      102  2/2/2019      REF1      N      25
      103  2/3/2019      Fee              10
      103  2/4/2019      REF1      Y             10
      103  2/5/2019      Fee              10

Want:

      id   Date        Evt_Type   Flag   Amt1   Amt2
     101  2/2/2019      Fee              5
     101  2/3/2019      REF1      Y             5
     101  2/4/2019      Fee              10
     101  2/6/2019      REF2      Y             10
     101  2/7/2019      Fee               4
     101  2/8/2019      REF1
     102  2/2/2019      Fee              25
     102  2/2/2019      REF1      N      25
     103  2/4/2019      REF1      Y             10
     103  2/5/2019      Fee              10

I tried the following

data want;
    set have;
    by id Date;
    drop count;

    if (first.id or first.date) and FLAG='Y' then
        do;
            retain count;
            count=1;
            output;
            return;
        end;

    if count=1 and ((first.id or first.date) and Flag ne 'Y') then
        do;
            retain count;
            delete;
            return;
        end;
    output;
run;

Any help is appreciated.

Thanks


Solution

  • A technique known as DOW loop can perform a computation that measures a group in some way and then, in a second loop, apply that computation to members of the group.

    The DOW relies on a SET statement inside the loop. In this case the computation is 'what row in the group is the last one having flag="Y".

    data want;
      * DOW loop, contains computation;
    
      _max_n_with_Y = 1e12;
    
      do _n_ = 1 by 1 until (last.id);
        set have;
        by id;
        if flag='Y' then _max_n_with_Y = _n_;
      end;
    
      * Follow up loop, applies computation;
      do _n_ = 1 to _n_;
        set have;
        if _n_ <= _max_n_with_Y then OUTPUT;
      end;
      drop _:;
    run;