Search code examples
loopssaslagcalculationdatastep

Creating calculated variables in one datastep


Any help is much appreciated. Thanks

I would like to create a couple of variables with my transactional data

I am trying to create variables 'act_bal' and 'penalty' using amount, type and Op_Bal. The rules I have are:

  1. For the first record, the id will have op_bal and it will be subtracted from the 'amount' for type=C and added if type=D to calculate act_bal
  2. For the second record onwards it is act_bal + amount for type=C and act_bal-amount for type=D
  3. I will add the penalty 10 only if the amount is >4 and the type=D. The id can have only two penalties.
  4. Total Penalty should be subtracted from the act_bal of the last record which would become op_bal for the next day. (e.g. for id 101, -178-20=-198 will become the op_bal for 4/2/2019)

This is the data I have for two customers IDs 101 and 102 for two different dates (My actual dataset has the data for all the 30 days).

  id   date       amount    type  Op_Bal
  101  4/1/2019    50        C       100
  101  4/1/2019    25        D       
  101  4/1/2019    75        D       
  101  4/1/2019     3        D       
  101  4/1/2019    75        D       
  101  4/1/2019    75        D       
  101  4/2/2019   100        C
  101  4/2/2019   125        D
  101  4/2/2019   150        D
  102  4/1/2019    50        C       125
  102  4/1/2019   125        C       
  102  4/2/2019   250        D
  102  4/2/2019    10        D

The code I wrote is like this

  data want;
    set have;
    by id date;
   if first.id or first.date then do;
    if first.id then do;
    if type='C' then act_bal=Op_Bal - amount;
    if type='D' then act_bal=Op_Bal + amount;
   end;
  else do;
 retain act_bal;
   if type='C' then act_bal=act_bal + amount;
   if type='D' then act_bal=act_bal - amount;
    if amount>4 and type='D' then do;
    penalty=10;
    end;
 run;

I couldn't create a counter to control the penalties to 2 and could not subtract the total penalty amount from the amount of the last row. Could someone help me in receiving the desired result? Thanks

  id   date       amount    type  Op_Bal  act_bal    penalty
  101  4/1/2019    50        C       200       150        0 
  101  4/1/2019    25        D                 125        0
  101  4/1/2019   150        D                 -25       10
  101  4/1/2019    75        D                 -100      10
  101  4/1/2019     3        D                 -103       0
  101  4/1/2019    75        D                 -178       0
  101  4/2/2019   100        C       -198       -98       0                    
  101  4/2/2019   125        D                 -223      10
  101  4/2/2019   150        D                 -373      10  
  102  4/1/2019    50        C       125        175       0
  102  4/1/2019   125        C                  300       0
  102  4/2/2019   250        D                   50       0
  102  4/2/2019    10        D                   40       0

Solution

  • A few tips:

    • You have the same code for incrementing act_bal in both the if and else blocks, so factor it out. Don't repeat yourself.
    • You can skip the retain statement if you use a sum statement.
    • Use a separate variable to keep track of the number of penalties triggered per day, but only apply the first two of them.

    So, putting it all together:

    data want;
      set have;
      by id date;
      if first.date and not first.id then op_bal = act_bal;
      if first.date then do;
        act_bal = op_bal;
        penalties = 0;
      end;
      if type='C' then act_bal + amount;
      if type='D' then act_bal + (-amount);
      if amount > 4 and type='D' then penalties + 1;
      if last.date then act_bal + (-min(penalties,2) * 10);
    run;