I am trying to sum up a variable using retain based on certain conditions. I have these fields account_id, date, transaction, value and transaction_type.
The data is sorted by account_id and date
I want to sum up value until the first transaction_type not in ('A')
I have this
data dset;
set dset;
by account_id;
retain sum_flag sum;
if first.account_id then do;
sum_flag = 1;
sum=0;
end;
if transaction_type not in ('A') then sum_flag = 0;
if sum_flag = 1 then sum=sum +value;
run;
My issue is if the first transaction_type in A then it works fine but if the next transaction_type is also A sum_flag is set to . instead of being kept as 1. Why is this happening?
I think it's something to do with if transaction_type not in ('A') then sum_flag = 0; Without it the variable sum_flag is not being reset to missing if the transaction is in 'A' but I only want the value retained until the first non 'A' transaction.
Two things could be happening here.
value
field may sometimes contain nulls? If so the +
operator is causing your problems. The +
operator will always return NULL
if one of it's arguments are null. Instead use the sum()
function. i.e. if sum_flag = 1 then sum=sum(sum,value);