An example of end results I have is
CASEID | Transaction id | charge | sumofcharges |
---|---|---|---|
1 | 10 | $0.10 | $0.60 |
1 | 11 | $0.20 | $0.60 |
1 | 12 | $0.30 | $0.60 |
2 | 13 | $0.50 | $0.50 |
3 | 14 | $0.65 | $0.65 |
4 | 15 | $0.75 | $1.75 |
4 | 16 | $1.00 | $1.75 |
5 | 17 | $0.15 | $1.35 |
5 | 18 | $1.20 | $1.35 |
My end result the business would like to have is only display 1 aggregate total per caseid. for case id 1, .60 would only show one time and the rest blank. Is this possible? I'm using sum(charges) as sum_of_charges which of course gives me the totals. I need to leave the transaction ID in there. I do have other fields as well however transaction ID is the outlier.
You cannot really do that with SQL as it is designed to do SET operations and not process observations individually.
But once you have the sum calculated it is easy to use a data step to set all but one per CASEID to missing.
data want;
set want;
by CASEID ;
if not first.caseid then sumofcharges=.;
run;
If the TRANSACTIONID value is UNIQUE within each value of CASEID you could use that to drive a CASE clause in PROC SQL.
proc sql ;
create table want as
select *
, case when (transactionid=min(transactionid)) then sum(charge)
else . end as sumofcharges
from have
group by caseid
;
quit;