I am trying to count occurrences where the same person was billed for an item, four or more times, by the same place within 30 days of each instance. For example, input would look something like:
person service place date
A x shop1 01/01/15
A x shop1 01/15/15
A x shop1 01/20/15
B y shop2 03/20/15
B y shop2 04/01/15
C z shop1 05/05/15
And output would look something like:
person service place date count
A x shop1 01/01/15 3
A x shop1 01/15/15 3
A x shop1 01/20/15 3
B y shop2 03/20/15 2
B y shop2 04/01/15 2
C z shop1 05/05/15 1
I have tried stuff similar to:
data work.want;
do _n_ =1 by 1 until (last.PLACE);
set work.rawdata;
by PERSON PLACE;
if first.PLACE then count=0;
count+1;
end;
frequency= count;
do _n_ = 1 by 1 until (last.PLACE);
set work.rawdata;
by PERSON PLACE;
output;
end;
run;
this gives a count based on person and place but does not factor in time. Any help or suggestions would be greatly appreciated! Thank you
This can be done easily with proc sql...
Your data:
data have;
input person $ service $ place $;
datalines;
A x shop1
A x shop1
A x shop1
B y shop2
B y shop2
C z shop1
;
run;
Then we count the occurences of "place" for each 1,2 group, and join the original table.
proc sql;
create table want as
select a.*, b._count
from have as a
inner join
(
select person, service, count(place) as _count
from have
group by 1,2
) as b
on a.person = b.person
and a.service = b.service
;
quit;
Is there a date field? We need it in order to group the data by month (or 30 days), for example.