Search code examples
sassas-iml

How do I find an average of values in dataset2 for each individual contained in dataset1, who have unique start and end dates?


I am looking to find the average of valueZ for each individual based on their unique start and end date. Exposure X has numerous values for each day, for each location, as such (There are actually 23 locations with over 300 values & dates per site):

data dataset2;
input date location valueZ;
datalines;
1/1/2016 1 0.028
1/1/2016 1 0.022
...
2/8/2016 1 0.041
2/8/2016 1 0.044
1/1/2016 2 0.056
...
8/8/2016 2 0.089
1/1/2016 3 0.029
...
11/8/2016 3 0.083
...
1/1/2016 4 0.081
...
12/8/2016 4 0.019
...
10/30/2016 23 0.063
;

The individuals in dataset1 are linked to dataset 2 by the location as such:

data dataset1;
input individual location start_date end_date;
datalines;
1 1 1/1/2016 12/31/2016
2 1 3/12/2016 9/4/2016
3 2 2/5/2016 11/5/2016
4 19 9/30/2016 10/3/2016
5 23 4/12/2016 12/12/2016
...
305 16 1/20/2016 5/15/2016
;

So, I would like to end up with the average of valueZ based on the location indicated in dataset2 from start_date to end_date for each individual. Can someone please help!


Solution

  • Something like this?

    proc sql;
      create table want as
      select d1.individual
            ,d1.location
            ,avg(d2.valueZ) as avg_value
      from dataset2 d2
      join dataset1 d1
        on d1.location=d2.location
        and d2.date between d1.start_date and d2.end_date
      group by d1.individual, d1.location
    quit;