I have found a few options from other posts but I am having trouble reproducing the code for my specific needs.
I have climate data that I would like to find the average based on season from 1910-2015 for specific locations.
Here is an example of what I need: the average PPT for CENTROID_ID c1763_1 in 1911 for Winter (Month 12 of the preceding year [1910], month 1 and 2 of year in question [1911]), Spring (months 3,4,5 in 1911), Summer (months 6,7,8 in 1911) and Fall (months 9,10,11 in 1911). This would then need to be done for all individual centroid IDs for every year. I have over 400 unique CENTROID_IDs covering years 1910-2015.
I envision the new dataframe to have the columns CENTROID_ID, YEAR, WINT_PPT, SPR_PPT, SUM_PPT, FALL_PPT.
CENTROID_ID YEAR MONTH PPT
1 c1763_1 1910 1 52.639
2 c1763_1 1910 2 20.870
3 c1763_1 1910 3 21.706
4 c1763_1 1910 4 9.347
5 c1763_1 1910 5 1.201
6 c1763_1 1910 6 11.267
7 c1763_1 1910 7 41.870
8 c1763_1 1910 8 61.260
9 c1763_1 1910 9 27.815
10 c1763_1 1910 10 67.377
11 c1763_1 1910 11 24.719
12 c1763_1 1910 12 30.212
13 c1763_1 1911 1 88.728
14 c1763_1 1911 2 50.035
15 c1763_1 1911 3 37.720
16 c1763_1 1911 4 12.831
17 c1763_1 1911 5 0.739
18 c1763_1 1911 6 18.198
19 c1763_1 1911 7 74.731
20 c1763_1 1911 8 40.873
21 c1763_1 1911 9 86.340
22 c1763_1 1911 10 36.423
23 c1763_1 1911 11 12.491
24 c1763_1 1911 12 19.428
25 c1763_1 1912 1 11.010
26 c1763_1 1912 2 16.339
27 c1763_1 1912 3 72.017
28 c1763_1 1912 4 25.887
29 c1763_1 1912 5 5.314
30 c1763_1 1912 6 8.595
31 c1763_1 1912 7 47.781
32 c1763_1 1912 8 51.188
33 c1763_1 1912 9 10.931
34 c1763_1 1912 10 119.725
35 c1763_1 1912 11 10.420
36 c1763_1 1912 12 8.777
37 c1763_1 1913 1 27.771
38 c1763_1 1913 2 62.622
39 c1763_1 1913 3 17.533
40 c1763_1 1913 4 8.008
41 c1763_1 1913 5 1.423
42 c1763_1 1913 6 3.773
43 c1763_1 1913 7 42.982
44 c1763_1 1913 8 40.541
45 c1763_1 1913 9 58.495
46 c1763_1 1913 10 22.729
47 c1763_1 1913 11 48.130
48 c1763_1 1913 12 32.049
49 c1763_1 1914 1 104.197
50 c1763_1 1914 2 31.707
Simply assign a SEASON
column then aggregate
:
df <- within(df, {
SEASON <- ifelse(MONTH %in% c(12, 1, 2), 'Winter',
ifelse(MONTH %in% c(3, 4, 5), 'Spring',
ifelse (MONTH %in% c(6, 7, 8), 'Summer',
ifelse(MONTH %in% c(9, 10, 11), 'Fall', NA)
)
)
)
YEAR <- ifelse(MONTH == 12, YEAR + 1, YEAR)
})
agg_df <- aggregate(PPT ~ CENTROID_ID + SEASON, df, FUN=mean)
agg_df
# CENTROID_ID SEASON PPT
# 1 c1763_1 Fall 43.79958
# 2 c1763_1 Spring 17.81050
# 3 c1763_1 Summer 36.92158
# 4 c1763_1 Winter 39.74171
Should you need multiple aggregations of PPT
:
agg_raw <- aggregate(PPT ~ CENTROID_ID + SEASON, df,
FUN=function(x) c(min=min(x), mean=mean(x), max=max(x)))
agg_df <- do.call(data.frame, agg_raw)
agg_df
# CENTROID_ID SEASON PPT.min PPT.mean PPT.max
# 1 c1763_1 Fall 10.420 43.79958 119.725
# 2 c1763_1 Spring 0.739 17.81050 72.017
# 3 c1763_1 Summer 3.773 36.92158 74.731
# 4 c1763_1 Winter 8.777 39.74171 104.197