Search code examples
rmeanpanel-data

Average price of a time period in R


I'm beginning to get my feet wet with R, and I'm brand new to time series concepts.

I have panel data with a time frame of 200 years where I want to calculate the price_average over 10 years for each city, but I cannot find a way to shape it appropriately into a function.

> all_cities
    month price year     town
110    10  2625 1699   Munich
111    11  2730 1699   Munich
112    12  2782 1699   Munich
113     1    34 1700 Hannover
114     1  2520 1700   Munich
115     2    34 1700 Hannover
116     2  2730 1700   Munich
117     3    33 1700 Hannover
118     3  2765 1700   Munich
119     4    36 1700 Hannover
120     4  3150 1700   Munich

I am looking for a return data frame of the form:

> all_cities_10y
    s.year e.year mean_price    town
1     1690  1699    xxx         Munich
2     1700  1709    xxx         Munich
3     1700  1709    xxx         Hannover

Later I want to display the data in a graph. However, I hope the question is clear. I appreciate any advice.
structure(list(month = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 
7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 1, 1, 2, 2, 3, 3, 4
), price = c(34, 2520, 34, 2730, 33, 2765, 36, 3150, 36, 3150, 
36, 3097, 36, 3675, 24, 3360, 23, 2205, 20, 2152, 21, 2520, 21, 
2520, 21, 2590, 21, 2567, 24, 2415, 24), year = c(1700, 1700, 
1700, 1700, 1700, 1700, 1700, 1700, 1700, 1700, 1700, 1700, 1700, 
1700, 1700, 1700, 1700, 1700, 1700, 1700, 1700, 1700, 1700, 1700, 
1701, 1701, 1701, 1701, 1701, 1701, 1701), town = c("Hannover", 
"Munich", "Hannover", "Munich", "Hannover", "Munich", "Hannover", 
"Munich", "Hannover", "Munich", "Hannover", "Munich", "Hannover", 
"Munich", "Hannover", "Munich", "Hannover", "Munich", "Hannover", 
"Munich", "Hannover", "Munich", "Hannover", "Munich", "Hannover", 
"Munich", "Hannover", "Munich", "Hannover", "Munich", "Hannover"
)), row.names = 113:143, class = "data.frame")

Solution

  • this might work for you... I used the data in your question (see bottom answer), but not the sample data provided, since it only contained years in the 1700-1709 interval.

    library( data.table )
    #make it a data.table if not already
    setDT( DT )
    #summarise
    DT[, .(mean_proce = mean(price) ), 
       by = .(s.year = floor(year/10)*10,
              e.year = 9 + floor(year/10)*10,
              town )]
    
    #    s.year e.year     town mean_proce
    # 1:   1690   1699   Munich   2712.333
    # 2:   1700   1709 Hannover     34.250
    # 3:   1700   1709   Munich   2791.250
    

    sample data used

    DT <- fread("    month price year     town
        10  2625 1699   Munich
        11  2730 1699   Munich
        12  2782 1699   Munich
         1    34 1700 Hannover
         1  2520 1700   Munich
         2    34 1700 Hannover
         2  2730 1700   Munich
         3    33 1700 Hannover
         3  2765 1700   Munich
         4    36 1700 Hannover
         4  3150 1700   Munich")