Search code examples
rdataframetibblepanel-data

Total percent change in a column in R?


I have a panel dataset in R that documents the percent change in population for cities over several years. A simplified version looks like this:

city_pop<-tibble(city=c("NYC", "NYC", "LA", "LA"), 
       percentpopchange=c("0.1", "0.2", "0.5", "0.3"), 
       year=c(1980, 1981, 1980, 1981))

I would like to compute the total population change (in percentage) for NYC and LA over 1980 and 1981. For example, that would be 0.32 for NYC. My ideal dataset with such output would be:

cum_city_pop<-tibble(city=c("NYC", "LA"), 
       total_percentpopchange=c("0.32", "0.95"))

Is there any way to do this in R? I would greatly appreciate any help.


Solution

  • Assuming your 0.93 for LA is a typo:

    library(dplyr)
    
    city_pop <- tibble(
      city = c("NYC", "NYC", "LA", "LA"), 
      percentpopchange = c("0.1", "0.2", "0.5", "0.3"), 
      year = c(1980, 1981, 1980, 1981)
    )
    
    city_pop %>% 
      group_by(city) %>% 
      summarize(
        total_percentpopchange = as.character(
          prod(1 + as.numeric(percentpopchange)) - 1
      ))
    #> # A tibble: 2 × 2
    #>   city  total_percentpopchange
    #>   <chr> <chr>                 
    #> 1 LA    0.95                  
    #> 2 NYC   0.32
    

    Created on 2024-09-03 with reprex v2.1.1