R code (Rstats) calculating unemployment rate based off columns in long form data


I am trying to calculate the unemployment rate from the data below and add it as new rows to the data table. For each date, I want to divide unemployed by labourforce and add each result as a row.

Essentially, I am trying to go from this

date series_1 value
2021-01-01 labourforce 13793
2021-02-01 labourforce 13812
2021-03-01 labourforce 13856
2021-01-01 unemployed 875
2021-02-01 unemployed 805
2021-03-01 unemployed 778

to this

date series_1 value
2021-01-01 labourforce 13793
2021-02-01 labourforce 13812
2021-03-01 labourforce 13856
2021-01-01 unemployed 875
2021-02-01 unemployed 805
2021-03-01 unemployed 778
2021-01-01 unemploymentrate 6.3
2021-02-01 unemploymentrate 5.8
2021-03-01 unemploymentrate 5.6

Here is my code so far. I know the last line is wrong. Any suggestions or ideas are welcome!

longdata %>% 
  group_by(date) %>%
  summarise(series_1 = 'unemploymentrate',
  value = series_1$unemployed/series_1$labourforce))

Solution

  • For each date, you can compute the ratio of 'unemployed' to 'labourforce' (as a percentage) and add it as new rows to your original dataset.

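    library(dplyr)
    
    # df is the long-format data frame from the question (called longdata there)
    df %>% 
      group_by(date) %>%
      # compute value first: once series_1 is overwritten, the filters below would match nothing
      summarise(value = value[series_1 == 'unemployed']/value[series_1 == 'labourforce'] * 100, 
                series_1 = 'unemploymentrate') %>%
      # append the original rows, then sort so each series is grouped together
      bind_rows(df) %>%
      arrange(series_1)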
    
    #   date          value series_1        
    #  <chr>         <dbl> <chr>           
    #1 2021-01-01 13793    labourforce     
    #2 2021-02-01 13812    labourforce     
    #3 2021-03-01 13856    labourforce     
    #4 2021-01-01   875    unemployed      
    #5 2021-02-01   805    unemployed      
    #6 2021-03-01   778    unemployed      
    #7 2021-01-01     6.34 unemploymentrate
    #8 2021-02-01     5.83 unemploymentrate
    #9 2021-03-01     5.61 unemploymentrate
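
To reproduce this end to end, here is a minimal sketch that rebuilds the sample data from the question and rounds the rate to one decimal place, matching the desired output (6.3, 5.8, 5.6). The object names `rates` and `result` are made up for illustration, and storing the dates as character strings is an assumption based on the `<chr>` column type in the printed output above.

```r
library(dplyr)

# Sample data copied from the question; dates kept as character (assumption)
longdata <- data.frame(
  date     = rep(c("2021-01-01", "2021-02-01", "2021-03-01"), times = 2),
  series_1 = rep(c("labourforce", "unemployed"), each = 3),
  value    = c(13793, 13812, 13856, 875, 805, 778)
)

# One rate per date. `value` is computed before `series_1` is overwritten,
# because summarise() evaluates its arguments in order and a new `series_1`
# would mask the original column used in the filters.
rates <- longdata %>%
  group_by(date) %>%
  summarise(
    value    = round(value[series_1 == "unemployed"] /
                     value[series_1 == "labourforce"] * 100, 1),
    series_1 = "unemploymentrate"
  )

# Original rows first, rate rows appended; columns are matched by name
result <- bind_rows(longdata, rates)
print(result)
```

Putting `longdata` first in `bind_rows()` keeps the original row and column order, so no `arrange()` is needed. Drop the `round()` call if you want to store the rate at full precision and round only when displaying it.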