r grouping

How to apply the same formula to groups


My dataset, Unitatsconsum_2021, looks like this:

structure(list(NUMERO = structure(c(21, 22, 22, 22, 23, 23, 23, 
24, 24, 25, 25, 25, 25, 26, 27, 28), format.stata = "%12.0g"), 
    unitats_consum = c(2, 2, 2, 2, 2, 2, 1.9, 1.5, 1.5, 2.5, 
    2.5, 2.5, 2.2, 1, 1, 2), edat = c(17, 51, 17, 14, 44, 36, 
    3, 67, 63, 35, 48, 17, 13, 73, 67, 73), membresllar = c(3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 4L, 4L, 4L, 4L, 1L, 1L, 3L
    )), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -16L), groups = structure(list(NUMERO = structure(c(21, 
22, 23, 24, 25, 26, 27, 28), format.stata = "%12.0g"), .rows = structure(list(
    1L, 2:4, 5:7, 8:9, 10:13, 14L, 15L, 16L), ptype = integer(0), class = c("vctrs_list_of", 
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -8L), .drop = TRUE))

I want to calculate a new variable, unitats_consum, which should be equal to: 1 + 0.5*((number of observations with edat > 13) - 1) + 0.3*(number of observations with edat <= 13).

The result of this equation should be the same for each identical NUMERO, which is the identifier. So far I have tried the following:

Unitatsconsum_2021 <- Unitatsconsum_2021 %>%
  group_by(NUMERO) %>%
  mutate(unitats_consum = (1 + 
                             0.5 * (ifelse(edat > 13, membresllar - 1, 0)) +
                             0.3 * (ifelse(edat <= 13, membresllar, 0))))

The desired output is:

[Image: table of the desired output, not reproduced here]

So, in the code, membresllar should instead be the number of observations where edat > 13 in the first term and where edat <= 13 in the second, counted within each NUMERO.


Solution

  • This does not match your output for two rows, but I believe it is what you are looking for:

    library(dplyr)

    Unitatsconsum_2021 <- Unitatsconsum_2021 %>%
      group_by(NUMERO) %>%
      mutate(
        # sum() of a logical vector counts the TRUEs within each NUMERO group
        unitats_consum = 1 + 0.5 * (sum(edat > 13) - 1) + 0.3 * sum(edat <= 13)
      )
    
    Unitatsconsum_2021
    # # A tibble: 16 × 4
    # # Groups:   NUMERO [8]
    #     NUMERO  unitats_consum  edat  membresllar
    #     <dbl>   <dbl>           <dbl> <int>
    # 1   21      1               17    3
    # 2   22      2               51    3
    # 3   22      2               17    3
    # 4   22      2               14    3
    # 5   23      1.8             44    3
    # 6   23      1.8             36    3
    # 7   23      1.8             3     3
    # 8   24      1.5             67    2
    # 9   24      1.5             63    2
    # 10  25      2.3             35    4
    # 11  25      2.3             48    4
    # 12  25      2.3             17    4
    # 13  25      2.3             13    4
    # 14  26      1               73    1
    # 15  27      1               67    1
    # 16  28      1               73    3
    

    For NUMERO 21 the result should be 1: the group contains a single observation, with edat > 13, so 1 + 0.5 * (1 - 1) + 0.3 * 0 = 1. The same holds for NUMERO 28.
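
    To see why counting with sum() gives one value per group, while the row-by-row ifelse() in the question does not, here is a minimal sketch in base R. It uses ave() in place of dplyr (a swap made only to keep the example dependency-free), and the small data frame df is a stand-in built from the question's rows for NUMERO 23 and 25; the names df, over13 and upto13 are illustrative.

    ```r
    # Stand-in data: the NUMERO 23 and NUMERO 25 rows from the question.
    df <- data.frame(
      NUMERO = c(23, 23, 23, 25, 25, 25, 25),
      edat   = c(44, 36, 3, 35, 48, 17, 13)
    )

    # Summing a 0/1 indicator within each NUMERO counts the matching rows;
    # ave() repeats that count on every row of the group.
    over13 <- ave(as.numeric(df$edat > 13),  df$NUMERO, FUN = sum)
    upto13 <- ave(as.numeric(df$edat <= 13), df$NUMERO, FUN = sum)

    # Same formula as in the answer, now built from per-group counts.
    df$unitats_consum <- 1 + 0.5 * (over13 - 1) + 0.3 * upto13

    print(df)
    # NUMERO 23: 1 + 0.5 * (2 - 1) + 0.3 * 1 = 1.8
    # NUMERO 25: 1 + 0.5 * (3 - 1) + 0.3 * 1 = 2.3
    ```

    Every row of a group gets the same value because the counts are computed once per NUMERO, not once per row.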