My dataset looks like this:
structure(list(NUMERO = structure(c(1, 2, 3, 3, 4, 5, 6, 6, 6,
6), format.stata = "%12.0g"), sexe = structure(c(1L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L), levels = c("Dona", "Home"), class = "factor"),
edat = c(71, 73, 44, 44, 70, 69, 56, 56, 23, 19)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
NUMERO is my id variable, and I want to create a new variable, membres, that counts how many times
each value of NUMERO occurs and assigns that count to every observation with that value. So, if there
are 4 observations with NUMERO = 6, then membres should be 4 for each of those observations.
In other words, this is the output I'm looking for:
Either use add_count():
library(dplyr)
df1 %>%
  add_count(NUMERO, name = "membres")
or use mutate() with the .by argument:
library(dplyr) # version >= 1.1.0
df1 %>%
  mutate(membres = n(), .by = NUMERO)
Output:
# A tibble: 10 × 4
   NUMERO sexe   edat membres
    <dbl> <fct> <dbl>   <int>
 1      1 Dona     71       1
 2      2 Dona     73       1
 3      3 Home     44       2
 4      3 Dona     44       2
 5      4 Home     70       1
 6      5 Dona     69       1
 7      6 Home     56       4
 8      6 Dona     56       4
 9      6 Home     23       4
10      6 Dona     19       4
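If dplyr is not an option, the same per-group count can probably be obtained in base R with ave(). This is only a minimal sketch: it rebuilds a plain data.frame with just the NUMERO and edat columns from the question (the Stata format attribute and the sexe factor are left out for brevity), and it assumes a numeric count column is acceptable.

```r
# Simplified copy of the question's data (assumption: only NUMERO and edat are needed here)
df1 <- data.frame(
  NUMERO = c(1, 2, 3, 3, 4, 5, 6, 6, 6, 6),
  edat   = c(71, 73, 44, 44, 70, 69, 56, 56, 23, 19)
)

# ave() applies FUN within each NUMERO group and recycles the result
# back to every row of that group, so length() gives the group size.
df1$membres <- ave(df1$NUMERO, df1$NUMERO, FUN = length)

print(df1)
```

Note that ave() returns a vector of the same type as its first argument, so membres is a double here rather than the integer that add_count() and n() produce; wrapping the call in as.integer() would align the types if that matters.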