I'm trying to create a new column that checks, within each group (id and number), whether two columns (classification and classification-1) contain the same observations.
This is the original data frame:
reprex <- tribble(~"id", ~"number", ~"year", ~"classification", ~"classification-1",
5, 7020, 2015, "Trading de servicios", "Servicios empresariales",
2, 4649, 2015, "Trading", "Comercial",
2, 4649, 2015, "Comercial", "Trading",
2, 4649, 2016, "Trading", "Comercial",
2, 4649, 2016, "Comercial", "Trading",
3, 4651, 2015, "Trading", "Comercial",
3, 4651, 2015, "Trading", "Comisiones",
3, 4651, 2015, "Comercial", "Trading",
3, 4651, 2015, "Comercial", "Comisiones")
I want to get this:
reprex <- tribble(~"id", ~"number", ~"year", ~"classification", ~"classification-1", ~"check",
5, 7020, 2015, "Trading de servicios", "Servicios empresariales", TRUE,
2, 4649, 2015, "Trading", "Comercial", TRUE,
2, 4649, 2015, "Comercial", "Trading", TRUE,
2, 4649, 2016, "Trading", "Comercial", TRUE,
2, 4649, 2016, "Comercial", "Trading", TRUE,
3, 4651, 2015, "Trading", "Comercial", FALSE,
3, 4651, 2015, "Trading", "Comisiones", FALSE,
3, 4651, 2015, "Comercial", "Trading", FALSE,
3, 4651, 2015, "Comercial", "Comisiones", FALSE)
Perhaps this would help. It marks a group as TRUE when the two columns share at least one value:
library(dplyr)
reprex %>%
  group_by(id, number) %>%
  mutate(check = length(intersect(classification, `classification-1`)) > 0)
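To see what that condition evaluates to inside one group, here is a minimal sketch on plain vectors. The values are copied from the group id = 3, number = 4651 in the question; `classification_1` and `shared` are just illustrative variable names, not columns of `reprex`.

```r
# Values of the two columns for the group id = 3, number = 4651
classification   <- c("Trading", "Trading", "Comercial", "Comercial")
classification_1 <- c("Comercial", "Comisiones", "Trading", "Comisiones")

# intersect() keeps the distinct values that occur in both vectors
shared <- intersect(classification, classification_1)

# The check is TRUE as soon as a single value is shared
length(shared) > 0
```

Here "Trading" and "Comercial" occur in both columns, so this group would get TRUE even though "Comisiones" appears only in the second column.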
Or, if we need to check all the unique elements, then after grouping by 'id' and 'number', get the unique elements of both classification and classification-1 and check whether they are equal with setequal:
reprex %>%
  group_by(id, number) %>%
  mutate(check = setequal(sort(unique(classification)),
                          sort(unique(`classification-1`))))
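As a rough comparison of the two checks, the sketch below rebuilds the input from the question and collapses it to one row per group. The column names `any_shared` and `same_set` and the object `per_group` are made up for this illustration, and the `.groups` argument assumes dplyr 1.0.0 or later. `setequal()` already ignores duplicates and order, so it is called on the raw columns here.

```r
library(dplyr)
library(tibble)

# Input data from the question
reprex <- tribble(
  ~id, ~number, ~year, ~classification, ~`classification-1`,
  5, 7020, 2015, "Trading de servicios", "Servicios empresariales",
  2, 4649, 2015, "Trading", "Comercial",
  2, 4649, 2015, "Comercial", "Trading",
  2, 4649, 2016, "Trading", "Comercial",
  2, 4649, 2016, "Comercial", "Trading",
  3, 4651, 2015, "Trading", "Comercial",
  3, 4651, 2015, "Trading", "Comisiones",
  3, 4651, 2015, "Comercial", "Trading",
  3, 4651, 2015, "Comercial", "Comisiones"
)

# One row per (id, number) group with both checks side by side
per_group <- reprex %>%
  group_by(id, number) %>%
  summarise(
    any_shared = length(intersect(classification, `classification-1`)) > 0,
    same_set   = setequal(classification, `classification-1`),
    .groups = "drop"
  )

per_group
```

For id 2 both checks are TRUE. For id 3 `any_shared` is TRUE but `same_set` is FALSE, which is what the desired output shows. For id 5 both are FALSE, because its two columns have no value in common, so the TRUE in the desired output for that row would need an extra rule that the question does not spell out.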