I'm trying to come up with an elegant, rule-based way to assign codes to rows in a data frame based on combinations of values in columns, using this data:
library(tidyr)
df <- crossing(yr2018=c("M","S","W"),
yr2019=c("M","S","W"),
yr2020=c("M","S","W")) %>%
print(n=27)
# A tibble: 27 × 3
yr2018 yr2019 yr2020
<chr> <chr> <chr>
1 M M M
2 M M S
3 M M W
4 M S M
5 M S S
6 M S W
7 M W M
8 M W S
9 M W W
10 S M M
11 S M S
12 S M W
13 S S M
14 S S S
15 S S W
16 S W M
17 S W S
18 S W W
19 W M M
20 W M S
21 W M W
22 W S M
23 W S S
24 W S W
25 W W M
26 W W S
27 W W W
>
What I want to end up with is a column with codes applied with rules such the following:
This feels like a big, ugly if
statement, but I'm hoping for something more elegant, especially since my real data is actually 4x5 (625 rows). It also feels like maybe regular expressions, which I struggle with.
I started looking into row-wise functions and found rowwise()
as a start to logically reconfigure the data frame, but it looks like the number of functions that can operate that way are limited.
All guidance welcome!
You can use mutate
and case_when
to efficiently satisfy these conditions. sort
in the second logic will organize the letters as you described.
Since case_when
evaluates iteratively, you may be able to parse this down to make it more elegant, but as written it should follow your exact conditions:
library(dplyr)
df %>%
rowwise() %>%
mutate(new_column = case_when(
yr2018 == yr2019 & yr2019 == yr2020 ~ paste0("CON", yr2018),
yr2018 == yr2020 ~ paste(sort(c(yr2019, yr2020)), collapse = ""),
yr2018 != yr2019 & yr2019 != yr2020 & yr2018 != yr2020 ~ "MSW",
yr2019 == yr2020 & yr2018 != yr2020 ~ paste0("CON", yr2020),
yr2018 == yr2019 & yr2018 != yr2020 ~ paste0("CON", yr2018)
))
Output:
yr2018 yr2019 yr2020 new_column
<chr> <chr> <chr> <chr>
1 M M M CONM
2 M M S CONM
3 M M W CONM
4 M S M MS
5 M S S CONS
6 M S W MSW
7 M W M MW
8 M W S MSW
9 M W W CONW
10 S M M CONM
11 S M S MS
12 S M W MSW
13 S S M CONS
14 S S S CONS
15 S S W CONS
16 S W M MSW
17 S W S SW
18 S W W CONW
19 W M M CONM
20 W M S MSW
21 W M W MW
22 W S M MSW
23 W S S CONS
24 W S W SW
25 W W M CONW
26 W W S CONW
27 W W W CONW