I have a data frame (my_dataframe) with 5 columns, all holding 0 or 1. I would like to create a new column called cn7_any, which should be 1 whenever any of columns 2:5 equals 1.
structure(list(cn7_normal = c(1L, 1L, 1L, 1L, 1L, 1L),
cn7_right_paralysis_central = c(0L, 0L, 0L, 0L, 0L, 0L),
cn7_right_paralysis_peripheral = c(0L, 0L, 0L, 0L, 0L, 0L),
cn7_left_paralysis_central = c(0L, 0L, 0L, 0L, 0L, 0L),
cn7_left_paralysis_peripheral = c(0L, 0L, 0L, 0L, 0L, 0L)),
row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))
> head(my_dataframe)
# A tibble: 6 x 5
cn7_normal cn7_right_paralysis_cen… cn7_right_paralysis_perip… cn7_left_paralysis_cen… cn7_left_paralysis_peri…
<int> <int> <int> <int> <int>
1 1 0 0 0 0
2 1 0 0 0 0
I could do it successfully with case_when():
my_dataframe<-my_dataframe%>%
mutate(cn7_paralisis_any=case_when(cn7_right_paralysis_central==1 ~ 1,
cn7_right_paralysis_peripheral==1 ~ 1,
cn7_left_paralysis_central==1 ~ 1,
cn7_left_paralysis_peripheral==1 ~ 1,
TRUE ~ 0)
)
Although it worked, I wonder whether there is a simpler, less verbose solution. I feel I should be using any() somehow. Any ideas?
Your data is all zeroes, so I'll change a couple to prove the point.
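The edits themselves aren't shown. Judging from the output printed further down, they were probably something like the sketch below. The two changed cells are inferred from that output, and a plain data.frame stands in for the tibble to keep it dependency-free.

```r
# Rebuild the question's data: cn7_normal is all 1, the paralysis columns all 0
my_dataframe <- data.frame(
  cn7_normal                     = rep(1L, 6),
  cn7_right_paralysis_central    = rep(0L, 6),
  cn7_right_paralysis_peripheral = rep(0L, 6),
  cn7_left_paralysis_central     = rep(0L, 6),
  cn7_left_paralysis_peripheral  = rep(0L, 6)
)

# Assumed edits (inferred, not from the original post): flip one cell in rows 2 and 4
my_dataframe$cn7_left_paralysis_peripheral[2] <- 1L
my_dataframe$cn7_left_paralysis_central[4]    <- 1L
```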
rowSums(my_dataframe[,2:5]) > 0
# [1] FALSE TRUE FALSE TRUE FALSE FALSE
+(rowSums(my_dataframe[,2:5]) > 0)
# [1] 0 1 0 1 0 0
my_dataframe$cn7_any <- +(rowSums(my_dataframe[,2:5]) > 0)
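In case the leading + looks odd: unary plus on a logical vector coerces it to integer, so TRUE/FALSE become 1/0. A small sketch on a made-up vector (flags is a name chosen only for this illustration), with as.integer() as the more explicit spelling:

```r
# Hypothetical logical vector standing in for the rowSums(...) > 0 result
flags <- c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)

as_plus <- +flags              # unary plus coerces logical to integer
as_int  <- as.integer(flags)   # explicit equivalent

# Both spellings give the same integer vector
same <- identical(as_plus, as_int)
```

Either spelling works; +(...) is just shorter inside a pipeline.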
Within dplyr:
my_dataframe %>%
mutate(cn7_any = rowSums(across(-cn7_normal, ~ . > 0)) > 0)
# # A tibble: 6 x 6
# cn7_normal cn7_right_paralysis_central cn7_right_paralysis_peripheral cn7_left_paralysis_central cn7_left_paralysis_peripheral cn7_any
# <int> <int> <int> <int> <int> <lgl>
# 1 1 0 0 0 0 FALSE
# 2 1 0 0 0 1 TRUE
# 3 1 0 0 0 0 FALSE
# 4 1 0 0 1 0 TRUE
# 5 1 0 0 0 0 FALSE
# 6 1 0 0 0 0 FALSE
It seems like a logical thing you're doing, not a numeric one, but if you want numbers, just use the +(.) trick as above:
my_dataframe %>%
mutate(cn7_any = +(rowSums(across(-cn7_normal, ~ . > 0)) > 0))
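Since the question asks about any(): a different technique from the rowSums() one above is dplyr's if_any(), which reads closer to "any of these columns equals 1". This is only a sketch, assuming dplyr 1.0.4 or later (where if_any() is available), and the toy data is rebuilt here with the two assumed 1s so the block runs on its own.

```r
library(dplyr)

# Toy data shaped like the question's; the two 1s are assumed edits for illustration
my_dataframe <- tibble(
  cn7_normal                     = rep(1L, 6),
  cn7_right_paralysis_central    = rep(0L, 6),
  cn7_right_paralysis_peripheral = rep(0L, 6),
  cn7_left_paralysis_central     = c(0L, 0L, 0L, 1L, 0L, 0L),
  cn7_left_paralysis_peripheral  = c(0L, 1L, 0L, 0L, 0L, 0L)
)

# if_any() returns one logical per row; the leading + turns it into 0/1
result <- my_dataframe %>%
  mutate(cn7_any = +if_any(-cn7_normal, ~ .x == 1))
```

Drop the + if a logical column is what you actually want.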