I'm doing some data wrangling on my raw data to get it ready for analysis.
I'm creating a variable HEART that equals 1 when any of HEART1, CARDIO, ANGINA equals 1, and 0 when HEART1, CARDIO, ANGINA all equal 0. When all three columns are NA, it should return NA.
ID <- c(1,1,1,2,2,2,3,3,3)
HEART1 <- c(1,0,NA,0,0,0,0,0,NA)
CARDIO <- c(1,0,0,0,0,0,0,1,NA)
ANGINA <- c(1,0,1,0,0,0,0,1,NA)
SLEEP <- c(1,1,1,0,0,0,0,0,0)
df <- data.frame(ID, HEART1, CARDIO, ANGINA, SLEEP)
So the HEART column will be (1,0,1,0,0,0,0,1,NA)
How do I do it with the mutate() function in dplyr? I've heard of the if_all() function but how do I only select HEART1, CARDIO, ANGINA and leave SLEEP out of it?
This will spit out some warnings about coercing the columns to logical, but those can be ignored.
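library(dplyr)

df |>
  rowwise() |>  # evaluate any() one row at a time
  mutate(
    # starts_with() ignores case by default, so "heart" matches HEART1, HEART2, HEART3
    heart = as.integer(any(c_across(starts_with("heart"))))
  )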
# # A tibble: 9 × 5
# # Rowwise:
#      ID HEART1 HEART2 HEART3 heart
#   <dbl>  <dbl>  <dbl>  <dbl> <int>
# 1     1      1      1      1     1
# 2     1      0      0      0     0
# 3     1     NA      0      1     1
# 4     2      0      0      0     0
# 5     2      0      0      0     0
# 6     2      0      0      0     0
# 7     3      0      0      0     0
# 8     3      0      1      1     1
# 9     3     NA     NA     NA    NA
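If the columns don't share a prefix (HEART1, CARDIO, ANGINA in the question's data), one option is to name them explicitly so SLEEP is never selected. The following is only a sketch, not the approach used above: it swaps rowwise() for if_any() / if_all() inside case_when(), assumes dplyr 1.0.4 or later, and rebuilds the question's data frame with SLEEP included so the block runs on its own. The `result` name is just for illustration.

```r
library(dplyr)

# Example data from the question, with SLEEP included so we can show it is ignored
df <- data.frame(
  ID     = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  HEART1 = c(1, 0, NA, 0, 0, 0, 0, 0, NA),
  CARDIO = c(1, 0, 0, 0, 0, 0, 0, 1, NA),
  ANGINA = c(1, 0, 1, 0, 0, 0, 0, 1, NA),
  SLEEP  = c(1, 1, 1, 0, 0, 0, 0, 0, 0)
)

result <- df |>
  mutate(
    HEART = case_when(
      # 1 if any of the three named columns equals 1 (NAs elsewhere in the row are fine)
      if_any(c(HEART1, CARDIO, ANGINA), ~ .x == 1) ~ 1L,
      # 0 only if all three named columns equal 0
      if_all(c(HEART1, CARDIO, ANGINA), ~ .x == 0) ~ 0L,
      # everything else (e.g. all NA) stays NA
      TRUE ~ NA_integer_
    )
  )

result$HEART
```

Because the columns are compared with `== 1` and `== 0`, nothing gets coerced to logical, so the warnings shouldn't show up. Note that a row mixing 0 and NA with no 1 comes out as NA here, which is the same thing any() would give you.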