I have a table much larger than this example, with many more clinical variables (a, b, c in the example below), each coded as present (1) or absent (0). This is table X below.
I also have a column that sums these (rsum below). Where rsum is NA, I can infer that the values in a, b, c should also be missing (NA) rather than absent (0). So the desired result is table Y below.
X = tibble(a = c(0, 0, 0, 0), b = c(0, 1, 0, 1), c = c(0, 1, 0, 0),
           rsum = c(0, 2, NA, 1))
X
# A tibble: 4 × 4
      a     b     c  rsum
  <dbl> <dbl> <dbl> <dbl>
1     0     0     0     0
2     0     1     1     2
3     0     0     0    NA
4     0     1     0     1
Y = tibble(a = c(0, 0, NA, 0), b = c(0, 1, NA, 1), c = c(0, 1, NA, 0),
           rsum = c(0, 2, NA, 1))
Y
# A tibble: 4 × 4
      a     b     c  rsum
  <dbl> <dbl> <dbl> <dbl>
1     0     0     0     0
2     0     1     1     2
3    NA    NA    NA    NA
4     0     1     0     1
I thought this would be simple enough using variants of mutate_at or mutate and across. But I keep running into bugs and messages about previous solutions being deprecated. My best guess at how this should work was:
mutate(X, across(a:c), ~if_else(is.na(rsum), NA, .x))
Error in `mutate()`:
ℹ In argument: `~if_else(is.na(rsum), NA, .x)`.
Caused by error:
! `~if_else(is.na(rsum), NA, .x)` must be a vector, not a
<formula> object.
Run `rlang::last_trace()` to see where the error occurred.
Is there a simple dplyr solution?
library(tidyverse)

X = tibble(a = c(0, 0, 0, 0), b = c(0, 1, 0, 1), c = c(0, 1, 0, 0),
           rsum = c(0, 2, NA, 1))

# The lambda must be the second argument of across(), inside its parentheses
X %>%
  mutate(across(a:c, ~ if_else(is.na(rsum), NA, .x)))
# A tibble: 4 × 4
      a     b     c  rsum
  <dbl> <dbl> <dbl> <dbl>
1     0     0     0     0
2     0     1     1     2
3    NA    NA    NA    NA
4     0     1     0     1
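
For a wider table where the indicator columns are not contiguous, one possible variation is to select them by name instead of with a:c. The sketch below assumes a hypothetical character vector clin_vars listing the clinical columns (it is not part of the original question), and uses NA_real_ rather than a bare NA, which may be needed on older dplyr versions where if_else() is strict about matching types.

```r
library(dplyr)

X <- tibble(a = c(0, 0, 0, 0), b = c(0, 1, 0, 1), c = c(0, 1, 0, 0),
            rsum = c(0, 2, NA, 1))

# Hypothetical: names of the clinical indicator columns to blank out
clin_vars <- c("a", "b", "c")

# Wherever rsum is NA, set every selected column to NA; otherwise keep the value
Y <- X %>%
  mutate(across(all_of(clin_vars), ~ if_else(is.na(rsum), NA_real_, .x)))

Y
```

Because rsum is not in clin_vars, it is left unchanged and can still be referenced inside the lambda.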