I have some data that I want to pivot_wider
, using one or more names_from
columns. At least one of the potential names_from
columns is Boolean.
This results in uninformative default column names like TRUE
, FALSE
or TRUE_somevalue
, FALSE_somevalue
(where somevalue
is an otherwise meaningful level from another column).
I would like to instead use some more informative column name when some names_from
column or columns are Boolean, for instance, varnameTRUE
, varnameFALSE
or varnameTRUE_somevalue
, varnameFALSE_somevalue
(where varname
is the name of the Boolean column in the first place).
Obviously, I could convert the Boolean column to character or factor with meaningful names in the first place, but I'm wondering if there's some generalizable way to do this using the arguments available to pivot_wider
. I can't figure out a way to do it with e.g. names_glue
or names_sep
. Ideally I'd like to be able to do this without even manually specifying the relevant columns, but have it happen automatically if a names_from
column is a Boolean one.
Is there some way to do this?
Some example data:
test_dat <- tibble(
idx = c(rep("a", 6), rep("b", 6)),
boo = c(rep(c(TRUE, FALSE), 6)),
word = c(rep(c("foo", "bar", "baz"), 4)),
val = runif(12)
)
> test_dat
# A tibble: 12 × 4
idx boo word val
<chr> <lgl> <chr> <dbl>
1 a TRUE foo 0.325
2 a FALSE bar 0.770
3 a TRUE baz 0.824
4 a FALSE foo 0.351
5 a TRUE bar 0.555
6 a FALSE baz 0.698
7 b TRUE foo 0.0200
8 b FALSE bar 0.427
9 b TRUE baz 0.325
10 b FALSE foo 0.463
11 b TRUE bar 0.987
12 b FALSE baz 0.345
Default output:
> test_dat %>% pivot_wider(names_from=c("boo", "word"), values_from=val)
# A tibble: 2 × 7
idx TRUE_foo FALSE_bar TRUE_baz FALSE_foo TRUE_bar FALSE_baz
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 a 0.325 0.770 0.824 0.351 0.555 0.698
2 b 0.0200 0.427 0.325 0.463 0.987 0.345
Desired output would be identical except with column names more like booTRUE_foo
, booFALSE_foo
(or some variant thereof, e.g. boo_foo
, notboo_foo
)
You can append column names to those boolean(logical) columns at first with dplyr
functions:
mutate(across(where(is.logical), ...))
before pivoting to wide. By this way you don't need to require manual specification of the boolean column names.
library(tidyverse)
test_dat %>%
mutate(across(where(is.logical), ~ paste0(cur_column(), .x))) %>%
pivot_wider(names_from = c(boo, word), values_from = val)
# # A tibble: 2 × 7
# idx booTRUE_foo booFALSE_bar booTRUE_baz booFALSE_foo booTRUE_bar booFALSE_baz
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 a 0.0556 0.679 0.409 0.666 0.675 0.312
# 2 b 0.281 0.477 0.0817 0.977 0.745 0.340