Search code examples
rtidyversetidyr

Change column name in pivot_wider for Boolean column


I have some data that I want to pivot_wider, using one or more names_from columns. At least one of the potential names_from columns is Boolean.

This results in uninformative default column names like TRUE, FALSE or TRUE_somevalue, FALSE_somevalue (where somevalue is an otherwise meaningful level from another column).

I would like to instead use some more informative column name when some names_from column or columns are Boolean, for instance, varnameTRUE, varnameFALSE or varnameTRUE_somevalue, varnameFALSE_somevalue (where varname is the name of the Boolean column in the first place).

Obviously, I could convert the Boolean column to character or factor with meaningful names in the first place, but I'm wondering if there's some generalizable way to do this using the arguments available to pivot_wider. I can't figure out a way to do it with e.g. names_glue or names_sep. Ideally I'd like to be able to do this without even manually specifying the relevant columns, but have it happen automatically if a names_from column is a Boolean one.

Is there some way to do this?

Some example data:

test_dat <- tibble(
  idx = c(rep("a", 6), rep("b", 6)), 
  boo = c(rep(c(TRUE, FALSE), 6)), 
  word = c(rep(c("foo", "bar", "baz"), 4)), 
  val = runif(12)
)

> test_dat
# A tibble: 12 × 4
   idx   boo   word     val
   <chr> <lgl> <chr>  <dbl>
 1 a     TRUE  foo   0.325 
 2 a     FALSE bar   0.770 
 3 a     TRUE  baz   0.824 
 4 a     FALSE foo   0.351 
 5 a     TRUE  bar   0.555 
 6 a     FALSE baz   0.698 
 7 b     TRUE  foo   0.0200
 8 b     FALSE bar   0.427 
 9 b     TRUE  baz   0.325 
10 b     FALSE foo   0.463 
11 b     TRUE  bar   0.987 
12 b     FALSE baz   0.345 

Default output:

> test_dat %>% pivot_wider(names_from=c("boo", "word"), values_from=val)
# A tibble: 2 × 7
  idx   TRUE_foo FALSE_bar TRUE_baz FALSE_foo TRUE_bar FALSE_baz
  <chr>    <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl>
1 a       0.325      0.770    0.824     0.351    0.555     0.698
2 b       0.0200     0.427    0.325     0.463    0.987     0.345

Desired output would be identical except with column names more like booTRUE_foo, booFALSE_foo (or some variant thereof, e.g. boo_foo, notboo_foo)


Solution

  • You can append column names to those boolean(logical) columns at first with dplyr functions:

    mutate(across(where(is.logical), ...))
    

    before pivoting to wide. By this way you don't need to require manual specification of the boolean column names.

    library(tidyverse)
    
    test_dat %>%
      mutate(across(where(is.logical), ~ paste0(cur_column(), .x))) %>%
      pivot_wider(names_from = c(boo, word), values_from = val)
    
    # # A tibble: 2 × 7
    #   idx   booTRUE_foo booFALSE_bar booTRUE_baz booFALSE_foo booTRUE_bar booFALSE_baz
    #   <chr>       <dbl>        <dbl>       <dbl>        <dbl>       <dbl>        <dbl>
    # 1 a          0.0556        0.679      0.409         0.666       0.675        0.312
    # 2 b          0.281         0.477      0.0817        0.977       0.745        0.340