Search code examples
rtidyversetranspose

R tidyverse: transposing but keeping one column intact


I am aware there are numerous posts about this topic. However, I chose to make a new post because

  1. Many solutions were in reshape2, but I am trying to minimize the number of packages used in this project. So I would strongly prefer tidyverse solution
  2. Lack of my ability to understand and change the other posts' solution appropriately.

Thank you for understanding.

Suppose I have a dataset:

test.dat <- tibble(pair = rep(c("1 v 2","1 v 3", "1 v 4" ,"2 v 3", "2 v 4", "3 v 4"),2),
                   name = c(rep("Louis",6),rep("Paul",6)),
                   Case1 = rpois(12,5),
                   Case2 = rpois(12,6),
                   Case3 = rpois(12,2),
                   Case4 = rpois(12,1)
> head(test.dat,2)
# A tibble: 2 x 6
  pair  name  Case1 Case2 Case3 Case4
  <chr> <chr> <int> <int> <int> <int>
1 1 v 2 Louis    11     7     0     0
2 1 v 3 Louis     3     9     3     1

and my goal is to transpose while keeping name column intact.

test.want <- tibble(Case= rep(c("Case1","Case2","Case3","Case4"),2),
                    name = c(rep("Louis",4),rep("Paul",4)),
                    `1 v 2` = NA,
                    `1 v 3` = NA,
                    `1 v 4` = NA,
                    `2 v 3` = NA,
                    `2 v 4` = NA,
                    `3 v 4` = NA)
> head(test.want,2)
# A tibble: 2 x 8
  Case  name  `1 v 2` `1 v 3` `1 v 4` `2 v 3` `2 v 4` `3 v 4`
  <chr> <chr> <lgl>   <lgl>   <lgl>   <lgl>   <lgl>   <lgl>  
1 Case1 Louis NA      NA      NA      NA      NA      NA     
2 Case2 Louis NA      NA      NA      NA      NA      NA   

From the other posts about using tidyverse, the solution seems to be using pivot_longer() then pivot_wider afterwards. However, I can't go beyond this section.

test <- test.dat %>% 
  pivot_longer(cols=starts_with("Case"))   #add names_repair argument?

Error: Failed to create output due to bad names.
* Choose another strategy with `names_repair`

Solution

  • The issue would be that there is a clash between the name column and the default one created with pivot_longer and this would cause duplication of "name" column. Specify the names_to in pivot_longer as "Case" to avoid the duplication

    library(dplyr)
    library(tidyr)
    test.dat %>%
      pivot_longer(cols = starts_with("Case"), names_to = "Case") %>% 
      pivot_wider(names_from = pair, values_from = value)
    

    -output

    # A tibble: 8 x 8
    #  name  Case  `1 v 2` `1 v 3` `1 v 4` `2 v 3` `2 v 4` `3 v 4`
    #  <chr> <chr>   <int>   <int>   <int>   <int>   <int>   <int>
    #1 Louis Case1       3       3      11       4       8       2
    #2 Louis Case2       5       7       4       6       4       5
    #3 Louis Case3       3       1       0       2       4       2
    #4 Louis Case4       1       0       0       2       1       1
    #5 Paul  Case1       5       2       5       4       5       3
    #6 Paul  Case2       6       6       2       1       9       4
    #7 Paul  Case3       0       3       2       2       2       2
    #8 Paul  Case4       0       0       1       1       0       0