Search code examples
rdataframedplyrdatatable

How can I transpose dataframe with multiple columns in R?


I'm working on a table like this:

data <- data.frame(Chromosome = c("Chr1", "Chr1", "Chr1", "Chr2","Chr2", "Chr3", "Chr3", "Chr3", "Chr5","Chr5","Chr5", "Chr5", "Chr7", "Chr7"),
               Value = c(150, 325, 666, 121, 111, 325, 777, 123, 325, 150, 666, 444, 32,777),
               Percentaje = c(90, 80,20,0.5, 0.7, 29, 13, 2,3,3,1 ,34, 56, 78))

How can I obtain this table?

data2 <- data.frame(Value = c(150, 325, 666, 121, 111, 777, 123, 444, 32),
                Chr1 = c(90, 80, 20, 0, 0, 0, 0, 0, 0),
                Chr2 = c(0,0,0,0.5,0.7,0,0,0,0),
                Chr3 = c(0, 29, 0, 0, 0, 13, 2, 0, 0),
                Chr5 = c(3, 3, 1, 0,0,0,0,34,0),
                Chr7 = c(0,0,0,0,0,78,0,0,56))

Note that I would like to add the value "0" when some "Value" is missing in a chromosome.


Solution

  • You could use pivot_wider with names_from to your Chromosome column to specify the column names and values_from by Percentage column and fill the NA's with 0 by values_fill like this:

    library(tidyr)
    data %>%
      pivot_wider(names_from = Chromosome, values_from = Percentaje, values_fill = 0)
    #> # A tibble: 9 × 6
    #>   Value  Chr1  Chr2  Chr3  Chr5  Chr7
    #>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    #> 1   150    90   0       0     3     0
    #> 2   325    80   0      29     3     0
    #> 3   666    20   0       0     1     0
    #> 4   121     0   0.5     0     0     0
    #> 5   111     0   0.7     0     0     0
    #> 6   777     0   0      13     0    78
    #> 7   123     0   0       2     0     0
    #> 8   444     0   0       0    34     0
    #> 9    32     0   0       0     0    56
    

    Created on 2022-12-13 with reprex v2.0.2