Search code examples
rtidy

Changing wide to long table using tidyr


I have a data frame in the form shown below:

structure(list(y_green = c(4.6, 4.7, 9, 6.7, 6.2, 7.7), y_red = c(8.3, 
9, 10, 8.3, 7.8, 9.1), y_blue = c(4.6, 2.6, 7.5, 6.2, 7, 4.7), 
    CO_green = c(3.5, 3.3, 4.4, 3.9, 4.3, 3.6), CO_red = c(4.2, 
    2.2, 5.5, 2.1, 4.5, 5.8), CO_blue = c(5.9, 5, 5.1, 4.3, 5.4, 
    4.7), age_green = c(18, 18, 29, 19, 19, 23), age_red = c(19, 
    22, 19, 18, 26, 18), age_blue = c(25, 18, 41, 24, 32, 24), 
    gender_green = c("Male", "Male", "Female", "Female", "Male", 
    "Male"), gender_red = c("Male", "Female", "Female", "Female", 
    "Male", "Female"), gender_blue = c("Male", "Female", "Female", 
    "Male", "Female", "Female")), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))



 y_green y_red y_blue CO_green CO_red CO_blue age_green age_red age_blue gender_green gender_red gender_blue
    <dbl> <dbl>  <dbl>    <dbl>  <dbl>   <dbl>     <dbl>   <dbl>    <dbl> <chr>        <chr>      <chr>      
1     4.6   8.3    4.6      3.5    4.2     5.9        18      19       25 Male         Male       Male       
2     4.7   9      2.6      3.3    2.2     5          18      22       18 Male         Female     Female     
3     9    10      7.5      4.4    5.5     5.1        29      19       41 Female       Female     Female     
4     6.7   8.3    6.2      3.9    2.1     4.3        19      18       24 Female       Female     Male       
5     6.2   7.8    7        4.3    4.5     5.4        19      26       32 Male         Male       Female     
6     7.7   9.1    4.7      3.6    5.8     4.7        23      18       24 Male         Female     Female     

and what I want to do, is to change it into long formate in which we have column "y", "CO", "age", "gender" with a "red", "green" and "blue" label. I am not sure how I should do this. Any hint or suggestion?


Solution

  • You could use :

    tidyr::pivot_longer(df, 
                        cols = everything(), 
                        names_to = c('.value', 'color'), 
                        names_sep = '_')
    
    #   col       y    CO   age gender
    #   <chr> <dbl> <dbl> <dbl> <chr> 
    # 1 green   4.6   3.5    18 Male  
    # 2 red     8.3   4.2    19 Male  
    # 3 blue    4.6   5.9    25 Male  
    # 4 green   4.7   3.3    18 Male  
    # 5 red     9     2.2    22 Female
    # 6 blue    2.6   5      18 Female
    # 7 green   9     4.4    29 Female
    # 8 red    10     5.5    19 Female
    # 9 blue    7.5   5.1    41 Female
    #10 green   6.7   3.9    19 Female
    #11 red     8.3   2.1    18 Female
    #12 blue    6.2   4.3    24 Male  
    #13 green   6.2   4.3    19 Male  
    #14 red     7.8   4.5    26 Male  
    #15 blue    7     5.4    32 Female
    #16 green   7.7   3.6    23 Male  
    #17 red     9.1   5.8    18 Female
    #18 blue    4.7   4.7    24 Female