Search code examples
rdata.tabletidyversetidyr

How to split elements separated by underscore in columns names and create


I have the following dataset. As you could see, the columns names are composed by three elements. I would like to separate each of the element separated by an underscore:

structure(list(post_name_mo_score = c("3433243juhy234_2323526", "3433243juhy234_2323526", 
"3433243juhy234_2323526"), minna_yout_qu_mai = c("3433243juhy234_2323526", 
"3433243juhy234_2323526", "3433243juhy234_2323526"), tour_sima_ta_tup = c("3433243juhy234_2323526", 
"3433243juhy234_2323526", "3433243juhy234_2323526")), class = "data.frame", row.names = c(NA, 
-3L))

in a way that all the first, second and thirs elements will go each of them under a new column (i.e. column1 = post, minna, tour, column2 = name, yout, sima, column3 = mo, qu, ta)

Expected output

clm1   clm2  clm3  score                    mai                      tup
post   name  mo    3433243juhy234_2323526   3433243juhy234_2323526 3433243juhy234_2323526 
minna  yout  qu    3433243juhy234_2323526   3433243juhy234_2323526 3433243juhy234_2323526 
tour   sima  ta    3433243juhy234_2323526   3433243juhy234_2323526 3433243juhy234_2323526 

Can anyone suggest something functional for this? Thanks


Solution

  • tidyr does this well:

    library(tidyr)
    pivot_longer(dat, everything(), values_to = "clm4") |>
      separate(name, into = c("clm1", "clm2", "clm3"), sep = "_")
    # # A tibble: 9 × 4
    #   clm1  clm2  clm3  clm4                  
    #   <chr> <chr> <chr> <chr>                 
    # 1 post  name  mo    3433243juhy234_2323526
    # 2 minna yout  qu    3433243juhy234_2323526
    # 3 tour  sima  ta    3433243juhy234_2323526
    # 4 post  name  mo    3433243juhy234_2323526
    # 5 minna yout  qu    3433243juhy234_2323526
    # 6 tour  sima  ta    3433243juhy234_2323526
    # 7 post  name  mo    3433243juhy234_2323526
    # 8 minna yout  qu    3433243juhy234_2323526
    # 9 tour  sima  ta    3433243juhy234_2323526
    

    Data

    dat <- structure(list(post_name_mo = c("3433243juhy234_2323526", "3433243juhy234_2323526", "3433243juhy234_2323526"), minna_yout_qu = c("3433243juhy234_2323526", "3433243juhy234_2323526", "3433243juhy234_2323526"), tour_sima_ta = c("3433243juhy234_2323526", "3433243juhy234_2323526", "3433243juhy234_2323526")), class = "data.frame", row.names = c(NA, -3L))