
Reshaping data from wide to long when the column names encode two variables


I made a minimal reproducible example, because my real data is huge and complicated. This is the example:


fact_1_p_model1 <- c(1, 3, 4, 2, 5)
ra_2_p_model1 <- c(5, 6, 4, 2, 3)
da_1_p_model2 <- c(3, 5, 3, 1, 5)
dd_2_p_model2 <- c(4, 2, 5, 2, 1)
fact_1_p_nonlinearmodel1 <- c(4, 2, 5, 2, 2)
tt_2_p_nonlinearmodel1 <- c(3, 6, 5, 3, 1)
fact_1_p_nonlinearmodel2 <- c(1, 2, 6, 2, 4)
rara_2_p_nonlinearmodel2 <- c(9, 5, 5, 2, 1)
id <- 1:5
data <- data.frame(fact_1_p_model1, ra_2_p_model1, da_1_p_model2, dd_2_p_model2,
                   fact_1_p_nonlinearmodel1, tt_2_p_nonlinearmodel1, fact_1_p_nonlinearmodel2,
                   rara_2_p_nonlinearmodel2, id)

So, currently, I have a dataset like this:

 data
  fact_1_p_model1 ra_2_p_model1 da_1_p_model2 dd_2_p_model2 fact_1_p_nonlinearmodel1
1               1             5             3             4                        4
2               3             6             5             2                        2
3               4             4             3             5                        5
4               2             2             1             2                        2
5               5             3             5             1                        2
  tt_2_p_nonlinearmodel1 fact_1_p_nonlinearmodel2 rara_2_p_nonlinearmodel2 id
1                      3                        1                        9  1
2                      6                        2                        5  2
3                      5                        6                        5  3
4                      3                        2                        2  4
5                      1                        4                        1  5



I want to convert this data to long format with two key columns ("model" and "coef"):


model <- c("model1","model1","model2","model2","nonlinearmodel1","nonlinearmodel1",
           "nonlinearmodel2","nonlinearmodel2")
coef <- c("fact_1_p_","ra_2_p_","da_1_p_","dd_2_p_","fact_1_p_","tt_2_p_","fact_1_p_","rara_2_p_")

#value <- ?? don't know how to... 
#id <- ??

data_long<-data.frame(model,coef
#,value, id
)

If I exclude value and id, it looks like this. I want to include value and id as well, but I do not know how to fill them in.

> data_long
            model      coef
1          model1 fact_1_p_
2          model1   ra_2_p_
3          model2   da_1_p_
4          model2   dd_2_p_
5 nonlinearmodel1 fact_1_p_
6 nonlinearmodel1   tt_2_p_
7 nonlinearmodel2 fact_1_p_
8 nonlinearmodel2 rara_2_p_

With this small dataset, I can do it manually, but with my real, huge data I cannot.

How can I reshape the wide data (shown first) into long data like the table above, with the value and id columns included?


Solution

    library(dplyr)  # provides %>%
    library(tidyr)  # provides pivot_longer()

    # Split each column name right after "_p_" (regex lookbehind):
    # the left part becomes `coef`, the right part becomes `model`.
    data %>%
       pivot_longer(-id, names_to = c('coef', 'model'), names_sep = '(?<=_p_)')

    # A tibble: 40 x 4
          id  coef     model           value
       <int> <chr>     <chr>           <dbl>
     1     1 fact_1_p_ model1              1
     2     1 ra_2_p_   model1              5
     3     1 da_1_p_   model2              3
     4     1 dd_2_p_   model2              4
     5     1 fact_1_p_ nonlinearmodel1     4
     6     1 tt_2_p_   nonlinearmodel1     3
     7     1 fact_1_p_ nonlinearmodel2     1
     8     1 rara_2_p_ nonlinearmodel2     9
     9     2 fact_1_p_ model1              3
    10     2 ra_2_p_   model1              6
    # ... with 30 more rows
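
If tidyr is not available, the same reshape can be sketched in base R. This is only an illustration, not part of the original answer. It assumes every value column contains `_p_` exactly once, with the coefficient label before it and the model name after it. The names `name_pattern` and `value_cols` are made up for this example.

```r
# Rebuild the example data from the question.
data <- data.frame(
  fact_1_p_model1          = c(1, 3, 4, 2, 5),
  ra_2_p_model1            = c(5, 6, 4, 2, 3),
  da_1_p_model2            = c(3, 5, 3, 1, 5),
  dd_2_p_model2            = c(4, 2, 5, 2, 1),
  fact_1_p_nonlinearmodel1 = c(4, 2, 5, 2, 2),
  tt_2_p_nonlinearmodel1   = c(3, 6, 5, 3, 1),
  fact_1_p_nonlinearmodel2 = c(1, 2, 6, 2, 4),
  rara_2_p_nonlinearmodel2 = c(9, 5, 5, 2, 1),
  id                       = 1:5
)

# Assumed naming scheme: everything up to and including "_p_" is the
# coefficient label, the remainder is the model name.
name_pattern <- "^(.*_p_)(.*)$"

# Every column except the identifier holds values to be stacked.
value_cols <- setdiff(names(data), "id")

# Stack the value columns: one row per (id, original column name) pair.
data_long <- data.frame(
  id    = rep(data$id, times = length(value_cols)),
  name  = rep(value_cols, each = nrow(data)),
  value = unlist(data[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Pull the two pieces of each column name out with capture groups.
data_long$coef  <- sub(name_pattern, "\\1", data_long$name)
data_long$model <- sub(name_pattern, "\\2", data_long$name)

# Sort by id (ties keep their original column order) and keep the
# same four columns as the pivot_longer() result.
data_long <- data_long[order(data_long$id), c("id", "coef", "model", "value")]
rownames(data_long) <- NULL

head(data_long, 10)
```

The capture-group pattern plays the same role as the lookbehind in `names_sep`: both cut each name right after `_p_`. If the real column names follow a different scheme, only `name_pattern` should need to change.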