r, reshape, data-management

Reshaping k columns into 2 columns representing sequential pairs of the values of the k variables


I have a data frame like this:

id | y1 | y2 | y3 | y4
---+----+----+----+---
a  | 12 | 13 | 14 |
b  | 12 | 18 |    |
c  | 13 |    |    |
d  | 13 | 14 | 15 | 16

I want to reshape it so that I end up with two value columns. The above example would then become:

id | from | to
---+------+---
a  | 12   | 13
a  | 13   | 14
a  | 14   |
b  | 12   | 18
b  | 18   |
c  | 13   |
d  | 13   | 14
d  | 14   | 15
d  | 15   | 16

Each id has a 'from' and a 'to' for each pair of consecutive year values.
Does anybody know of an easy way to do this? I tried using reshape2. I also looked at "Combine Multiple Columns Into Tidy Data", but I think my case is different.


Solution

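  • You can use lapply to loop over each pair of adjacent columns and rbind to stack the results. Rows whose 'from' value is NA are dropped:

    # For each column x from y1 to y3, take (id, column x, column x + 1),
    # keep rows where column x is not NA, and rename to id/from/to.
    do.call(rbind,
            lapply(2:(length(df)-1), 
                   function(x) setNames(df[!is.na(df[,x]),c(1,x,x+1)], 
                                        c("id", "from", "to"))))

    This returns the pairs grouped by column pair rather than by id:

       id from to
    1   a   12 13
    2   b   12 18
    3   c   13 NA
    4   d   13 14
    11  a   13 14
    21  b   18 NA
    41  d   14 15
    12  a   14 NA
    42  d   15 16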
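
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same approach. The data frame is my reconstruction of the table in the question, assuming the blank cells are NA and the values are numeric. The final ordering step is my addition, not part of the answer above; it only rearranges the rows so they are grouped by id as in the requested output.

```r
# Reconstruction of the question's data (assumption: blanks are NA).
df <- data.frame(
  id = c("a", "b", "c", "d"),
  y1 = c(12, 12, 13, 13),
  y2 = c(13, 18, NA, 14),
  y3 = c(14, NA, NA, 15),
  y4 = c(NA, NA, NA, 16),
  stringsAsFactors = FALSE
)

# One block per adjacent column pair (y1-y2, y2-y3, y3-y4),
# keeping only rows whose 'from' value is present.
pairs <- lapply(2:(ncol(df) - 1), function(x) {
  setNames(df[!is.na(df[, x]), c(1, x, x + 1)], c("id", "from", "to"))
})
out <- do.call(rbind, pairs)

# order() leaves ties in their original order, so sorting by id
# keeps the pairs in sequence within each id.
out <- out[order(out$id), ]
rownames(out) <- NULL
out
```

With this input, `out` has nine rows in the same order as the table requested in the question. Note that a value in the last column (16 for id d) never appears as a 'from', which matches the desired output.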