Search code examples
rdataframecopymultiple-columns

Cross copying in R


I ask for some help to solve the following issue. I have a dataset composed as

from   to
A      B
A      B
C      D
C      D

I want to get the following dataset

from   to
A      B
A      C
C      A
C      D

Basically, after group_by(from), I want a "cross-copying" between the value in the "to" column, between the last value of the "from" column in the first group and the first value of the "from" column in the second group, and so on for each group. I am using "complete" but it is not helpful.


Solution

  • Here is a tidyverse solution:

    library(tidyverse) 
    
    dat <- tibble(
      from = c("A", "A", "C", "C"), 
      to = c("B", "B", "D", "D")
    )
    
    sol <- dat %>% 
      mutate(
        fst = lag(from), # lag `from` for first values
        lst = lead(from) # lead `from` for last values
      ) %>% 
      group_by(from) %>% 
      transmute(
        to = case_when( 
          row_number() == 1 & !is.na(fst) ~ fst, # if first element in group and lagged `from` is not NA then equals lagged `from`
          row_number() == n() & !is.na(lst) ~ lst, # if last element in group and leaded `from` is not NA, then equals leaded `from`
          T ~ to # else `to`
        )
      ) %>% 
      ungroup()
    
    sol
    #> # A tibble: 4 × 2
    #>   from  to   
    #>   <chr> <chr>
    #> 1 A     B    
    #> 2 A     C    
    #> 3 C     A    
    #> 4 C     D