Search code examples
rleft-joinnested-loopslapply

Left join from list of data frames in R


I have some datasets like this:

data_201401 <- data.frame(ID = c(123, 234, 345, 456), Block_Code = c("D", "U", "Z", "G"),
                     DPP = c(1,2,3,4))

data_201402 <- data.frame(ID = c(123, 234, 345, 456, 678), Block_Code = c("Z", "G", "T", "D", "U" ),
                     DPP = c(2,3,1,4,5))

data_201403 <- data.frame(ID = c(123, 234, 345, 456,678,124), 
                     Block_Code = c("U", "Z","G","T", "D","R"),DPP = c(6,2,2,4,5,6))

data_201404 <- data.frame(ID = c(123, 234, 345, 456, 678, 124, 567), 
                     Block_Code = c("D", "U", "Z", "G","T","R","Z"),DPP = c(1,2,3,4,5,6,7))

data_201405 <- data.frame(ID = c(123, 234, 345, 456, 678, 124, 567, 256), 
                     Block_Code = c("D", "U", "Z","G","T","R","Z","U"), DPP = c(1,2,3,4,5,6,7,8))

data_201406 <- data.frame(ID = c(123, 234, 345, 456, 678, 124, 567, 256, 345), 
                     Block_Code = c("D", "U","Z", "G","T","R","Z","U", "Z"),
                     DPP = c(1,2,3,4,5,6,7,8,2))
data_201407 <- data.frame(ID = c(123, 234, 345, 456,  678, 124, 567, 256, 345, 647), 
                     Block_Code = c("D", "U", "Z", "G","T","R","Z","U", "Z","S"), 
                     DPP = c(1,2,3,4,5,6,7,8,2,12))

and so one until data_201506.

I want data_201401 will be merge with 12 datasets after that, data_201402 until data_201501. I want data_201402 will be merge with 12 datasets after that, data_201403 until data_201502. I want data_201403 will be merge with 12 datasets after that, data_201404 until data_201503. I want data_201404 will be merge with 12 datasets after that, data_201405 until data_201504. I want data_201405 will be merge with 12 datasets after that, data_201406 until data_201505. I want data_201406 will be merge with 12 datasets after that, data_201407 until data_201506.

After data_201404 until data_201406 done for each merge. so there are same 13 columns like Block_Code. I want also rename BLock_code like the initial dataset is still same 'Block_Code' but for 12 dataframes that merge be Block_Code_1, Block_Code_2, ..., Block_Code_12.

Can someone help me to do that in R. I dont know it will be work like loop and nested looping

enter image description here

Maybe from the picture will be give a view from i mean

enter image description here


Solution

  • Here is a way.
    Get the data sets in a list with mget. Loop through the list and join them with merge, Reduce'ing the results to one data set. The results have ndatasets Block_Codes, as asked.

    data_list <- mget(ls(pattern = "data_201"))
    
    ndatasets <- 3L
    
    res_list <- lapply(seq_along(data_list)[-(1:2)], \(inx) {
      i <- inx - ndatasets:1 + 1L
      out <- Reduce(\(x, y) merge(x, y, by = "ID"), data_list[i])
      # take care of the column names
      icols <- grep("Block_Code", names(out))
      names(out)[icols] <- paste0("Block_Code_", seq_along(icols))
      icols <- grep("DPP", names(out))
      names(out)[icols] <- paste0("DPP_", seq_along(icols))
      out
    })
    
    res_list
    #> [[1]]
    #>    ID Block_Code_1 DPP_1 Block_Code_2 DPP_2 Block_Code_3 DPP_3
    #> 1 123            D     1            Z     2            U     6
    #> 2 234            U     2            G     3            Z     2
    #> 3 345            Z     3            T     1            G     2
    #> 4 456            G     4            D     4            T     4
    #> 
    #> [[2]]
    #>    ID Block_Code_1 DPP_1 Block_Code_2 DPP_2 Block_Code_3 DPP_3
    #> 1 123            Z     2            U     6            D     1
    #> 2 234            G     3            Z     2            U     2
    #> 3 345            T     1            G     2            Z     3
    #> 4 456            D     4            T     4            G     4
    #> 5 678            U     5            D     5            T     5
    #> 
    #> [[3]]
    #>    ID Block_Code_1 DPP_1 Block_Code_2 DPP_2 Block_Code_3 DPP_3
    #> 1 123            U     6            D     1            D     1
    #> 2 124            R     6            R     6            R     6
    #> 3 234            Z     2            U     2            U     2
    #> 4 345            G     2            Z     3            Z     3
    #> 5 456            T     4            G     4            G     4
    #> 6 678            D     5            T     5            T     5
    #> 
    #> [[4]]
    #>    ID Block_Code_1 DPP_1 Block_Code_2 DPP_2 Block_Code_3 DPP_3
    #> 1 123            D     1            D     1            D     1
    #> 2 124            R     6            R     6            R     6
    #> 3 234            U     2            U     2            U     2
    #> 4 345            Z     3            Z     3            Z     3
    #> 5 345            Z     3            Z     3            Z     2
    #> 6 456            G     4            G     4            G     4
    #> 7 567            Z     7            Z     7            Z     7
    #> 8 678            T     5            T     5            T     5
    #> 
    #> [[5]]
    #>     ID Block_Code_1 DPP_1 Block_Code_2 DPP_2 Block_Code_3 DPP_3
    #> 1  123            D     1            D     1            D     1
    #> 2  124            R     6            R     6            R     6
    #> 3  234            U     2            U     2            U     2
    #> 4  256            U     8            U     8            U     8
    #> 5  345            Z     3            Z     3            Z     3
    #> 6  345            Z     3            Z     3            Z     2
    #> 7  345            Z     3            Z     2            Z     3
    #> 8  345            Z     3            Z     2            Z     2
    #> 9  456            G     4            G     4            G     4
    #> 10 567            Z     7            Z     7            Z     7
    #> 11 678            T     5            T     5            T     5
    

    Created on 2022-10-02 with reprex v2.0.2