Search code examples
rdataframemergedata.tableflatten

How can I merge all dataframes nested in another dataframe and produce all combinations in one result data.frame?


I have data like sampledata reproduced using the code below:

###producing sample data
sampledatain1 <- data.frame(in_1=c(1,2,3,4,5),in_2=c("A","B","C","D","E"),stringsAsFactors = F)
sampledatain2 <- data.frame(in_1=c(6,7,8,9,10),in_2=c("F","G","H","I","J"),stringsAsFactors = F)
sampledatain3 <- data.frame(in_1=c(99,98,97,96,95),in_2=c("Z","Y","X","W","V"),stringsAsFactors = F)
sampledata <- data.frame(row = 1:3,
                    colmerge1 = c("MA","MB","MC"), 
                    colmerge2 = -99:-97,
                    stringsAsFactors = FALSE)
sampledata$df <- list(sampledatain1,sampledatain2,sampledatain3)

I want to flatten all the nested dataframes and combine them so the result looks like desired_data:

###reproducing desired data 
library(data.table)
sampledatain1$row <- 1
sampledatain1$colmerge1 <- "MA"
sampledatain1$colmerge2 <- -99

sampledatain2$row <- 2
sampledatain2$colmerge1 <- "MB"
sampledatain2$colmerge2 <- -98

sampledatain3$row <- 3
sampledatain3$colmerge1 <- "MC"
sampledatain3$colmerge2 <- -97

desired_data <- rbindlist(list(sampledatain1,sampledatain2,sampledatain3))

How can I do this using a function to produce desired_data from sampledata?


Solution

  • The thing you're looking for is unnest() (unnest_longer() then unpack()/unnest_wider() works too):

    > tidyr::unnest(sampledata, df)
    # A tibble: 15 × 5
         row colmerge1 colmerge2  in_1 in_2 
       <int> <chr>         <int> <dbl> <chr>
     1     1 MA              -99     1 A    
     2     1 MA              -99     2 B    
     3     1 MA              -99     3 C    
     4     1 MA              -99     4 D    
     5     1 MA              -99     5 E    
     6     2 MB              -98     6 F    
     7     2 MB              -98     7 G    
     8     2 MB              -98     8 H    
     9     2 MB              -98     9 I    
    10     2 MB              -98    10 J    
    11     3 MC              -97    99 Z    
    12     3 MC              -97    98 Y    
    13     3 MC              -97    97 X    
    14     3 MC              -97    96 W    
    15     3 MC              -97    95 V
    

    To convert it to a data.table, you can use as.data.table()

    For more information on unnest(), read the documentation.