r  dataframe  dplyr  tidyverse  tibble

List of tibbles as a column in a data.frame


I want to create a column that is a list of tibbles (with different numbers of rows). The straightforward way fails. Example:

> x <- data.frame('a' = 1:2, 
+                 'b' = list(tibble('c' = 1:4, 'd' = 1:4),
+                            tibble('c' = 1:3, 'd' = 1:3)))
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 4, 3

I can avoid the error by wrapping the list in I(). However, when I do so, unnest() no longer expands the column:

> x <- data.frame('a' = 1:2, 
+                 'b' = I(list(tibble('c' = 1:4, 'd' = 1:4),
+                            tibble('c' = 1:3, 'd' = 1:3))))
> x %>% unnest(cols = b) 
# A tibble: 2 x 2
      a b               
  <int> <I<list>>       
1     1 <tibble [4 x 2]>
2     2 <tibble [3 x 2]>

How can I create a column that is a list of tibbles and that I can later unnest?


Solution

  • It's much easier to create list columns with tibbles than with data.frames (see, e.g., Hadley's note on this).

    You can fix your code by switching from data.frame() to tibble():

    library(dplyr)
    
    x <- tibble(
      'a' = 1:2,
      'b' = list(
        tibble('c' = 1:4, 'd' = 1:4),
        tibble('c' = 1:3, 'd' = 1:3)
      )
    )
    
    x
    #> # A tibble: 2 × 2
    #>       a b               
    #>   <int> <list>          
    #> 1     1 <tibble [4 × 2]>
    #> 2     2 <tibble [3 × 2]>
    
    x %>% tidyr::unnest(b)
    #> # A tibble: 7 × 3
    #>       a     c     d
    #>   <int> <int> <int>
    #> 1     1     1     1
    #> 2     1     2     2
    #> 3     1     3     3
    #> 4     1     4     4
    #> 5     2     1     1
    #> 6     2     2     2
    #> 7     2     3     3
    

    Created on 2022-03-31 by the reprex package (v2.0.1)
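
If the object has to stay a base data.frame, one possible workaround (a sketch, not part of the answer above) is to build the ordinary columns first and then assign the list column with `$<-`, which stores a plain list without the "AsIs" class that I() adds. The object names `x`, `y`, and `z` below are only illustrative, and the last step assumes that dropping the class from an existing I() column with unclass() is acceptable for your data.

```r
library(tibble)
library(tidyr)

# Build the ordinary columns first; passing a bare list to data.frame()
# would be interpreted as several columns rather than one list column.
x <- data.frame(a = 1:2)

# Assigning a list with one element per row creates a list column.
x$b <- list(
  tibble(c = 1:4, d = 1:4),
  tibble(c = 1:3, d = 1:3)
)

# The column is a plain list (no "AsIs" class), so unnest() can expand it.
y <- unnest(x, cols = b)

# An existing I() column can be turned back into a plain list the same way.
z <- data.frame(a = 1:2, b = I(list(tibble(c = 1:4), tibble(c = 1:3))))
z$b <- unclass(z$b)
```

Here `y` should contain one row per row of the nested tibbles, with `a` repeated alongside `c` and `d`. Switching to tibble() as shown above remains the simpler route when nothing requires a base data.frame.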