Is there an alternative to tidyr::unnest_wider? It fails when a nested list element is not a vector


I have old code that used tidyr::unnest_wider() to unnest a nested named list into separate columns, but it no longer works. Instead I get an error saying x$name_of_list must be a vector, not a <non-vector> object, where my non-vector objects include <mcpfit> and <patchwork/gg/ggplot> objects. It seems like they tried to address this issue here, but it still doesn't work with tidyr v1.3.0.

I couldn't easily create a reproducible example from my own use case, so I'll use the example from the GitHub issue linked above, in the hope that a fix for it will work for my use case as well.

library(tidyverse)  
  
m <- 
  tibble::as_tibble(mtcars[1,]) %>% 
  mutate(ls_col=list(
    list(
      a=c(1:10), 
      b=lm(cyl~gear))
    )
  )  

m2 <-
  m %>% 
  unnest_wider(ls_col)

I am looking for EITHER an alternative data.table or base R solution OR a tidyverse workaround (e.g., remove the non-vector objects from the nested list and then use tidyr::unnest_wider()). tidyr::unnest() seems to work, but then I don't know how to pivot the column containing the lists into their own columns (R crashes every time I try something like this).


Solution

  • You can specify strict = TRUE.

    library(tidyverse)  
    
    m <- tibble::as_tibble(mtcars[1,]) %>% 
      mutate(ls_col= list(
        list(
          a=c(1:10), 
          b=lm(cyl~gear))
      ))
    
    m %>% 
      unnest_wider(ls_col, strict = TRUE)
    #> # A tibble: 1 x 13
    #>     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb a      b    
    #>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <lis>
    #> 1    21     6   160   110   3.9  2.62  16.5     0     1     4     4 <int>  <lm>
    

    Why?

    The strict argument defaults to FALSE, and in this state, unnest_wider will convert zero-length typed objects like numeric() or character() in your list to NA, which can be helpful in converting lists with zero-length items into a typed column, for example:

    m <- tibble(ls_col = list(list(a = character()), list(a = 1))) 
    
    m %>% unnest_wider(ls_col, strict = FALSE)
    #> # A tibble: 2 x 1
    #>       a
    #>   <dbl>
    #> 1    NA
    #> 2     1
    

    Whereas with strict = TRUE, type is strictly preserved, which means in this case we end up with a list column:

    m %>% unnest_wider(ls_col, strict = TRUE)
    #> # A tibble: 2 x 1
    #>   a        
    #>   <list>   
    #> 1 <chr [0]>
    #> 2 <dbl [1]>
    

    The default strict = FALSE can come in handy in some circumstances, since it helps rearrange complex lists that contain some empty items (as when parsing certain JSON structures). To achieve this, unnest_wider calls vctrs::list_sizes (via the non-exported function elt_to_wide), which throws an error if the list contains non-vector items:

    vctrs:::list_sizes(list(a = 1, b = lm(cyl~gear, mtcars)))
    #> Error in `vctrs:::list_sizes()`:
    #> ! `x$b` must be a vector, not a <lm> object.
    #> Run `rlang::last_trace()` to see where the error occurred.
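
    If you would rather find the offending elements up front (for example, to drop them before unnesting, as the question suggests), here is a minimal sketch. It assumes vctrs::vec_is() applies the same vector check that produces the error above; the list x is a made-up stand-in for one element of ls_col.

    ```r
    library(vctrs)

    # Hypothetical stand-in for one element of ls_col
    x <- list(a = 1:10, b = lm(cyl ~ gear, data = mtcars))

    # TRUE for elements vctrs treats as vectors, FALSE otherwise
    is_vec <- vapply(x, vec_is, logical(1))

    # Keep only the elements that pass the vector check
    x_vectors_only <- x[is_vec]
    names(x_vectors_only)
    ```

    Dropping elements loses the model objects, so this only helps when you do not need them in the result.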
    

    I wouldn't call this behaviour a bug as such, but it's a bit unintuitive and feels like we are using strict = TRUE for a reason other than its design rationale. However, it does work here.

    Created on 2023-08-04 with reprex v2.0.2
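
    Since the question also asks for a base R route, here is a rough sketch that does not call unnest_wider at all. It builds one list column per name with lapply, so nothing is simplified or checked for vector-ness (similar in spirit to the strict = TRUE result). The example data mirrors the one above, except that lm is given data = mtcars explicitly; the loop itself is only an illustration, not something tidyr does internally.

    ```r
    library(tibble)

    m <- as_tibble(mtcars[1, ])
    m$ls_col <- list(list(a = 1:10, b = lm(cyl ~ gear, data = mtcars)))

    # Every name that appears in any element of the list column
    nms <- unique(unlist(lapply(m$ls_col, names)))

    # One list column per name; rows lacking that name get NULL
    for (nm in nms) {
      m[[nm]] <- lapply(m$ls_col, function(el) el[[nm]])
    }

    # Drop the original nested column
    m$ls_col <- NULL

    names(m)
    ```

    Every new column is a list column, so you would still need to simplify the ones that hold plain vectors (for example with unlist or tidyr::unnest) if you want typed columns.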