Tidyverse/Dplyr solution to assigning values to column names extracted from a nested list

I have a tibble with a column containing a nested list (<list<list<double>>> data type to be specific).

It looks something like the following (but in R/Arrow format):

ID	nestedvals
001	`[[1]] (1,0.1) [[2]] (2,0.2) [[3]] (3,0.3) [[4]] (4,0.4) [[5]] (5,0.5)`
002	`[[1]] (1,0.1) [[2]] (2,0.2) [[3]] (3,0.3) [[4]] (4,0.4)`
003	`[[1]] (1,0.1) [[2]] (2,0.2) [[3]] (3,0.3)`
004	`[[1]] (1,0.1) [[2]] (2,0.2)`
005	`[[1]] (1,0.1)`

Where if I call row 1 of nestedvals I get:

tibble$nestedvals[1]
<list<list<double>>[1]>
[[1]]
<list<double>[5]>
[[1]]
[1] 1   0.1

[[2]]
[1] 2   0.2

[[3]]
[1] 3   0.3

[[4]]
[1] 4   0.4

[[5]]
[1] 5   0.5

Basically, for the nestedvals column, there is a list containing a list of pairs of doubles, with the first indicating a specific index (e.g. 5), and a certain value (e.g. 0.5).

What I would like to do is generate set of zero-filled columns based on the range of unique indexes of each nested list. e.g.: col_1, col_2, col_3, col_4, col_5

And then replace each 0 with the value (second number in the nested list), based on the index (first number in each nested list), for each row of the tibble.

I believed the best way to do this involves unlisting the variables and making separate columns with the lists of indexes and lists of values of interest, both so I can find the maximum for name generation in the former and for assignment between to two.

To accomplish this, I wrote a function to split each nested list:

  nestsplit <- function(x, y) {
    unlist(lapply(x, `[[`, y))
  }

And then generate unique columns with the column names (by index) and values of interest to append to the tibble:

  tibble <-
    tibble |> rowwise() |> mutate(index_names = list(paste0(
      "col_", as.character(nestsplit(nestedvals, 1))
    )),
    index_values = list(nestsplit(nestedvals, 2)))

But I would like to see if there is an efficient, row-wise, tidyverse/dplyr-based solution to assign the values in index_values to index-based column names using the information in the index_names variable, rather than writing a loop to assign each of them by row.

So that an output like this:

ID	nestedvals	col_1	col_2	col_3	col_4	col_5
001	`<Nested list of 5 pairs of values>`	0	0	0	0	0
002	`<Nested list of 4 pairs of values>`	0	0	0	0	0
003	`<Nested list of 3 pairs of values>`	0	0	0	0	0
004	`<Nested list of 2 pairs of values>`	0	0	0	0	0
005	`<Nested list of 1 pair of values>`	0	0	0	0	0

Instead looks like the following:

ID	nestedvals	col_1	col_2	col_3	col_4	col_5
001	`<Nested list of 5 pairs of values>`	0.1	0.2	0.3	0.4	0.5
002	`<Nested list of 4 pairs of values>`	0.1	0.2	0.3	0.4	0
003	`<Nested list of 3 pairs of values>`	0.1	0.2	0.3	0	0
004	`<Nested list of 2 pairs of values>`	0.1	0.2	0	0	0
005	`<Nested list of 1 pair of values>`	0.1	0	0	0	0

To generate some example data for above, please use:

tibble <-
  structure(
    list(
      ID = c(001, 002, 003, 004, 005),
      nestedvals = structure(
        list(
          structure(
            list(c(1, 0.1), c(2, 0.2), c(3, 0.3), c(4, 0.4), c(5, 0.5)),
            class = c("arrow_list", "vctrs_list_of", "vctrs_vctr", "list"),
            ptype = numeric(0)
          ),
          structure(
            list(c(1, 0.1), c(2, 0.2), c(3, 0.3), c(4, 0.4)),
            class = c("arrow_list", "vctrs_list_of", "vctrs_vctr", "list"),
            ptype = numeric(0)
          ),
          structure(
            list(c(1, 0.1), c(2, 0.2), c(3, 0.3)),
            class = c("arrow_list", "vctrs_list_of", "vctrs_vctr", "list"),
            ptype = numeric(0)
          ),
          structure(
            list(c(1, 0.1), c(2, 0.2)),
            class = c("arrow_list", "vctrs_list_of", "vctrs_vctr", "list"),
            ptype = numeric(0)
          ),
          structure(
            list(c(1, 0.1)),
            class = c("arrow_list", "vctrs_list_of", "vctrs_vctr", "list"),
            ptype = numeric(0)
          )
        ),
        ptype = structure(
          list(),
          class = c("arrow_list", "vctrs_list_of", "vctrs_vctr", "list"),
          ptype = numeric(0)
        ),
        class = c("arrow_list", "vctrs_list_of", "vctrs_vctr", "list")
      )
    ),
    row.names = c(NA, -5L),
    class = c("tbl_df", "tbl", "data.frame")
  )

Solution

tibble%>%
   unnest_longer(nestedvals) %>%
   unnest_wider(nestedvals, names_sep = '_') %>%
   pivot_wider(names_from = nestedvals_1, names_prefix = 'col_',
               values_from = nestedvals_2, values_fill = 0)

# A tibble: 5 × 6
     ID col_1 col_2 col_3 col_4 col_5
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1   0.1   0.2   0.3   0.4   0.5
2     2   0.1   0.2   0.3   0.4   0  
3     3   0.1   0.2   0.3   0     0  
4     4   0.1   0.2   0     0     0  
5     5   0.1   0     0     0     0

Note that in the case the elements are indexed 1:n,like in your example just use unnest_wider:

unnest_wider(tibble, nestedvals, names_sep = '_', transform = ~.x[2])

# A tibble: 5 × 6
     ID nestedvals_1 nestedvals_2 nestedvals_3 nestedvals_4 nestedvals_5
  <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
1     1          0.1          0.2          0.3          0.4          0.5
2     2          0.1          0.2          0.3          0.4         NA  
3     3          0.1          0.2          0.3         NA           NA  
4     4          0.1          0.2         NA           NA           NA  
5     5          0.1         NA           NA           NA           NA

You can easily replace the NA with 0:

tibble %>%
  unnest_wider(nestedvals, names_sep = '_', transform = ~.x[2]) %>%
  mutate(across(starts_with('nestedvals'), ~replace_na(., 0)))

# A tibble: 5 × 6
     ID nestedvals_1 nestedvals_2 nestedvals_3 nestedvals_4 nestedvals_5
  <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
1     1          0.1          0.2          0.3          0.4          0.5
2     2          0.1          0.2          0.3          0.4          0  
3     3          0.1          0.2          0.3          0            0  
4     4          0.1          0.2          0            0            0  
5     5          0.1          0            0            0            0