R - Nested list to tibble

I have a nested list like so:

> ex <- list(list(c("This", "is", "an", "example", "."), c("I", "really", "hate", "examples", ".")), list(c("How", "do", "you", "feel", "about", "examples", "?")))
> ex
[[1]]
[[1]][[1]]
[1] "This"    "is"      "an"      "example" "."      

[[1]][[2]]
[1] "I"        "really"   "hate"     "examples" "."       


[[2]]
[[2]][[1]]
[1] "How"      "do"       "you"      "feel"     "about"    "examples" "?"

I want to convert it to a tibble like so:

> tibble(d_id = as.integer(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2)),
+        s_id = as.integer(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1)),
+        t_id = as.integer(c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7)),
+        token = c("This", "is", "an", "example", ".", "I", "really",
+                  "hate", "examples", ".", "How", "do", "you", "feel", "about", "examples", "?"))
# A tibble: 17 x 4
    d_id  s_id  t_id token   
   <int> <int> <int> <chr>   
 1     1     1     1 This    
 2     1     1     2 is      
 3     1     1     3 an      
 4     1     1     4 example 
 5     1     1     5 .       
 6     1     2     1 I       
 7     1     2     2 really  
 8     1     2     3 hate    
 9     1     2     4 examples
10     1     2     5 .       
11     2     1     1 How     
12     2     1     2 do      
13     2     1     3 you     
14     2     1     4 feel    
15     2     1     5 about   
16     2     1     6 examples
17     2     1     7 ?

What is the most efficient way to do this, preferably using tidyverse functionality?

Solution

Time to get some sequences working with rep() and sequence(), which are vectorized and should be very efficient:

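# Document index, repeated once per sentence in that document
d_id <- rep(seq_along(ex), lengths(ex))
# Sentence index, restarting at 1 within each document
s_id <- sequence(lengths(ex))
# Number of tokens in each sentence (flatten one level, then count)
t_id <- lengths(unlist(ex, recursive = FALSE))

data.frame(
  d_id  = rep(d_id, t_id),  # expand each index to one row per token
  s_id  = rep(s_id, t_id),
  t_id  = sequence(t_id),   # token index, restarting at 1 per sentence
  token = unlist(ex),
  stringsAsFactors = FALSE  # keep token as character on R < 4.0
)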

#   d_id s_id t_id    token
#1     1    1    1     This
#2     1    1    2       is
#3     1    1    3       an
#4     1    1    4  example
#5     1    1    5        .
#6     1    2    1        I
#7     1    2    2   really
#8     1    2    3     hate
#9     1    2    4 examples
#10    1    2    5        .
#11    2    1    1      How
#12    2    1    2       do
#13    2    1    3      you
#14    2    1    4     feel
#15    2    1    5    about
#16    2    1    6 examples
#17    2    1    7        ?
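
To see why the indices line up, it can help to print the intermediate vectors for the small ex list from the question. The sketch below only re-runs the same steps one at a time; the names n_sent and n_tok are introduced here for illustration and are not part of the original answer.

```r
ex <- list(
  list(c("This", "is", "an", "example", "."),
       c("I", "really", "hate", "examples", ".")),
  list(c("How", "do", "you", "feel", "about", "examples", "?"))
)

n_sent <- lengths(ex)                             # sentences per document
n_sent
# [1] 2 1

d_id <- rep(seq_along(ex), n_sent)                # one entry per sentence
d_id
# [1] 1 1 2

s_id <- sequence(n_sent)                          # restarts at 1 per document
s_id
# [1] 1 2 1

n_tok <- lengths(unlist(ex, recursive = FALSE))   # tokens per sentence
n_tok
# [1] 5 5 7

t_idx <- sequence(n_tok)                          # token index per sentence
t_idx
# [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 6 7
```

Repeating d_id and s_id by n_tok then stretches the per-sentence indices to one row per token, which is what the rep(..., t_id) calls in the data.frame() do.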

The data.frame() approach above runs in about 2 seconds for a 500K sample of your ex list. I suspect that will be hard to beat in terms of efficiency.
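
Since the question asks for a tibble, here is a minimal sketch of the same idea wrapped in a function that returns one. It assumes the tibble package is installed; nested_to_tibble is a hypothetical helper name, not something from tidyverse, and it assumes every document is a list of character vectors.

```r
library(tibble)

# Hypothetical helper: flattens a list of documents, each a list of
# character vectors (sentences), into one row per token.
nested_to_tibble <- function(x) {
  n_sent <- lengths(x, use.names = FALSE)   # sentences per document
  n_tok  <- lengths(unlist(x, recursive = FALSE), use.names = FALSE)  # tokens per sentence
  tibble(
    d_id  = rep(rep(seq_along(x), n_sent), n_tok),
    s_id  = rep(sequence(n_sent), n_tok),
    t_id  = sequence(n_tok),
    token = unlist(x, use.names = FALSE)
  )
}

ex <- list(
  list(c("This", "is", "an", "example", "."),
       c("I", "really", "hate", "examples", ".")),
  list(c("How", "do", "you", "feel", "about", "examples", "?"))
)

res <- nested_to_tibble(ex)
res
```

All three id columns come out as integer because seq_along() and sequence() return integers. To check the timing on your own machine, you could wrap a call on a replicated list in system.time(), for example system.time(nested_to_tibble(rep(ex, 1e5))).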