I have a list with 3 elements, each with a different set, and number, of values. I would like to turn this list into a simple two column dataframe.
One column would be the value from the list element, the second column would be the name of the list element itself.
myList <- list(A = c(1, 2, 3),
               B = c(10, 20, 30, 40),
               C = c(100, 200, 300, 400, 500))
So the ideal outcome is something like:
Value List
1 A
2 A
10 B
100 C
......
So I know I can do this with a series of rbinds:
library(magrittr)  # provides %>%
df <- data.frame(Value = myList[["A"]], cluster = "A") %>%
  rbind(data.frame(Value = myList[["B"]], cluster = "B")) %>%
  rbind(data.frame(Value = myList[["C"]], cluster = "C"))
And I can probably clean this up with a loop or lapply...but it seems like there should be a more straightforward way to get this!
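For reference, the loop/lapply cleanup mentioned above might look roughly like the following base R sketch. The column names Value and List are taken from the desired output; the variable names pieces, df and df2 are only illustrative.

```r
myList <- list(A = c(1, 2, 3),
               B = c(10, 20, 30, 40),
               C = c(100, 200, 300, 400, 500))

# Build one small data frame per list element, then bind them all at once
pieces <- lapply(names(myList), function(nm) {
  data.frame(Value = myList[[nm]], List = nm)
})
df <- do.call(rbind, pieces)

# Same idea without any loop: flatten the values and repeat each name
# as many times as its element has values
df2 <- data.frame(Value = unlist(myList, use.names = FALSE),
                  List  = rep(names(myList), lengths(myList)))
```

Both versions avoid naming A, B and C by hand, so they keep working if the list grows.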
If you want to use tidyverse (not sure it can be done with dplyr alone), you can use:
library(magrittr)
tibble::enframe(myList) %>% tidyr::unnest(cols = value)
output
# A tibble: 12 x 2
name value
<chr> <dbl>
1 A 1
2 A 2
3 A 3
4 B 10
5 B 20
6 B 30
7 B 40
8 C 100
9 C 200
10 C 300
11 C 400
12 C 500
First, tibble::enframe(myList) returns a tibble with two columns and three rows. The name column holds the name of each element of your original list, and value is a list-column whose entries are the vectors of values from each list element.
Then, tidyr::unnest(cols = value) just unnests the value column, giving one row per value.
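To see the two steps separately, here is a small sketch that stores the intermediate result; it assumes the tibble and tidyr packages are installed, and the names nested and long are only illustrative.

```r
library(tibble)
library(tidyr)

myList <- list(A = c(1, 2, 3),
               B = c(10, 20, 30, 40),
               C = c(100, 200, 300, 400, 500))

# Step 1: one row per list element; `value` is a list-column
nested <- enframe(myList)
dim(nested)          # 3 rows, 2 columns
class(nested$value)  # "list"

# Step 2: expand the list-column to one row per individual value
long <- unnest(nested, cols = value)
dim(long)            # 12 rows, 2 columns
```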
That said, I do encourage you to consider @akrun's answer, as utils::stack(myList) is considerably faster and less verbose.
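As a rough sketch of that base R route: stack() returns a data frame whose columns are called values and ind, so a rename is needed to match the Value/List layout from the question. The renaming and the conversion of ind from factor to character are optional extras added here, not part of the original suggestion.

```r
myList <- list(A = c(1, 2, 3),
               B = c(10, 20, 30, 40),
               C = c(100, 200, 300, 400, 500))

# stack() flattens a named list of vectors into two columns: values, ind
stacked <- utils::stack(myList)

# Optional: rename to the column names used in the question
result <- setNames(stacked, c("Value", "List"))

# Optional: ind comes back as a factor; convert if plain strings are preferred
result$List <- as.character(result$List)
```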
(Edited to add @Martin Gal's approach using purrr.)
microbenchmark::microbenchmark(
  tidyverse = tibble::enframe(myList) %>% tidyr::unnest(cols = value),
  baseR = utils::stack(myList),
  purrr = purrr::map_df(myList, ~data.frame(value = .x), .id = "id"),
  times = 10000
)
output
Unit: microseconds
expr min lq mean median uq max neval
tidyverse 1937.067 2169.251 2600.4402 2301.1385 2592.7305 77715.238 10000
baseR 144.218 182.112 227.6124 202.0755 230.0960 5476.169 10000
purrr 350.265 417.803 523.7954 455.4410 520.3555 71673.820 10000