Extract values and attributes from a list and convert them into a dataframe in R


I have the following list for my model:

List of 9
 $ phi           : num [1:5, 1:1500] 1.8e-04 1.8e-04 1.8e-04 1.8e-04 1.8e-04 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:5] "t_1" "t_2" "t_3" "t_4" ...
  .. ..$ : chr [1:1500] "word1" "word2" "word3" "word4" ...
 $ theta         : num [1:500, 1:5] 0.1234 0.4567 0.01234 0.04567 0.02345 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:500] "1" "2" "3" "4" ...
  .. ..$ : chr [1:5] "t_1" "t_2" "t_3" "t_4" ...
 $ gamma         : num [1:5, 1:1500] 0.20 0.70 0.10 0.1 0.11 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:5] "t_1" "t_2" "t_3" "t_4" ...
  .. ..$ : chr [1:1500] "word1" "word2" "word3" "word4" ...
 $ data          :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. ..@ i       : int [1:10000] 1234 6789 2233 1367 1123 1123 145 145 156 1325 ...
  .. ..@ p       : int [1:1500] 0 1 2 3 4 5 6 7 8 9 ...
  .. ..@ Dim     : int [1:2] 1234 1500
  .. ..@ Dimnames:List of 2
  .. .. ..$ : chr [1:500] "1" "2" "3" "4" ...
  .. .. ..$ : chr [1:1500] "word1" "word2" "word3" "word4" ...
  .. ..@ x       : num [1:100000] 1 1 1 1 1 1 1 1 1 1 ...
  .. ..@ factors : list()
 $ alpha         : Named num [1:5] 0.1 0.1 0.1 0.1  ...
  ..- attr(*, "names")= chr [1:5] "t_1" "t_2" "t_3" "t_4" ...
 $ beta          : Named num [1:1500] 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 ...
  ..- attr(*, "names")= chr [1:1500] "word1" "word2" "word3" "word4"

Is there a way to select $theta together with its attributes and save it as a data frame? In other words, I want to extract this part of the list:

$ theta         : num [1:500, 1:5] 0.1234 0.4567 0.01234 0.04567 0.02345 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:500] "1" "2" "3" "4" ...
  .. ..$ : chr [1:5] "t_1" "t_2" "t_3" "t_4" ...

and have a dataframe that looks like this (the column order does not matter):

Theta  | var1 | var2 |
0.1234 | 1    | t_1  |
0.4567 | 2    | t_2  |
0.01234| 3    | t_3  |

I have tried lapply and many other list-extraction suggestions that I found, but I could not extract the part shown above.

Thanks a lot!


Solution

  • As already mentioned in the comments, you can access $theta with ordinary list subsetting, either model$theta or model[['theta']].

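    $theta is a 500 x 5 numeric matrix whose row and column names are stored in its dimnames attribute. To convert it into the desired long format (one row per matrix cell, with the row name, the column name, and the value), just melt it: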

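    # Pull the 500 x 5 matrix out of the list.
    theta_matrix <- model$theta
    # One row per cell; the cell values go into the "Theta" column.
    theta_df <- reshape2::melt(theta_matrix, value.name = "Theta")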
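
If reshape2 is not available, the same long format can probably be built with base R alone. The sketch below is a self-contained illustration, not the original poster's data: the `model` list is a made-up toy with 3 documents and 2 topics, and the column names `var1` and `var2` are taken from the layout requested in the question. It relies on `as.table()` keeping the dimnames and on `as.data.frame()` emitting one row per cell.

```r
# Toy stand-in for the real model list (values are invented for illustration).
model <- list(
  theta = matrix(
    c(0.7, 0.2, 0.5,   # column t_1
      0.3, 0.8, 0.5),  # column t_2
    nrow = 3,
    dimnames = list(c("1", "2", "3"), c("t_1", "t_2"))
  )
)

# Both forms of list subsetting return the same matrix, dimnames included.
theta_matrix <- model$theta
same_result <- identical(theta_matrix, model[["theta"]])

# as.table() keeps the dimnames; as.data.frame() then yields one row per cell,
# with the dimnames as the first two columns and the values in "Theta".
theta_df <- as.data.frame(as.table(theta_matrix),
                          responseName = "Theta",
                          stringsAsFactors = FALSE)

# Rename the default Var1/Var2 columns to the names used in the question.
names(theta_df)[1:2] <- c("var1", "var2")

print(theta_df)
```

Here `var1` holds the row names (documents) and `var2` the column names (topics). On the real 500 x 5 matrix this should give 2500 rows.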