Suppose I have the following example:
library(text2vec)
library(magrittr)
reviews <- movie_review[1:10,]
vocabInsomnia <- reviews$review %>% itoken(tolower, word_tokenizer, n_chunks = 10) %>%
create_vocabulary %>%
prune_vocabulary(term_count_min = 10, doc_proportion_max = 0.5) %>%
vocab_vectorizer %>%
create_dtm(<output_from_itoken>,<output_from_vocab_vectorizer>)
As you can see, in the very last step of the chain I want to use the outputs of two previous steps as arguments to the create_dtm function. I only know how to feed in the output of the step directly before it, i.e. the output of vocab_vectorizer, but not the output of itoken, which was the first step in the chain. Does magrittr allow this?
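We could create a temporary object using pipeR. Its %>>% operator with the (~ tmp) syntax assigns the piped value to tmp as a side effect and passes the value on unchanged, so tmp is still available at the end of the chain: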
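library(text2vec)
library(pipeR)
library(magrittr)
# same data as in the question
reviews <- movie_review[1:10, ]
reviews$review %>%
  itoken(tolower, word_tokenizer, n_chunks = 10) %>>%
  (~ tmp) %>%   # store the itoken output in tmp and keep piping it
  create_vocabulary %>%
  prune_vocabulary(term_count_min = 10, doc_proportion_max = 0.5) %>%
  vocab_vectorizer %>%
  create_dtm(tmp, .)   # tmp is the iterator, . is the vectorizer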
Output:
10 x 6 sparse Matrix of class "dgCMatrix"
   an so by are he br
1   2  4  .   2  9  8
2   .  1  1   .  .  .
3   1  .  6   7  .  2
4   4  1  3   2  .  4
5   2  .  1   1  .  .
6   .  .  .   .  .  .
7   1  3  .   .  .  .
8   .  1  .   .  2  .
9   .  .  .   .  1  4
10  .  .  .   .  .  2
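If you would rather not depend on pipeR, a similar effect seems achievable with magrittr alone by piping into a braced block and saving the intermediate value there. The following is only a sketch under that assumption: the names it and dtm are placeholders introduced here for illustration, and the rest of the pipeline is the same as in the question.

```r
library(text2vec)
library(magrittr)

reviews <- movie_review[1:10, ]

dtm <- reviews$review %>%
  itoken(tolower, word_tokenizer, n_chunks = 10) %>%
  {
    # Inside the braces, `.` is the itoken output; keep it under a name
    # (`it` is a hypothetical name) so it can be reused further down.
    it <- .
    it %>%
      create_vocabulary() %>%
      prune_vocabulary(term_count_min = 10, doc_proportion_max = 0.5) %>%
      vocab_vectorizer() %>%
      create_dtm(it, .)   # `it` is the iterator, `.` is the vectorizer
  }
```

Because the dot appears explicitly as an argument of create_dtm, magrittr does not also insert the left-hand side as the first argument. The plainest alternative is to assign the itoken result to a variable before starting the chain and refer to it in the last step.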