Tags: r, dplyr, tidyverse, magrittr

Use the output of earlier magrittr chain steps as arguments to a later function


Suppose I have the following example:

library(text2vec)
library(magrittr)

reviews <- movie_review[1:10,]

vocabInsomnia <- reviews$review %>% itoken(tolower, word_tokenizer, n_chunks = 10) %>%
    create_vocabulary %>%
    prune_vocabulary(term_count_min = 10, doc_proportion_max = 0.5) %>%
    vocab_vectorizer %>%
    create_dtm(<output_from_itoken>,<output_from_vocab_vectorizer>)

In the last step of the chain, I want to pass the outputs of two earlier steps as arguments to create_dtm. I only know how to pass the output of the step immediately before (here, vocab_vectorizer), not the output of itoken, which was the first step in the chain. Does magrittr allow this?


Solution

  • We could store the itoken output in a temporary object using pipeR's %>>% operator with (~ tmp), then refer to that object in the last step:

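    library(text2vec)
    library(pipeR)
    library(magrittr)

    # Same data as in the question: first 10 rows of movie_review
    reviews <- movie_review[1:10, ]

    reviews$review %>%
      itoken(tolower, word_tokenizer, n_chunks = 10) %>>%
      (~ tmp) %>%   # pipeR: assign the itoken output to tmp and pass it on
      create_vocabulary %>%
      prune_vocabulary(term_count_min = 10, doc_proportion_max = 0.5) %>%
      vocab_vectorizer %>%
      create_dtm(tmp, .)   # tmp is the itoken output, . is the vectorizer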
    

    Output:

    10 x 6 sparse Matrix of class "dgCMatrix"
       an so by are he br
    1   2  4  .   2  9  8
    2   .  1  1   .  .  .
    3   1  .  6   7  .  2
    4   4  1  3   2  .  4
    5   2  .  1   1  .  .
    6   .  .  .   .  .  .
    7   1  3  .   .  .  .
    8   .  1  .   .  2  .
    9   .  .  .   .  1  4
    10  .  .  .   .  .  2
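
As a possible alternative that needs only magrittr, the early result can be captured inside a braced block, which magrittr evaluates with `.` bound to the incoming value. The sketch below uses toy base R functions (`sort`, `toupper`, `paste`) as stand-ins for the text2vec steps, and `first_step` is a hypothetical name chosen for illustration.

```r
library(magrittr)

result <- c("b", "a", "c") %>%
  sort() %>%                  # stand-in for the first step (itoken in the question)
  {
    first_step <- .           # keep the early result under a name
    first_step %>%
      toupper() %>%           # stand-in for the intermediate steps
      paste(first_step, ., sep = "=")   # last step uses both the early and the latest result
  }

print(result)
```

Applied to the pipeline in the question, `first_step` would hold the itoken output and the final call would presumably be `create_dtm(first_step, .)`. Assigning the itoken output to an ordinary variable before starting the chain would achieve the same thing without the braces.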