
I can't create a tf-idf matrix for my test data using text2vec

I'm following this tutorial and doing it the same way I did for the training set, but it keeps giving the same error. Does anyone know what's wrong with this?

> #Construct sample document-term matrix with the initial vectorizer
> <- itoken(rawsample$Abstract, 
+                     preprocessor = prep_fun, 
+                     tokenizer = tok_fun, 
+                     ids = rawsample$id,
+                     progressbar = F) 
> sample.dtm <- create_dtm (, vectorizer, vtype = "dgTMatrix", progressbar = FALSE)
> sample.tfidf <- TfIdf$new() #define tfidf model
> sample.tfidf <- fit_transform(sample.dtm, tfidf)
Error in fit_transform.Matrix(sample.dtm, tfidf) : 
  inherits(model, "mlapiTransformation") is not TRUE
> sample.tfidf  = create_dtm(, vectorizer, vtype = "dgTMatrix", progressbar = FALSE) %>% 
+   transform(tfidf)
Error in transform.Matrix(., tfidf) : 
  inherits(model, "mlapiTransformation") is not TRUE


  • sample.tfidf <- TfIdf$new() #define tfidf model
    sample.tfidf <- fit_transform(sample.dtm, tfidf)

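    Where do you define tfidf? You assign the new model to sample.tfidf but then pass tfidf to fit_transform, so the model argument is not a TfIdf object. Maybe you need something like: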

    model =  TfIdf$new() #define tfidf model
    sample.tfidf = fit_transform(sample.dtm, model)
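For test data it is usually preferable to reuse the model that was fitted on the training set rather than fitting a new one, so that both matrices share the same idf weights. A minimal sketch of that flow follows. The toy data frames, the names it_train and it_sample, and the choice of tolower and word_tokenizer for prep_fun and tok_fun are assumptions for illustration only, not taken from the question.

```r
library(text2vec)

# Hypothetical toy data standing in for the real training set and rawsample
train <- data.frame(id = 1:3,
                    Abstract = c("cats chase mice", "dogs chase cats", "mice eat cheese"),
                    stringsAsFactors = FALSE)
rawsample <- data.frame(id = 4:5,
                        Abstract = c("dogs eat cheese", "cats eat mice"),
                        stringsAsFactors = FALSE)

# Assumed preprocessing and tokenizing functions
prep_fun <- tolower
tok_fun <- word_tokenizer

# Build the vocabulary and vectorizer from the training data only
it_train <- itoken(train$Abstract, preprocessor = prep_fun, tokenizer = tok_fun,
                   ids = train$id, progressbar = FALSE)
vectorizer <- vocab_vectorizer(create_vocabulary(it_train))
train.dtm <- create_dtm(it_train, vectorizer)

# Keep the model in its own variable and fit it on the training matrix
tfidf <- TfIdf$new()
train.tfidf <- fit_transform(train.dtm, tfidf)

# Test data: same vectorizer, then apply the already fitted model
it_sample <- itoken(rawsample$Abstract, preprocessor = prep_fun, tokenizer = tok_fun,
                    ids = rawsample$id, progressbar = FALSE)
sample.dtm <- create_dtm(it_sample, vectorizer)
sample.tfidf <- transform(sample.dtm, tfidf)
```

The key point is that tfidf must still refer to the TfIdf object when it is passed as the second argument. Overwriting it, or storing the model under a different name such as sample.tfidf, leads to the inherits(model, "mlapiTransformation") error shown in the question.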