Tags: r, r-caret, text2vec

I have computed TF-IDF features and want to fit models with the caret package [R]


I have computed TF-IDF features as explained in the text2vec vignette: https://cran.r-project.org/web/packages/text2vec/vignettes/text-vectorization.html#tf-idf

So, the classifier is implemented like this:

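library(glmnet)

# Cross-validated logistic regression on the TF-IDF document-term matrix
glmnet_classifier = cv.glmnet(x = dtm_train_tfidf, y = train[['sentiment']], 
                              family = 'binomial', 
                              # alpha = 1 selects the L1 (lasso) penalty
                              alpha = 1,
                              # choose lambda by cross-validated AUC
                              type.measure = "auc",
                              nfolds = NFOLDS,
                              # looser convergence threshold and fewer iterations for faster training
                              thresh = 1e-3,
                              maxit = 1e3)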

The types of x and y are:

> typeof(dtm_train_tfidf)
[1] "S4"
> typeof(train$sentiment)
[1] "integer"

How can I use a different classifier? For example, with the caret package you would write:

model_svm <- train(x = dtm_train_tfidf, y = train[['sentiment']], method = 'svmRadial')

The problem is that this does not work. Is there any way to use classifiers other than cv.glmnet, for example from the caret package? Is there any connection between these inputs x, y and the caret classifiers? If not, are there other packages like glmnet that can handle this type of input?


Solution

  • The dtm is a sparse matrix in CSC format (class dgCMatrix from the Matrix package), which is why typeof() reports "S4". So look for packages that can take a sparse matrix as input. Alternatively, apply dimensionality reduction (for example LSA) and then feed the resulting dense matrix to caret.
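
The LSA route could look roughly like the sketch below. It is not taken from the answer itself and rests on several assumptions: the movie_review data set bundled with text2vec stands in for the asker's data, only 500 training rows are used, n_topics = 50 and 3-fold cross-validation are arbitrary choices, and the kernlab package is installed (caret needs it for svmRadial). The label is converted to a factor so that caret treats the task as classification rather than regression, and the LSA columns are given names in case the returned matrix has none.

```r
library(text2vec)
library(caret)

# Assumption: the movie_review data shipped with text2vec stands in for the real data.
data("movie_review")
set.seed(42)
train <- movie_review[1:500, ]
test  <- movie_review[501:600, ]

# Build the vocabulary and the sparse document-term matrix (dgCMatrix), as in the vignette.
it_train <- itoken(train$review, preprocessor = tolower,
                   tokenizer = word_tokenizer, ids = train$id,
                   progressbar = FALSE)
it_test  <- itoken(test$review, preprocessor = tolower,
                   tokenizer = word_tokenizer, ids = test$id,
                   progressbar = FALSE)
vectorizer <- vocab_vectorizer(create_vocabulary(it_train))
dtm_train  <- create_dtm(it_train, vectorizer)
dtm_test   <- create_dtm(it_test, vectorizer)

# TF-IDF weighting: fit on the training set, reuse the fitted model on the test set.
tfidf <- TfIdf$new()
dtm_train_tfidf <- fit_transform(dtm_train, tfidf)
dtm_test_tfidf  <- transform(dtm_test, tfidf)

# LSA projects the sparse TF-IDF matrix onto a small number of dense columns.
lsa <- LSA$new(n_topics = 50)  # 50 is an arbitrary choice for this sketch
x_train_lsa <- as.matrix(fit_transform(dtm_train_tfidf, lsa))
x_test_lsa  <- as.matrix(transform(dtm_test_tfidf, lsa))

# Give the columns explicit, matching names for caret.
lsa_names <- paste0("lsa_", seq_len(ncol(x_train_lsa)))
colnames(x_train_lsa) <- lsa_names
colnames(x_test_lsa)  <- lsa_names

# A factor outcome tells caret this is a classification problem.
y_train <- factor(train$sentiment, levels = c(0, 1), labels = c("neg", "pos"))

model_svm <- train(x = x_train_lsa, y = y_train,
                   method = "svmRadial",
                   trControl = trainControl(method = "cv", number = 3))

# Predicted class labels for the held-out documents.
pred <- predict(model_svm, newdata = x_test_lsa)
```

The same dense matrices can be passed to other caret methods by changing the method argument. The number of LSA dimensions trades information loss against training cost, so it is worth tuning on real data.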