Trouble creating Naive Bayes classifier using RWeka


I'm trying to create a Naive Bayes (NB) classifier via RWeka, and it returns a variable type error.

I have the following variables:

dtm_df.train as a data.frame containing the following:

      ask check state
1_10    0     1   bad
1_100   1     0   bad
1_11    2     1  good
1_13    0     0   bad
1_14    0     0  good
1_15    0     1   bad
1_16    0     1  good
1_17    0     0   bad
1_19    0     0   bad
1_2     2     0   bad

and class.formula as a formula containing: state ~ ask + check

When using

NB <- make_Weka_classifier("weka/classifiers/bayes/NaiveBayes")
classifier <- NB(class.formula ~ ., dtm_df.train)

It returns:

Error in model.frame.default(formula = class.formula ~ ., data = dtm_df.train) : object is not a matrix

Converting the data argument dtm_df.train to a matrix does not help, because the classifier function requires a data.frame.
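The first error can be reproduced without RWeka, which suggests it comes from how the formula is written rather than from the data. A minimal sketch in base R, assuming a small stand-in data frame named df (hypothetical, with values copied from the first rows above): because class.formula already holds a complete formula, writing class.formula ~ . appears to make model.frame treat the formula object itself as the response variable.

```r
# Small stand-in for dtm_df.train (hypothetical name 'df', first four rows only).
df <- data.frame(
  ask   = c(0, 1, 2, 0),
  check = c(1, 0, 1, 0),
  state = factor(c("bad", "bad", "good", "bad"))
)
class.formula <- state ~ ask + check

# Nested form from the question: the left-hand side is the formula object itself,
# not a column of the data frame, so building the model frame fails.
nested <- tryCatch(
  model.frame(class.formula ~ ., data = df),
  error = function(e) e
)
inherits(nested, "error")

# Passing the stored formula directly builds the model frame as intended.
mf <- model.frame(class.formula, data = df)
names(mf)
```

Under this reading, the data frame never needs to become a matrix; the stored formula only needs to be passed as the formula argument on its own.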

When passing the formula object directly instead:

classifier <- NB(class.formula, dtm_df.train)

It returns

Error in .jcall(o, "Ljava/lang/Class;", "getClass") : 
  weka.core.UnsupportedAttributeTypeException: weka.classifiers.bayes.NaiveBayes: Cannot handle string class!

Solution

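  • The cause turned out to be the type of the state column in the training set dtm_df.train: it was stored as character, which Weka treats as a string attribute, and NaiveBayes cannot use a string as the class.

    The solution was to convert that column to a factor: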

    dtm_df.train$state <- as.factor(dtm_df.train$state)
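The fix can be sketched end to end as follows. This assumes the training data is rebuilt by hand from the table in the question, with stringsAsFactors = FALSE so that state starts out as character on any R version. The RWeka call is guarded so the type check still runs on a machine without RWeka or Java installed.

```r
# Training data copied from the question; 'state' is deliberately left as character.
dtm_df.train <- data.frame(
  ask   = c(0, 1, 2, 0, 0, 0, 0, 0, 0, 2),
  check = c(1, 0, 1, 0, 0, 1, 1, 0, 0, 0),
  state = c("bad", "bad", "good", "bad", "good",
            "bad", "good", "bad", "bad", "bad"),
  row.names = c("1_10", "1_100", "1_11", "1_13", "1_14",
                "1_15", "1_16", "1_17", "1_19", "1_2"),
  stringsAsFactors = FALSE
)
class.formula <- state ~ ask + check

# Before the fix: a character column, which triggers "Cannot handle string class!".
class(dtm_df.train$state)

# The fix from the answer: make the class column a factor (a nominal attribute in Weka).
dtm_df.train$state <- as.factor(dtm_df.train$state)
class(dtm_df.train$state)

# Fit the classifier only if RWeka (and its Java dependency) can be loaded.
if (requireNamespace("RWeka", quietly = TRUE)) {
  NB <- RWeka::make_Weka_classifier("weka/classifiers/bayes/NaiveBayes")
  classifier <- NB(class.formula, data = dtm_df.train)
  print(classifier)
}
```

Checking str(dtm_df.train) before fitting is a quick way to spot a character class column, for example after reading data with stringsAsFactors = FALSE.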