r, machine-learning, classification, naivebayes

Trying to implement a simple Naive Bayes classifier using "bnlearn". Keep getting error "variables must be either numeric, factors or ordered factors"


I'm trying to implement a Naive Bayes classifier in R by reproducing results from a dataset I was given. Right now I'm simply testing on the training data itself, to see what the accuracy is like.

There are 29 variables in the dataset, one of which is called "Status". It has two values, Win and Lose. I've split the training data into roughly 2/3 training, 1/3 testing. The goal is to measure how accurately Status is predicted as Win or Lose.

I think I understand the error, insofar as "Win" and "Lose" aren't numeric values, but as I understand it, would they not be factors? I'll post my code below. I'm using the bnlearn examples from http://www.bnlearn.com/documentation/man/naive.bayes.html as my basis for this. If there are better examples out there, please let me know.

#Load bnlearn, which provides naive.bayes() and bn.fit()
library(bnlearn)

#Read in training data
trainingdata <- read.csv("C:\\.....filepath.csv", header = TRUE)

#Split data into training and test sets
training.set = trainingdata[1:1200, ]
test.set = trainingdata[1201:1860, ]

#Train model
bn = naive.bayes(training.set, "Status")
fitted = bn.fit(bn, training.set)

#Predict on the held-out rows and tabulate against the true Status
pred = predict(fitted, test.set)
table(pred, test.set[, "Status"])

The error first appears at the bn = naive.bayes(training.set, "Status") line. The specific error says "Error in data.type(x) : variables must be either numeric, factors or ordered factors".

Is there a way I can get bnlearn to recognise that "Status" is a factor?


Solution

  • Coming back to this years later, as I realised I never updated it with the answer and no one has posted a solution for me to accept.

    Cotton.Rockwood's assumption was correct. With bnlearn's naive Bayes classifier, all the variables need to be factors, not just "Status"; otherwise you will encounter this error.
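
For anyone landing here with the same error, here is a minimal sketch of the fix described above. The data frame `toy` and its columns `Venue` and `Round` are hypothetical stand-ins for the real 29-column dataset, so adapt the names to your own data. The idea is to check each column's class, convert every column to a factor before splitting into training and test sets (so both halves share the same levels), and only then call bnlearn. The bnlearn part is guarded so the conversion itself can be run even without the package installed.

```r
# Hypothetical toy data standing in for the real dataset (assumption).
toy <- data.frame(
  Status = c("Win", "Lose", "Win", "Lose", "Win", "Lose"),
  Venue  = c("Home", "Away", "Home", "Home", "Away", "Away"),
  Round  = c(1L, 2L, 1L, 2L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Diagnose: list the class of every column to see which are not factors.
print(sapply(toy, class))

# Convert every column to a factor, whatever its original type.
# Doing this before the train/test split keeps factor levels consistent.
toy[] <- lapply(toy, as.factor)

# Confirm that every column is now a factor.
all_factors <- all(vapply(toy, is.factor, logical(1)))
print(all_factors)

# Fit and predict only if bnlearn is available.
if (requireNamespace("bnlearn", quietly = TRUE)) {
  bn     <- bnlearn::naive.bayes(toy, training = "Status")
  fitted <- bnlearn::bn.fit(bn, toy)
  pred   <- predict(fitted, toy)
  print(table(pred, toy$Status))
  print(mean(pred == toy$Status))  # accuracy on the same rows used for fitting
}
```

One thing to watch: `as.factor` on a genuinely continuous column creates one level per distinct value, which is rarely what you want, so such columns probably need binning first. Also note that since R 4.0, `read.csv` leaves text columns as character unless you pass `stringsAsFactors = TRUE`, which is one common way to end up with non-factor columns.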