
R error: all arguments must have the same length


I got an error while fitting a naive Bayes model in R. Here are my code and the error:

library(e1071) 

#data

train_data <- read.csv('https://raw.githubusercontent.com/JonnyyJ/data/master/train.csv',header=T)
test_data <- read.csv('https://raw.githubusercontent.com/JonnyyJ/data/master/test.csv',header=T)      

efit <- naiveBayes(y~job+marital+education+default+contact+month+day_of_week+
                        poutcome+age+pdays+previous+cons.price.idx+cons.conf.idx+euribor3m
                       ,train_data)  

pre <- predict(efit, test_data)
bayes_table <- table(pre, test_data[,ncol(test_data)])
accuracy_test_bayes <- sum(diag(bayes_table))/sum(bayes_table)
list('predict matrix'=bayes_table, 'accuracy'=accuracy_test_bayes)

ERROR:

bayes_table <- table(pre, test_data[,ncol(test_data)])
Error in table(pre, test_data[, ncol(test_data)]) : all arguments must have the same length
accuracy_test_bayes <- sum(diag(bayes_table))/sum(bayes_table)
Error in diag(bayes_table) : object 'bayes_table' not found
list('predict matrix'=bayes_table, 'accuracy'=accuracy_test_bayes)
Error: object 'bayes_table' not found

I really don't understand what's going on, as I'm new to R.


Solution

  • predict(efit, test_data) defaults to type = "class", and in this case it does not return one prediction per test row, so table() receives arguments of different lengths. The most likely cause is that y is read in as a number rather than a factor, which leaves the model without class labels to return. You also need to build the table against your outcome column: test_data[,ncol(test_data)] selects the last column, which is euribor3m, not y. The following should work:

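    library(dplyr)  # provides %>%, mutate(), if_else() and pull()

    # type = "raw" returns one probability column per class; the columns
    # are assumed here to be in the order class 0, class 1.
    pre <- predict(efit, test_data, type = "raw") %>%
      as.data.frame() %>%
      setNames(c("p0", "p1")) %>%
      mutate(prediction = if_else(p1 > p0, 1, 0)) %>%
      pull(prediction)
    
    # Compare the predictions with the outcome column y
    bayes_table <- table(pre, test_data$y)
    
    accuracy_test_bayes <- sum(diag(bayes_table)) / sum(bayes_table)
    
    list('predict matrix' = bayes_table, 'accuracy' = accuracy_test_bayes)
    # Output as posted with this answer. Every row is predicted as 0 here,
    # so the accuracy is the share of class 0 (7282 / 8238); your counts may differ.
    # $`predict matrix`
    #    
    # pre    0    1
    #   0 7282  956
    # 
    # $accuracy
    # [1] 0.8839524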
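
One thing to watch with sum(diag(...)): it only counts correct predictions when the table is square and its rows and columns are in the same class order. When one class is never predicted, as in the output above, the table has a single row. Below is a minimal base R sketch of a more defensive version. The vectors actual and pre are made-up toy data standing in for test_data$y and the predictions, and classes is a hypothetical helper name, so treat this as an illustration rather than a drop-in replacement.

```r
# Hypothetical toy data standing in for test_data$y and the predictions.
actual <- c(0, 0, 1, 0, 1, 0)
pre    <- c(0, 0, 0, 0, 0, 0)   # a model that predicts class 0 for every row

# An empty prediction vector reproduces the error message from the question.
msg <- tryCatch(
  table(factor(character(0)), actual),
  error = function(e) conditionMessage(e)
)
print(msg)

# table() needs arguments of equal length, so check before tabulating.
stopifnot(length(pre) == length(actual))

# Give both factors the same levels so the table is always square,
# even when one class never appears among the predictions.
classes <- c(0, 1)
confusion <- table(
  predicted = factor(pre, levels = classes),
  actual    = factor(actual, levels = classes)
)

# Correct predictions sit on the diagonal of the square table.
accuracy <- sum(diag(confusion)) / sum(confusion)

print(confusion)
print(accuracy)
```

With real data you would pass test_data$y and your prediction vector in place of the toy vectors. The length check fails early with a clear message if predict() returned fewer values than there are test rows.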