I am currently trying to build a multi-class prediction model to predict a letter out of the 26 letters of the English alphabet. I have built a few models using ANN, SVM, Ensemble, and Naive Bayes, but I am stuck at evaluating the accuracy of these models. Although the confusion matrix shows me the letter-wise true and false predictions, I am only able to get an overall accuracy for each model. Is there a way to evaluate a model's accuracy similar to the ROC and AUC values used for binomial classification?
Note: I am currently running the models using the H2O package as it saves me time.
Once you train a model in H2O, simply running `print(fit)` will show you all the available metrics for that model type. For multiclass models, I'd recommend `h2o.mean_per_class_error()`.
R code example on the iris dataset:
library(h2o)
h2o.init(nthreads = -1)
data(iris)
fit <- h2o.naiveBayes(x = 1:4,
                      y = 5,
                      training_frame = as.h2o(iris),
                      nfolds = 5)
Once you have the model, you can evaluate its performance using the `h2o.performance()` function to view all the metrics:
> h2o.performance(fit, xval = TRUE)
H2OMultinomialMetrics: naivebayes
** Reported on cross-validation data. **
** 5-fold cross-validation on training data (Metrics computed for combined holdout predictions) **
Cross-Validation Set Metrics:
=====================
Extract cross-validation frame with `h2o.getFrame("iris")`
MSE: (Extract with `h2o.mse`) 0.03582724
RMSE: (Extract with `h2o.rmse`) 0.1892808
Logloss: (Extract with `h2o.logloss`) 0.1321609
Mean Per-Class Error: 0.04666667
Hit Ratio Table: Extract with `h2o.hit_ratio_table(<model>,xval = TRUE)`
=======================================================================
Top-3 Hit Ratios:
k hit_ratio
1 1 0.953333
2 2 1.000000
3 3 1.000000
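Since your question is specifically about letter-wise accuracy, note that the same cross-validation metrics object also contains a multinomial confusion matrix whose Error column gives the error rate for each class. A minimal sketch, assuming the `fit` model trained above:

perf_cv <- h2o.performance(fit, xval = TRUE)   # cross-validation metrics object
h2o.confusionMatrix(perf_cv)                   # per-class error rates in the Error column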
Or you can look at a particular metric, like `mean_per_class_error`:
> h2o.mean_per_class_error(fit, xval = TRUE)
[1] 0.04666667
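The other accessors named in the printed output work the same way; a small sketch, where `xval = TRUE` pulls the cross-validated value:

h2o.logloss(fit, xval = TRUE)          # cross-validated log loss
h2o.rmse(fit, xval = TRUE)             # cross-validated RMSE
h2o.hit_ratio_table(fit, xval = TRUE)  # Top-3 hit ratios shown above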
If you want to view performance on a test set, you can do the following:
perf <- h2o.performance(fit, newdata = test)  # `test` is an H2OFrame containing your test data
h2o.mean_per_class_error(perf)
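If you don't already have a separate test frame, one way to create one is `h2o.splitFrame()`. Here is a sketch of the full flow on iris (the 80/20 split ratio and seed are just illustrative):

library(h2o)
h2o.init(nthreads = -1)

iris_hf <- as.h2o(iris)

# illustrative 80/20 train/test split
splits <- h2o.splitFrame(iris_hf, ratios = 0.8, seed = 123)
train  <- splits[[1]]
test   <- splits[[2]]

fit  <- h2o.naiveBayes(x = 1:4, y = 5, training_frame = train)
perf <- h2o.performance(fit, newdata = test)

h2o.mean_per_class_error(perf)  # overall mean per-class error on the test set
h2o.confusionMatrix(perf)       # per-class error breakdown on the test set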