Tags: r, machine-learning, neural-network

Single Value Accuracy for Neural Network (NeuralNet) in R


I have a multi-layer neural network that uses two variables from the dataset (Alcohol and Malic.Acid).

My Code

#Loading the neuralnet package, which provides neuralnet() and compute()
library(neuralnet)

#Reading in the wine data from last week's labs
winedata = read.csv('/Users/ali/Documents/CS3002/Lab2/winedata2.csv', header = TRUE, sep=",")

#Setting up my test and train data
winevaluesTrain = winedata[1:65,2:3]
wineclassesTrain = winedata[1:65,1]
winevaluesTest = winedata[66:130,2:3]
wineclassesTest = winedata[66:130,1]

#normalize
scaledtrain <- as.data.frame(scale(winevaluesTrain))
scaledtest <- as.data.frame(scale(winevaluesTest))

#Building the architecture of my neural network
set.seed(2)
NN2 = neuralnet(wineclassesTrain~., scaledtrain, hidden = c(3,3) , threshold = 0.001, stepmax = 1e+05, linear.output = FALSE)
plot(NN2)


predict_testNN2 = compute(NN2, scaledtest)

predict_outNN2 = predict_testNN2$net.result
print(predict_outNN2)

which plots the neural network (image: plot of the neural network)

and prints out the predicted results (image: printout of the predicted results).

The final part is to 'Calculate the Accuracy', and I'm not sure where to go from here. When I asked the team, they said I'm looking for a single value that shows the accuracy of my neural network.

I'm not sure whether I need a confusion matrix, or how to present a single accuracy score.


Solution

  • I can't see your data, but I'm assuming this is a classification problem (you're predicting a binary outcome), and the predicted results you show above are probabilities produced by your neural network model.

    The next step is to apply a decision threshold to those probabilities, i.e. decide which of them should become 1 and which should become 0. For example, you could pick the threshold so that the ratio of 0s to 1s in your predictions matches the ratio in your training data. Note that turning probabilities into decisions falls outside the modelling process.

    Once you have a binary classification for each row in your test data, you can compare those predicted classifications to the actual classifications. This can be done in the form of a confusion matrix. In classification problems, accuracy is defined as the number of correct predictions divided by the total number of predictions.

    Note that accuracy is not normally a very good indicator of model performance, especially with imbalanced datasets. It's better to assess your model on the probabilities it produces rather than on the classifications that follow from your decision rule. For example, look into the Brier score as an alternative.
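
The steps above can be sketched in base R roughly as follows. This is a minimal illustration, not the asker's actual data: `probs` and `actual` are made-up stand-ins for the predicted probabilities (`predict_outNN2`) and for the test classes recoded as 0/1, and the 0.5 threshold is just an assumption.

```r
# Hypothetical stand-ins for the objects in the question:
# `probs` plays the role of predict_outNN2 (predicted probabilities),
# `actual` plays the role of wineclassesTest recoded as 0/1.
probs  <- c(0.92, 0.15, 0.67, 0.40, 0.08, 0.81, 0.55, 0.30)
actual <- c(1,    0,    1,    1,    0,    1,    0,    0)

# Assumed decision threshold; choosing it is a separate decision
# from fitting the model.
threshold <- 0.5
predicted <- as.integer(probs > threshold)

# Confusion matrix: rows are actual classes, columns are predicted classes.
# Fixing the factor levels keeps the table 2x2 even if a class is never predicted.
confusion <- table(Actual    = factor(actual,    levels = c(0, 1)),
                   Predicted = factor(predicted, levels = c(0, 1)))
print(confusion)

# Accuracy: correct predictions (the diagonal) divided by all predictions.
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)

# Brier score: mean squared difference between predicted probability
# and actual outcome (lower is better).
brier <- mean((probs - actual)^2)
print(brier)
```

With the objects from the question, one would presumably swap in `predict_outNN2[, 1]` for `probs` and a 0/1 recoding of `wineclassesTest` for `actual`, assuming the outcome really is binary.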