matlab, deep-learning, neural-network, training-data, confusion-matrix

Correct practice for reporting training and generalization performance


I am trying to learn the correct procedure for training a neural network for classification. Many tutorials exist, but they never explain how to report the generalization performance. Can somebody please tell me whether the following is the correct method? I take the first 100 examples from the fisheriris data set, which have labels 1 and 2, and call the features and labels X and Y respectively. I then split X into trainData and Xtest with a 90/10 ratio. Using trainData I train the NN model. Internally, the NN further splits trainData into train, validation, and test subsets. My confusion is: which of these is usually used to report the model's performance on unseen data in conferences/journals? The dataset can be found in the link: https://www.mathworks.com/matlabcentral/fileexchange/71468-simple-neural-networks-with-k-fold-cross-validation-manner

rng('default')
load iris.mat;

X = [f(1:100,:) l(1:100)];

numExamples = size(X,1);
indx = randperm(numExamples);
X = X(indx,:);
Y = X(:,end);


split1 = cvpartition(Y,'Holdout',0.1,'Stratify',true); % 90% train+val, 10% test

istrainval = training(split1); % index for fitting
istest = test(split1);      % indices for quality assessment

trainData = X(istrainval,:);

Xtest = X(istest,:);
Ytest = Y(istest);


numExamplesXtrainval = size(trainData,1);

indxXtrainval = randperm(numExamplesXtrainval);
trainData = trainData(indxXtrainval,:);
Ytrain = trainData(:,end);

hiddenLayerSize = 10;

% data format: rows = feature dimensions, columns = examples
net  = patternnet(hiddenLayerSize);
net  = init(net);
net.performFcn = 'crossentropy';
net.trainFcn = 'trainscg';
net.trainParam.epochs=50;

[net, tr] = train(net, trainData', Ytrain');
Trained = sim(net, trainData');  % network outputs (continuous scores, not labels)

train_predict = net(trainData');

performanceTrain = perform(net,Ytrain',train_predict)
lbl_train=grp2idx(Ytrain);
Yhat_train = (train_predict >= 0.5);
Lbl_Yhat_Train = grp2idx(Yhat_train);   
[cmMatrixTrain]=  confusionmat(lbl_train,Lbl_Yhat_Train )

accTrain=sum(lbl_train ==Lbl_Yhat_Train)/size(lbl_train,1);
disp(['Training Set:    Total Train Accuracy by MLP = ', num2str(100*accTrain), '%'])

[confTest] =  confusionmat(lbl_train(tr.testInd),Lbl_Yhat_Train(tr.testInd) )


%unknown test
test_predict = net(Xtest');

performanceTest = perform(net,Ytest',test_predict);
Yhat_test = (test_predict >= 0.5);
test_lbl=grp2idx(Ytest);
Lbl_Yhat_Test = grp2idx(Yhat_test);

[cmMatrix_Test]=  confusionmat(test_lbl,Lbl_Yhat_Test )

This is the output.

Problem1: There seems to be no prediction for the other class. Why?

Problem2: Do I need a separate dataset like the Xtest I created for reporting the generalization error, or is it the practice to use trainData(tr.testInd,:) as the generalization test set? Did I create an unnecessary subset?

performanceTrain =

   2.2204e-16


cmMatrixTrain =

    45     0
    45     0

Training Set:    Total Train Accuracy by MLP = 50%

confTest =

     9     0
     5     0


cmMatrix_Test =

     5     0
     5     0

Solution

  • There are a few issues with the code; let's deal with them before answering your question. First, you set a threshold of 0.5 for making decisions (Yhat_train = (train_predict >= 0.5);), while all of your net's predictions are above 0.5. As a result every example gets assigned to the same class, which is why the second column of your confusion matrices is all zeros. You can plot the scores to choose a better threshold (a sketch of one option follows the plot):

    figure;
    plot(train_predict(Ytrain == 1),'.b')
    hold on
    plot(train_predict(Ytrain == 2),'.r')
    legend('label 1','label 2')
    

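    For instance, here is a minimal sketch of a data-driven threshold: the midpoint between the two class score means. The variable names follow the question's code; mapping scores >= thr to label 2 assumes class 2 gets the higher scores, which you can verify on the plot:

    scores1 = train_predict(Ytrain' == 1);    % scores of the class-1 examples
    scores2 = train_predict(Ytrain' == 2);    % scores of the class-2 examples
    thr = (mean(scores1) + mean(scores2))/2;  % midpoint threshold

    Yhat_train = (train_predict >= thr) + 1;  % map scores to labels 1/2
    accTrain = mean(Yhat_train == Ytrain');   % fraction classified correctly
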
    cvpartition gave me an error; it ran successfully as split1 = cvpartition(Y,'Holdout',0.1). In any case, artificial neural networks usually manage the partitioning within the training process: you feed in X and Y along with some parameters that control how the split is made. See here for example: link, where you set

    net.divideParam.trainRatio = .4;
    net.divideParam.valRatio = .3;
    net.divideParam.testRatio = .3;
    
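    A minimal sketch of that workflow, reusing trainData and Ytrain from the question (the ratios are just the illustrative ones above; I take only the feature columns, since the last column of trainData holds the label):

    Xfeat = trainData(:,1:end-1);             % feature columns only
    net = patternnet(10);
    net.divideParam.trainRatio = .4;
    net.divideParam.valRatio   = .3;
    net.divideParam.testRatio  = .3;
    [net, tr] = train(net, Xfeat', Ytrain');

    % tr records which columns went into each internal split
    testScores = net(Xfeat(tr.testInd,:)');   % scores on the internal test split
    testLabels = Ytrain(tr.testInd)';         % ground truth for that split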

    So how should you report the results? Only on the test data. The training data suffers from overfitting and will show misleadingly good results. If you use validation data for model selection (you haven't here), you cannot report results on it either, since it also influences the fitted model. If you let the training routine do the validation for you, your test results stay safe from overfitting. A sketch of such a report on your held-out Xtest follows below.
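    As a sketch, reporting on your held-out set could then look like this (assuming the net, the threshold thr, and the Xtest/Ytest from above; again only the feature columns of Xtest are used):

    test_predict = net(Xtest(:,1:end-1)');    % scores for truly unseen examples
    Yhat_test = (test_predict >= thr) + 1;    % scores -> labels 1/2

    cmTest = confusionmat(Ytest', Yhat_test)  % confusion matrix on unseen data
    accTest = 100*mean(Yhat_test == Ytest');
    disp(['Test Set: Accuracy = ', num2str(accTest), '%'])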