I am trying to build a basic feedforward network with the patternnet command that can recognise digits from the MNIST dataset. Here is my code:
one = [1];
one = repelem(one,100);
%%%%%%%%%%%%%%%Create Neural network%%%%%%%%%%%%%%%%%%%%%
nn = patternnet([100 100]);
nn.numInputs = 1;
nn.inputs{1}.size = 784;
nn.layers{1}.transferFcn = 'logsig';
nn.layers{2}.transferFcn = 'logsig';
nn.layers{3}.transferFcn = 'softmax';
nn.trainFcn = 'trainscg';
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
%%%%%%%%%%%%%%%%Dealing with data%%%%%%%%%%%%%%%%%%%%%%%%%%
mnist_in = csvread('mnist_train_100.csv');
mnist_test_in = csvread('mnist_test_10.csv');
[i,j] = size(mnist_in);
data_in = mnist_in(:,2:785);
data_in = data_in';
target_in = mnist_in(:,1);
target_in = target_in';
nn = train(nn,data_in,target_in);
The problem is that when I build this network, the transfer function of the output layer is set to softmax. Somehow, when I train the network, the transfer function turns into 'logsig', and it stays that way until I clear my workspace. I even tried setting the transfer function of the output layer explicitly in the code, and the program still changes it to logsig. Is there anything I can do?
PS. I even tried building this network from scratch using network(), and the program still changes my transfer function from softmax back to logsig.
As I see it, there is a mistake in the divideParam settings. You created the neural network as nn, but the parameters you changed belong to a variable called net. Other than that, the network creation part looks normal.
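A minimal sketch of that correction, assuming the rest of the setup from the question stays as it is: the data-division ratios are assigned to nn, the variable that actually holds the network, rather than to net.

```matlab
% Requires the toolbox that provides patternnet (as in the question).
nn = patternnet([100 100]);
nn.trainFcn = 'trainscg';

% Set the division ratios on nn, not on an unrelated variable net.
nn.divideParam.trainRatio = 70/100;
nn.divideParam.valRatio   = 15/100;
nn.divideParam.testRatio  = 15/100;
```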
I think the problem lies in the data preparation part.
Your training target, target_in, has dimensions 1 x <number of samples>. Because of that, the train function replaces 'softmax' with 'logsig' to fit the output.
The target data for softmax should have the form <number of results> x <number of samples>.
For example, suppose the output is either 1, 2 or 3. Then the target array shouldn't be
[1 2 1 3 3 1 ...]
but it should be
[1 0 1 0 0 1 ...;
0 1 0 0 0 0 ...;
0 0 0 1 1 0 ...]
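For the MNIST case in the question, the same layout can be sketched as below. This is only an illustration: the short label vector is made up and stands in for mnist_in(:,1)', and the names classes and target_onehot are hypothetical. It assumes the labels are the digits 0 to 9, so the class list is fixed rather than derived from the data, which keeps the target at 10 rows even if some digit never appears in a small sample.

```matlab
% Made-up labels standing in for target_in = mnist_in(:,1)'
target_in = [5 0 4 1 9 2];

% One row per digit class 0..9 (assumed label set)
classes = (0:9)';

% Compare every class against every label: 10 x N matrix with a
% single 1 per column, in the row of that sample's digit.
target_onehot = double(bsxfun(@eq, classes, target_in));
```

Because digit d lands in row d+1, the first sample here (label 5) has its 1 in row 6.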
Hope this helps.
EDIT: To turn the single array (1 x <number of samples>) into the multiple-row array (<number of results> x <number of samples>), each value in the single array is mapped to an index. For example, take 11 samples in a single array:
[-1 -5.5 4 0 3.3 4 -1 0 0 0 -1]
Collect all the unique values and sort them. Now every value has an index.
[-5.5 -1 0 3.3 4] #index table
Go through the single array and, for each value, mark the row given by its index. For instance, -1 has index 2, so I put a 1 in the second row of every column where -1 appears. Finally,
[ 0 1 0 0 0 0 0 0 0 0 0;
1 0 0 0 0 0 1 0 0 0 1; #there are three -1 in the single array
0 0 0 1 0 0 0 1 1 1 0;
0 0 0 0 1 0 0 0 0 0 0;
0 0 1 0 0 1 0 0 0 0 0]
Here is the code for it:
idx = unique(target_in); % index table; unique already returns sorted values
number_of_result = size(idx,2);
number_of_sample = size(target_in,2);
target_softmax = zeros(number_of_result,number_of_sample);
for i = 1:number_of_sample
    place = find(idx == target_in(i)); % find the index of the value
    target_softmax(place,i) = 1; % tick 1 at the row
end
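As an alternative sketch, not something the original answer relies on, the loop can be avoided by using the third output of unique, which gives the index of each sample in the sorted table directly. The variable names follow the code above, and the input is the 11-sample example from the text.

```matlab
target_in = [-1 -5.5 4 0 3.3 4 -1 0 0 0 -1];

% idx is the sorted index table; place(k) is the row for sample k
[idx, ~, place] = unique(target_in);

number_of_result = numel(idx);
number_of_sample = numel(target_in);
target_softmax = zeros(number_of_result, number_of_sample);

% Convert (row, column) pairs to linear indices and tick them all at once
linear_idx = sub2ind(size(target_softmax), place(:)', 1:number_of_sample);
target_softmax(linear_idx) = 1;
```

For this input, target_softmax should match the 5 x 11 matrix written out in the example above.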