matlab, neural-network, pattern-recognition

Why does my transfer function keep turning back into 'logsig'?


   I am trying to build a basic feedforward network with the patternnet command that can recognise digits from the MNIST dataset. Here is my code:

one = [1];
one = repelem(one,100);
%%%%%%%%%%%%%%%Create Neural network%%%%%%%%%%%%%%%%%%%%%
nn = patternnet([100 100]);
nn.numInputs = 1;
nn.inputs{1}.size = 784;
nn.layers{1}.transferFcn = 'logsig';
nn.layers{2}.transferFcn = 'logsig';
nn.layers{3}.transferFcn = 'softmax';
nn.trainFcn = 'trainscg';
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
%%%%%%%%%%%%%%%%Dealing with data%%%%%%%%%%%%%%%%%%%%%%%%%%

mnist_in = csvread('mnist_train_100.csv');
mnist_test_in = csvread('mnist_test_10.csv');
[i,j] = size(mnist_in);
data_in = mnist_in(:,2:785);
data_in = data_in';
target_in = mnist_in(:,1);
target_in = target_in';
nn = train(nn,data_in,target_in);

   The problem is that when I build this network, the transfer function of the output layer is set to softmax. Somehow, when I train the network, the transfer function turns into 'logsig' and stays that way until I clear my workspace. I even tried setting the transfer function of the output layer explicitly in the code, and the program still finds a way to change it to logsig. Is there anything I can do?

PS. I even tried building this network with network() to make everything from scratch, and the program still changes my transfer function from softmax back to logsig.


Solution

  • As far as I can see, there is a mistake in the divideParam settings. You created the neural network as nn, but the parameters you changed belong to a variable called net. Other than that, the network-creation part looks normal.
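
    A minimal sketch of that fix, assuming the ratios were meant for the network being trained. In MATLAB, assigning to a field of an undefined variable such as net silently creates a new struct, so no error is raised and nn keeps its default split:

    ```matlab
    nn = patternnet([100 100]);          % same hidden-layer sizes as in the question
    nn.trainFcn = 'trainscg';
    % Set the split ratios on nn itself, not on a separate variable called net
    nn.divideParam.trainRatio = 70/100;
    nn.divideParam.valRatio   = 15/100;
    nn.divideParam.testRatio  = 15/100;
    ```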

    I think the problem lies in the data preparation. Your training target, target_in, has dimensions 1 x <number of samples>. Because of that, the train function replaces 'softmax' with 'logsig' to fit the single-row output.

    The target data for softmax should have the form <number of results> x <number of samples>, with one row per possible result (class).

    For example, if the output is either 1, 2 or 3, then the target array shouldn't be

    [1 2 1 3 3 1 ...]
    

    but it should be

    [1 0 1 0 0 1 ...;
     0 1 0 0 0 0 ...;
     0 0 0 1 1 0 ...]
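
    For the MNIST labels in the question, a sketch of this layout could look like the following. It assumes target_in is a 1 x N row of digit labels 0 to 9, as read from the first CSV column; the stand-in labels and the names classes and target_onehot are illustrative only. bsxfun is used so that the comparison also works on releases without implicit expansion.

    ```matlab
    target_in = [5 0 4 1 9 2];           % stand-in labels; in the question these come from the CSV
    classes = (0:9)';                    % one row per digit class (assumed fixed label set)
    % Compare every class (rows) with every label (columns): 10 x N matrix of 0s and 1s
    target_onehot = double(bsxfun(@eq, classes, target_in));
    ```

    Each column then holds a single 1, in row label+1, and target_onehot could be passed to train in place of the 1 x N target.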
    

    Hope this helps.

    EDIT: To turn the single-row array (1 x <number of samples>) into the multi-row array (<number of results> x <number of samples>), each value in the single-row array is mapped to an index. For example, take 11 samples in a single array:

    [-1    -5.5   4     0     3.3   4    -1     0     0     0    -1]
    

    Collect all the unique values and sort them. Now every value has its index.

    [-5.5  -1  0  3.3  4]   % index table
    

    Go through the single array and, for each value, mark it at the right index. For instance, -1 has index 2, so I put a 1 in the second row of every column where -1 appears. Finally,

    [ 0     1     0     0     0     0     0     0     0     0     0;
      1     0     0     0     0     0     1     0     0     0     1; % there are three -1 values in the single array
      0     0     0     1     0     0     0     1     1     1     0;
      0     0     0     0     1     0     0     0     0     0     0;
      0     0     1     0     0     1     0     0     0     0     0]
    

    Here is the code for it:

    idx = unique(target_in);              % distinct values; unique() already returns them sorted
    number_of_result = numel(idx);        % one row per distinct value
    number_of_sample = numel(target_in);  % one column per sample
    target_softmax = zeros(number_of_result,number_of_sample);
    for i = 1:number_of_sample
      place = find(idx == target_in(i));  % find the index of the value
      target_softmax(place,i) = 1;        % put a 1 in that row
    end
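
    The same mapping can also be written without a loop. This is a sketch rather than part of the original answer: it relies on the third output of unique, which gives for each sample the position of its value in the sorted list, and on sub2ind to turn (row, column) pairs into linear indices. The sample values are the 11-element example from above, and the name lin is illustrative.

    ```matlab
    target_in = [-1 -5.5 4 0 3.3 4 -1 0 0 0 -1];   % example row from above
    [idx, ~, place] = unique(target_in);            % idx: sorted unique values; place: index of each sample in idx
    number_of_result = numel(idx);
    number_of_sample = numel(target_in);
    target_softmax = zeros(number_of_result, number_of_sample);
    % Set element (place(i), i) to 1 for every sample i in one step
    lin = sub2ind(size(target_softmax), place(:)', 1:number_of_sample);
    target_softmax(lin) = 1;
    ```

    Note that the rows are defined by the values actually present in target_in, so a class that never occurs in the training labels gets no row.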