
How to train this neural network?


I programmed a simple backpropagation neural network. Here is the code snippet:

        // Note: Random.Next's upper bound is exclusive, so Next(0, 1) always
        // returns 0. Use Next(0, 2) and a single Random instance created
        // outside the loop (two instances created back to back can share a seed).
        Random rnd = new Random();
        for (int i = 0; i < 10000; i++)
        {
            //i1 = Convert.ToDouble(textBox1.Text);
            //i2 = Convert.ToDouble(textBox2.Text);
            //desired = Convert.ToDouble(textBox3.Text);

            i1 = rnd.Next(0, 2);
            i2 = rnd.Next(0, 2);
            if (i1 == 1 && i2 == 1)
            {
                desired = 0;
            }
            else if (i1 == 0 && i2 == 0)
            {
                desired = 0;
            }
            else
            {
                desired = 1;
            }



            //hidden layer weighted sums
            h1 = i1 * w1 + i2 * w2;
            h2 = i1 * w3 + i2 * w4;
            h3 = i1 * w5 + i2 * w6;

            //hidden layer activations
            h1v = Sigmoid(h1);
            h2v = Sigmoid(h2);
            h3v = Sigmoid(h3);

            //final output
            output = h1v * w7 + h2v * w8 + h3v * w9;
            outputS = Sigmoid(output);

            //BACKPROPAGATION

            //margin of error
            Error = desired - outputS; //desired = target value, outputS = predicted value

            //delta output sum
            deltaoutputsum = Derivative(output) * Error; //pre-sigmoid output through the derivative, times the error

            //weight of w7,w8,w9.
            w7b = w7; //0.3
            w8b = w8; // 0.5
            w9b = w9;// 0.9
            w7 = w7 + deltaoutputsum * h1v; //waga w7
            w8 = w8 + deltaoutputsum * h2v; //waga w8
            w9 = w9 + deltaoutputsum * h3v; //waga w9
                                            //weights of w7,w8,w9.

            //delta hidden sums
            h1 = deltaoutputsum * w7b * Derivative(h1);
            h2 = deltaoutputsum * w8b * Derivative(h2);
            h3 = deltaoutputsum * w9b * Derivative(h3);

            //update input-to-hidden weights w1..w6
            //(same sign as the output-weight updates above, since Error = desired - outputS)
            w1 = w1 + h1 * i1;
            w2 = w2 + h1 * i2;
            w3 = w3 + h2 * i1;
            w4 = w4 + h2 * i2;
            w5 = w5 + h3 * i1;
            w6 = w6 + h3 * i2;
            label1.Text = outputS.ToString();
            label2.Text = w1.ToString();
            label3.Text = w2.ToString();
            label4.Text = w3.ToString();
            label5.Text = w4.ToString();
            label6.Text = w5.ToString();
            label7.Text = w6.ToString();
            label8.Text = w7.ToString();
            label9.Text = w8.ToString();
            label10.Text = w9.ToString();

        }

It is a very simple way to solve the XOR problem. But I don't know how to predict the output. Here I had to provide the answer myself to set the weights, but how do I predict? It trains for 10,000 iterations on random training data. Now that it is trained, how do I predict the answer? Please help. Sorry for my English, I don't know it very well.

h1-h3 are the hidden node sums, h1v-h3v are the hidden node values, and w1-w9 are the weights.


Solution

  • I believe your problem lies in how you are training.

    Do the following, and I believe your program will be correct:

    • Try training on each of the data sets one after another instead of at random. Random works for continuous floating-point values, but when you are working with XOR, training too often on one or two sets of values (because of the nature of random) can push the weights away from values that work for the other XOR inputs. So train on [1,1], then immediately [1,0], then [0,1], then [0,0], and repeat over and over.

    • Make sure the derivative function is correct; the derivative of the sigmoid is sigmoid(x) - sigmoid(x)^2, which is the same as sigmoid(x) * (1 - sigmoid(x)).

    • Name your hidden sum values something different from h1, h2, etc. if you also use those names for the hidden node input values.
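
    Putting the first two points together, a sketch of what the fixed training schedule could look like, reusing the variable and function names from the question (i1, i2, desired, Sigmoid, Derivative); the inputs, targets, epoch, and p names are mine:

```csharp
// The four XOR patterns, cycled in a fixed order instead of at random.
int[,] inputs = { { 1, 1 }, { 1, 0 }, { 0, 1 }, { 0, 0 } };
int[] targets = { 0, 1, 1, 0 };

for (int epoch = 0; epoch < 10000; epoch++)
{
    for (int p = 0; p < 4; p++)
    {
        i1 = inputs[p, 0];
        i2 = inputs[p, 1];
        desired = targets[p];
        // ...forward pass and weight updates exactly as in the question...
    }
}

// Derivative of the sigmoid expressed through Sigmoid itself:
// sigmoid'(x) = sigmoid(x) - sigmoid(x)^2 = sigmoid(x) * (1 - sigmoid(x))
double Derivative(double x)
{
    double s = Sigmoid(x);
    return s - s * s;
}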

    If you do those things, it appears you should have something exactly mathematically equivalent to what "how to build a neural-network" has.

    I would also recommend declaring values that don't need to persist inside your loop rather than outside it. I may be wrong, but I don't think any value except your w1, w2, w3, etc. needs to persist across training iterations. Not doing this causes hard-to-catch bugs and makes the code harder to read, since you can't guarantee variables aren't being modified elsewhere.
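
    As for the original question of how to predict: prediction is just the forward pass with the trained weights left fixed, with no backpropagation step. Assuming w1 through w9 and Sigmoid are the fields and method from the question's class, a helper might look like this (Predict is my name, not something from the question):

```csharp
// Forward pass only: reuses the trained weights, performs no weight updates.
// Returns a value near 0 or near 1; round it to get the predicted class.
double Predict(double i1, double i2)
{
    double h1v = Sigmoid(i1 * w1 + i2 * w2);
    double h2v = Sigmoid(i1 * w3 + i2 * w4);
    double h3v = Sigmoid(i1 * w5 + i2 * w6);
    return Sigmoid(h1v * w7 + h2v * w8 + h3v * w9);
}
```

    Call it after the training loop finishes, e.g. `label1.Text = Math.Round(Predict(1, 0)).ToString();`, which should display 1 once the network has converged.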