machine-learning, neural-network, artificial-intelligence, perceptron

Perceptron: a weight for each sample, or one common set of weights?


With perceptron learning, I am really confused about initializing and updating the weights. My sample data has 2 inputs, x0 and x1, and there are 80 rows of these 2 inputs, hence an 80x2 matrix.

Do I need to initialize the weights as an 80x2 matrix, or just as 2 values w0 and w1? Is the final goal of perceptron learning to find the 2 weights w0 and w1 that fit all 80 input sample rows?

I have the following code, and my error never gets to 0, even after 10,000 iterations.

import numpy as np

# x = input matrix of shape 80x2
# y = output vector of shape 80x1 (expected labels)
# n = number of iterations
w = [0.1, 0.1]
learningRate = 0.1
ypred = np.zeros(len(x))
globalError = 0

for i in range(n):
    expectedT = y.transpose()
    xT = x.transpose()
    prediction = np.dot(w, xT)

    # threshold activation: 1 if the weighted sum is non-negative, else 0
    for j in range(len(x)):
        if prediction[j] >= 0:
            ypred[j] = 1
        else:
            ypred[j] = 0

    error = expectedT - ypred

    # updating the weights
    w = np.add(w, learningRate * np.dot(error, x))
    globalError = globalError + np.square(error)

Solution

  • For each feature you will have one weight, so with two features you get two weights. It also helps to introduce a bias, which adds one more weight; for more information about the bias, check Role of Bias in Neural Networks. The weights should indeed learn to fit all of the sample rows, not one set of weights per row (a short shape sketch follows below). Depending on the data, this can mean that you will never reach an error of 0: for example, a single-layer perceptron cannot learn an XOR gate when using a monotonic activation function (solving XOR with single layer perceptron).
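
    As a rough sketch of the shapes involved (the variable names and the random data below are only illustrative, they are not from the question), two per-feature weights plus one bias are enough to produce a prediction for every one of the 80 rows at once:

    import numpy as np

    # illustrative stand-in for the question's 80x2 input matrix
    x = np.random.rand(80, 2)
    w = np.array([0.1, 0.1])    # one weight per feature -> shape (2,)
    b = 0.1                     # the bias is a single extra weight

    scores = x.dot(w) + b                 # shape (80,): one score per sample row
    ypred = (scores >= 0).astype(float)   # threshold activation -> 0/1 predictions
    print(ypred.shape)                    # (80,)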

    For your example I would recommend two things: introduce a bias, and stop the training when the error is below a certain threshold or reaches 0. A sketch of such a stopping criterion follows below.
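
    As a sketch of what that stopping criterion can look like (the function and parameter names, such as tolerance, are assumed here for illustration and are not part of the original answer):

    import numpy as np

    # sketch: training loop with a threshold-based stop
    def train(x, y, w, b, learningRate=0.01, epochs=1000, tolerance=0):
        for _ in range(epochs):
            ypred = (np.dot(x, w) + b >= 0).astype(float)   # predictions for all rows
            error = y - ypred
            if np.abs(error).sum() <= tolerance:            # stop once the total error is small enough
                break
            w = w + learningRate * np.dot(error, x)         # per-feature weight update
            b = b + learningRate * error.sum()              # bias update
        return w, b

    Stopping on the summed absolute error also avoids the corner case where positive and negative errors cancel out in a plain sum.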

    I completed your example to learn a logical OR gate (y = [0, 1, 1, 1] is the OR truth table):

    import numpy as np

    # OR gate input and output (y = [0, 1, 1, 1] is the OR truth table)
    x = np.array([[0,0],[0,1],[1,0],[1,1]])
    y = np.array([0,1,1,1])

    n = 1000
    w = [0.1, 0.1, 0.1]     # two feature weights plus one bias weight
    learningRate = 0.01
    globalError = 0

    def predict(X):
        # weighted sum of the two features plus the bias, thresholded at 0
        prediction = np.dot(w[0:2], X) + w[2]
        ypred = np.zeros(len(y))
        for i in range(len(y)):
            if prediction[i] >= 0:
                ypred[i] = 1
            else:
                ypred[i] = 0
        return ypred

    for i in range(n):
        expectedT = y.transpose()
        xT = x.transpose()
        ypred = predict(xT)

        error = expectedT - ypred
        if not np.any(error):   # stop as soon as every sample is classified correctly
            break

        # updating the weights
        w[0:2] = np.add(w[0:2], learningRate*(np.dot(error, x)))
        w[2] += learningRate*sum(error)
        globalError = globalError + np.square(error)   # accumulated squared error (not used further)

    After the training, the error is 0:

    print(error)
    # [0. 0. 0. 0.]
    

    And the weights are as follows:

    print(w)
    #[0.1, 0.1, -0.00999999999999999]
    

    The perceptron can now be used as an OR gate:

    predict(x.transpose())
    #array([0., 1., 1., 1.])
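
    To tie this back to the original 80x2 question: the 80 rows still share the same two feature weights and one bias. Here is a minimal sketch on synthetic, linearly separable data (the real x and y from the question are not available, so random data labelled by the rule x0 + x1 > 1 stands in for them):

    import numpy as np

    # synthetic stand-in for the question's 80x2 data
    x = np.random.rand(80, 2)
    y = (x[:, 0] + x[:, 1] > 1).astype(float)

    w = np.zeros(2)      # one weight per feature, not one per row
    b = 0.0              # bias weight
    learningRate = 0.1

    for epoch in range(1000):
        ypred = (np.dot(x, w) + b >= 0).astype(float)
        error = y - ypred
        if not np.any(error):                  # all 80 rows classified correctly
            break
        w += learningRate * np.dot(error, x)   # same two-weight update as above
        b += learningRate * error.sum()

    print(w, b)   # still only 2 weights plus 1 bias, never an 80x2 weight matrix

    If your real data is not linearly separable, the error will never reach 0 no matter how many iterations you run, which is the same caveat as in the XOR remark above.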
    

    Hope that helps