Tags: machine-learning, neural-network, linear-regression, backpropagation, one-hot-encoding

How can I make a one neuron neural network?


I want to build a one-neuron function like w1*x1 + w2*x2 + w3*x3 + b1. My training input is:

                    [1, 0, 0], 
                    [0, 1, 0],
                    [0, 0, 1],
                    [1, 1, 0],
                    [0, 1, 1],
                    [1, 1, 1],
                    [2, 0, 0]

And the training output is:

[1,2,0,1,0,2,3]

I tried to write code with one-hot encoding, but I failed. I am new to AI coding, and I don't want to use any AI libraries such as PyTorch, TensorFlow, or scikit-learn. This problem has been troubling me for 2 weeks, and I have also tried different code that didn't work. Here is some example code which I know is wrong, but it may give you some insight.


import numpy as np
import pandas as pd

def sigmoid(x):
    return 1 /(1+np.exp(-x))

def sigmoid_derivative(x):
    return x*(1-x)

training_inputs = np.array([
                    [1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1],
                    [1, 1, 0],
                    [0, 1, 1],
                    [1, 1, 1],
                    [2, 0, 0]
                ])

training_outputs = np.array([[1,2,0,1,0,2,3]]).T

np.random.seed(1)

synaptic_weights = 2 * np.random.random((3,1))-1
print('Random starting synaptic weights: ')
print(synaptic_weights)

for iteration in range(2000):

    input_layer = training_inputs

    outputs = sigmoid(np.dot(input_layer, synaptic_weights))

    error = training_outputs - outputs

    adjustments = error * sigmoid_derivative(outputs)

    synaptic_weights += np.dot(input_layer.T,adjustments)

print('Synaptic Weights After Training: ')
print(synaptic_weights)

print('Outputs after training: ')
print(outputs)

I want it to output results like [0,1,0,0], meaning class 1 in one-hot encoding, but I don't know how to do that.


Solution

  • We can assume this is a classification problem, since you are using sigmoid.

    However, you've got 4 classes, so we'll need to use its multiclass generalization, the softmax function.

    As you correctly noticed, we'll need one-hot encoding for the target, so that training_outputs has the same shape as the predictions (one column per class). This can be done trivially using np.eye().

    Next, you mention b in the linear equation, so you'll have to add a bias column (all ones) to training_inputs.

    As a result, the weights will become a 4x4 matrix (4 features including bias and 4 classes).

    Finally, the learning rate may need to be reduced a bit.

    No additional derivatives need to be calculated: with softmax and log loss, the gradient w.r.t. the weights is proportional to X.T · error, so the sigmoid_derivative factor is no longer needed.

    This could probably be improved further if we scaled the training data.

    It is worth noting that multiclass classification is not quite one neuron, but rather one neuron per class.

    import numpy as np
    from scipy.special import softmax
    
    training_inputs = np.array([
                        [1, 0, 0],
                        [0, 1, 0],
                        [0, 0, 1],
                        [1, 1, 0],
                        [0, 1, 1],
                        [1, 1, 1],
                        [2, 0, 0]
                    ])
    
    # Adding bias
    training_inputs = np.hstack([training_inputs, np.ones((training_inputs.shape[0], 1))])
    
    training_outputs = np.array([1,2,0,1,0,2,3])
    # One-hot encoding
    training_outputs = np.eye(4)[training_outputs]
    
    np.random.seed(1)
    
    synaptic_weights = 2 * np.random.random((4,4))-1
    print('Random starting synaptic weights: ')
    print(synaptic_weights)
    
    for iteration in range(2000):
    
        input_layer = training_inputs
    
        # Softmax here
        outputs = softmax(np.dot(input_layer, synaptic_weights), axis=1)
    
        error = training_outputs - outputs
        
        synaptic_weights += np.dot(input_layer.T,error) * 0.1 # learning rate
    
    print('Synaptic Weights After Training: ')
    print(synaptic_weights)
    
    print('Outputs after training (one-hot): ')
    print(np.round(outputs))
    
    print('Outputs after training: ')
    print(np.argmax(outputs, axis=1))
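
Since the question asks to avoid ML libraries, the same approach can also be sketched with NumPy alone. This is only an illustration under a few assumptions: the softmax below is a hand-written stand-in for scipy.special.softmax, and the helper names add_bias, train and predict are made up for this example rather than taken from the code above. The update rule and hyperparameters (learning rate 0.1, 2000 iterations) are the ones used above; the random initialisation may differ, so the learned weights need not match exactly.

```python
import numpy as np

def softmax(scores, axis=1):
    """NumPy-only softmax along `axis` (stand-in for scipy.special.softmax)."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    shifted = scores - np.max(scores, axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=axis, keepdims=True)

def add_bias(inputs):
    """Append a column of ones so the last weight row acts as the bias."""
    inputs = np.atleast_2d(np.asarray(inputs, dtype=float))
    return np.hstack([inputs, np.ones((inputs.shape[0], 1))])

def train(inputs, labels, n_classes, learning_rate=0.1, iterations=2000, seed=1):
    """Gradient steps of the form weights += X.T @ (targets - outputs) * learning_rate."""
    X = add_bias(inputs)
    targets = np.eye(n_classes)[labels]  # one-hot encoding of the integer labels
    rng = np.random.RandomState(seed)
    weights = 2 * rng.random_sample((X.shape[1], n_classes)) - 1
    for _ in range(iterations):
        outputs = softmax(X @ weights, axis=1)
        weights += X.T @ (targets - outputs) * learning_rate
    return weights

def predict(inputs, weights):
    """Return (class indices, class probabilities) for one or more input rows."""
    probabilities = softmax(add_bias(inputs) @ weights, axis=1)
    return np.argmax(probabilities, axis=1), probabilities

training_inputs = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
    [2, 0, 0],
]
training_labels = np.array([1, 2, 0, 1, 0, 2, 3])

weights = train(training_inputs, training_labels, n_classes=4)
classes, probabilities = predict(training_inputs, weights)

print('Predicted classes: ')
print(classes)
print('Predicted classes (one-hot): ')
print(np.eye(4, dtype=int)[classes])
```

Taking the argmax and re-encoding it with np.eye() always yields exactly one 1 per row, whereas rounding the probabilities can produce an all-zero row when no class exceeds 0.5.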