I am trying to implement a neural network from scratch in Python. It has two layers, an input layer and an output layer: the input layer has 4 neurons and the output layer has a single neuron (plus biases). Running the code below raises this error: ValueError: shapes (4,2) and (4,1) not aligned: 2 (dim 1) != 4 (dim 0). Can someone help me?
import numpy as np
# Step 1: Define input and output data
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]])
y = np.array([[0, 1, 0, 1]])
# Step 2: Define the number of input neurons and output neurons
input_neurons = 4
output_neurons = 1
# Step 3: Define the weights and biases for the network
weights = np.random.rand(input_neurons, output_neurons)
biases = np.random.rand(output_neurons, 1)
# Step 4: Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
# Step 5: Define the derivative of the sigmoid function
def sigmoid_derivative(x):
    return sigmoid(x) * (1 - sigmoid(x))
# Step 6: Define the forward propagation function
def forward_propagation(X, weights, biases):
    output = sigmoid(np.dot(X.T, weights) + biases)
    return output
# Step 7: Define the backward propagation function
def backward_propagation(X, y, output, weights, biases):
    error = output - y
    derivative = sigmoid_derivative(output)
    delta = error * derivative
    weights_derivative = np.dot(X, delta.T)
    biases_derivative = np.sum(delta, axis=1, keepdims=True)
    return delta, weights_derivative, biases_derivative
# Step 8: Define the train function
def train(X, y, weights, biases, epochs, learning_rate):
    for i in range(epochs):
        output = forward_propagation(X, weights, biases)
        delta, weights_derivative, biases_derivative = backward_propagation(X, y, output, weights, biases)
        weights -= learning_rate * weights_derivative
        biases -= learning_rate * biases_derivative
        error = np.mean(np.abs(delta))
        print("Epoch ", i, " error: ", error)
# Step 9: Train the network
epochs = 5000
learning_rate = 0.1
train(X, y, weights, biases, epochs, learning_rate)
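You have an output layer with one neuron, so each sample should produce a single output value.
Your y instead has shape (1, 4), as if one sample had four outputs:
y = np.array([[0, 1, 0, 1]])
Since you are passing two samples, each with 4 input features, like this,
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]])
you also need to give two targets, one per sample, for example like this:
y = np.array([[0],[1]])
The error itself is raised in forward_propagation: X has shape (2, 4), so X.T is (4, 2), which cannot be multiplied with the (4, 1) weights. Use np.dot(X, weights) there, which gives a (2, 1) output that matches the new y. In backward_propagation, the weight gradient then becomes np.dot(X.T, delta), and the bias sum runs over axis=0 so that it keeps the (1, 1) shape of biases.
Hope this helps.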
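Putting the shape fixes together, a minimal corrected sketch might look like the following. It assumes each row of X is one sample with 4 features, so X is used without the transpose in the forward pass. It also computes the sigmoid derivative directly from the activation (output * (1 - output)), since calling sigmoid_derivative on an already-activated output would apply the sigmoid twice. The fixed seed, the rng name, and the dW/db names are my own additions for repeatability and brevity, not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumption: fixed seed for repeatable runs

# Two samples (rows), four features (columns): shape (2, 4)
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]], dtype=float)
# One target per sample: shape (2, 1)
y = np.array([[0], [1]], dtype=float)

weights = rng.random((4, 1))  # (input_neurons, output_neurons)
biases = rng.random((1, 1))   # broadcasts over the sample axis


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def forward_propagation(X, weights, biases):
    # (2, 4) @ (4, 1) -> (2, 1); no transpose needed
    return sigmoid(X @ weights + biases)


def backward_propagation(X, y, output):
    error = output - y                     # (2, 1)
    # output is already sigmoid(z), so its derivative is output * (1 - output)
    delta = error * output * (1 - output)  # (2, 1)
    weights_derivative = X.T @ delta       # (4, 2) @ (2, 1) -> (4, 1)
    biases_derivative = np.sum(delta, axis=0, keepdims=True)  # (1, 1)
    return delta, weights_derivative, biases_derivative


def train(X, y, weights, biases, epochs, learning_rate):
    for _ in range(epochs):
        output = forward_propagation(X, weights, biases)
        _, dW, db = backward_propagation(X, y, output)
        weights -= learning_rate * dW  # in-place update, shapes match
        biases -= learning_rate * db
    return weights, biases


weights, biases = train(X, y, weights, biases, epochs=5000, learning_rate=0.1)
predictions = forward_propagation(X, weights, biases)
print(predictions.shape)  # (2, 1)
```

Every intermediate array keeps the sample axis first, which is why the forward pass needs no transpose and only the weight gradient uses X.T.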