python, vector, neural-network, backpropagation

Backpropagation and shapes of vectors not fitting


This might be a stupid question, but I am kind of stuck. I am trying to write a simple feedforward neural network in Python. My input, weights and output layers are declared like this:

self.inp = np.zeros(21)
self.weights1 = np.random.rand(self.inp.shape[0],15) 
self.weights2 = np.random.rand(15, 15)
self.layer1 = self.sigmoid(np.dot(self.inp, self.weights1))
self.output = self.sigmoid(np.dot(self.layer1, self.weights2))

Now I am trying to backpropagate but the sizes of my vectors don't fit. Here is my backpropagate function:

def backpropagate(self, dice, board):
    y = argmax(dice, self.moves)
    d_weights2 = np.dot(self.layer1.T, (2*(y - self.output) * self.sigmoidDerivative(self.output)))
    d_weights1 = np.dot(self.inp.T,  (np.dot(2*(y - self.output) * self.sigmoidDerivative(self.output), self.weights2.T) * self.sigmoidDerivative(self.layer1)))

    self.weights1 += d_weights1
    self.weights2 += d_weights2

I am getting an error while calculating d_weights1. The error is

ValueError: shapes (21,) and (15,) not aligned: 21 (dim 0) != 15 (dim 0)

How can I make my vectors fit?

Thanks in advance!

EDIT:

As requested, here is the entire class:

import numpy as np
from TestValues import argmax, testfunctions, zero

class AI:

    def __init__(self):
        self.moves = []
        self.inp = np.zeros(21)
        self.weights1 = np.random.rand(self.inp.shape[0],21) 
        self.weights2 = np.random.rand(21, 15)
        self.output = np.zeros(15)

    def getPlacement(self, dice, board):
        self.feedforward(dice, board)
        self.backpropagate(dice, board)
        result = self.output
        for x in self.moves:
            result[x] = -1.
        move = np.argmax(result)
        self.moves.append(move)
        return move

    def feedforward(self, dice, board):
        i = 0
        for x in dice:
            self.inp[i] = x
            i += 1
        for x in board:
            self.inp[i] = x
            i += 1

        self.layer1 = self.sigmoid(np.dot(self.inp, self.weights1))
        self.output = self.sigmoid(np.dot(self.layer1, self.weights2))

    def backpropagate(self, dice, board):

        y = argmax(dice, self.moves)

        d_weights2 = np.dot(self.layer1.T, np.dot(2*(y - self.output), self.sigmoidDerivative(self.output)))
        d_weights1 = np.dot(self.inp.T,  (np.dot(2*(y - self.output) * self.sigmoidDerivative(self.output), self.weights2.T) * self.sigmoidDerivative(self.layer1)))

        print(self.weights2.shape)

        self.weights1 += d_weights1
        self.weights2 += d_weights2

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoidDerivative(self, x):
        return self.sigmoid(x) * (1 - self.sigmoid(x))

Solution

  • It seems that the problem is the way you are initialising your input. You are producing an array of shape (21,), not (1, 21). On a 1-D array, .T is a no-op, so the dot products in backpropagate collapse in unexpected ways — my d_weights2 came out as a single scalar, for example. If you plan on backpropagating many training examples at once, keeping the input 2-D will matter even more. It is also often worth debugging by printing the shapes of these intermediate matrices; if you are not familiar with matrix algebra, that makes it much easier to see what each dot product is supposed to produce.

    So, simply put, just initialise like this:

self.inp = np.zeros((1, 21))
    

    This produced sensible shapes for me.
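To make the fix concrete, here is a minimal standalone sketch using the layer sizes from the class in the question (input 21, hidden 21, output 15); the target y is assumed to be a (1, 15) row vector for illustration. With the (1, 21) input, every gradient ends up with the same shape as the weight matrix it updates:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return sigmoid(x) * (1 - sigmoid(x))

inp = np.zeros((1, 21))            # row vector, not a bare (21,) array
weights1 = np.random.rand(21, 21)
weights2 = np.random.rand(21, 15)
y = np.zeros((1, 15))              # assumed target, same shape as the output

layer1 = sigmoid(np.dot(inp, weights1))     # (1, 21)
output = sigmoid(np.dot(layer1, weights2))  # (1, 15)

# (21, 1) @ (1, 15) -> (21, 15), matching weights2
d_weights2 = np.dot(layer1.T,
                    2 * (y - output) * sigmoid_derivative(output))
# (21, 1) @ (1, 21) -> (21, 21), matching weights1
d_weights1 = np.dot(inp.T,
                    np.dot(2 * (y - output) * sigmoid_derivative(output),
                           weights2.T) * sigmoid_derivative(layer1))

print(d_weights1.shape)  # (21, 21)
print(d_weights2.shape)  # (21, 15)
```

Had inp stayed 1-D, inp.T and layer1.T would be no-ops and the same dot products would either collapse to scalars or raise the alignment error from the question.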


    Also, even though this isn't Code Review, I'm compelled to say something about your code: don't repeat yourself. When backpropagating, you can first calculate the error at a layer and reuse it in both updates, e.g. error = 2*(output - y) * d_logistic(output). This will also simplify things if you plan on extending the network to an arbitrary number of layers, not just two.
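As a sketch of that refactor, each layer's error term is computed once and then used both for the weight update and for propagating back to the previous layer. This keeps the question's (y - output) sign convention so the += updates still apply; shapes follow the (1, 21) input fix above:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def d_logistic(x):
    return sigmoid(x) * (1 - sigmoid(x))

inp = np.zeros((1, 21))
weights1 = np.random.rand(21, 21)
weights2 = np.random.rand(21, 15)
y = np.zeros((1, 15))  # assumed target shape

layer1 = sigmoid(np.dot(inp, weights1))
output = sigmoid(np.dot(layer1, weights2))

# Compute each layer's error once, then reuse it.
output_error = 2 * (y - output) * d_logistic(output)                   # (1, 15)
layer1_error = np.dot(output_error, weights2.T) * d_logistic(layer1)   # (1, 21)

d_weights2 = np.dot(layer1.T, output_error)  # (21, 15)
d_weights1 = np.dot(inp.T, layer1_error)     # (21, 21)
```

With the errors factored out this way, adding a third layer is just one more error term and one more dot product, rather than an ever-longer nested expression.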

    And one more thing: your functions sigmoid and sigmoidDerivative never use self, so they have no reason to be methods. Consider making them pure module-level functions instead.
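That last suggestion is just a matter of moving the two functions out of the class, since neither touches any instance state:

```python
import numpy as np

# Plain module-level functions: no self, no class needed.
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return sigmoid(x) * (1 - sigmoid(x))

print(sigmoid(0.0))             # 0.5
print(sigmoid_derivative(0.0))  # 0.25
```

Pure functions like these are also trivially testable on their own, without constructing an AI instance first.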