I am working on understanding how to build my own ANN from scratch. I have looked around and found a simple two-layer architecture whose `init_parameters` function looks like this:
```python
import numpy as np

def init_parameters():
    W1 = np.random.normal(size=(10, 784)) * np.sqrt(1. / 784)  # (hidden, input)
    b1 = np.random.normal(size=(10, 1)) * np.sqrt(1. / 10)     # (hidden, 1)
    W2 = np.random.normal(size=(10, 10)) * np.sqrt(1. / 20)    # (output, hidden)
    b2 = np.random.normal(size=(10, 1)) * np.sqrt(1. / 784)    # (output, 1)
    return W1, b1, W2, b2
```
This follows the MNIST data set. I understand that the input layer has 784 neurons, one for each pixel of a 28x28 input image. The hidden layer has 10 neurons, if I am reading the provided code correctly. Lastly, the output layer has 10 neurons, one for each digit from 0 to 9.
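As I understand it, the 784 comes from flattening each 28x28 image into a single column, so the data matrix X ends up with shape (784, m), one column per example, which is what `W1.dot(X)` expects. A tiny dummy-data sketch of that layout (the array here is just placeholder data, not real MNIST):

```python
import numpy as np

m = 5
images = np.random.rand(m, 28, 28)   # m fake 28x28 "images"

# flatten each image to a 784-vector and stack them as columns
X = images.reshape(m, 28 * 28).T
print(X.shape)                        # (784, 5)
```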
My goal is to increase the number of neurons in the hidden layer.
This is what I have tried:
```python
def init_parameters():
    W1 = np.random.normal(size=(20, 784)) * np.sqrt(1. / 784)
    b1 = np.random.normal(size=(10, 1)) * np.sqrt(1. / 10)
    W2 = np.random.normal(size=(10, 20)) * np.sqrt(1. / 20)
    b2 = np.random.normal(size=(10, 1)) * np.sqrt(1. / 784)
    return W1, b1, W2, b2
```
This gives me an error when using my `forward_prop` function:
```python
def forward_prop(W1, b1, W2, b2, X):
    Z1 = W1.dot(X) + b1
    A1 = ReLU(Z1)
    Z2 = W2.dot(A1) + b2
    A2 = softmax(Z2)
    return Z1, A1, Z2, A2
```
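For completeness, `ReLU` and `softmax` are simple helpers along these lines (my versions, which may differ slightly from the tutorial's; they work column-wise since each example is a column):

```python
import numpy as np

def ReLU(Z):
    # element-wise max(0, z)
    return np.maximum(0, Z)

def softmax(Z):
    # subtract the column-wise max for numerical stability,
    # then normalize each column so it sums to 1
    expZ = np.exp(Z - np.max(Z, axis=0, keepdims=True))
    return expZ / np.sum(expZ, axis=0, keepdims=True)
```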
Error:

```
-> Z1 = W1.dot(X) + b1
ValueError: operands could not be broadcast together with shapes (20,29400) (10,1)
```
From the error it seems that `b1` does not match, so I thought changing it to `b1 = np.random.normal(size=(20, 1)) * np.sqrt(1./10)` would work, and it did.
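Reproducing just the broadcast in isolation made it clearer to me: NumPy can add a (20, 1) column to a (20, m) matrix, but not a (10, 1) one (shapes here are made up to match the error message):

```python
import numpy as np

Z = np.zeros((20, 29400))     # shape of W1.dot(X) after widening W1
b_right = np.zeros((20, 1))   # one bias per hidden unit -> broadcasts fine
b_wrong = np.zeros((10, 1))   # old bias shape

print((Z + b_right).shape)    # (20, 29400)
# Z + b_wrong                 # raises: operands could not be broadcast together
```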
The new `init_parameters` is this:
```python
def init_parameters():
    W1 = np.random.normal(size=(20, 784)) * np.sqrt(1. / 784)
    b1 = np.random.normal(size=(20, 1)) * np.sqrt(1. / 10)
    W2 = np.random.normal(size=(10, 20)) * np.sqrt(1. / 20)
    b2 = np.random.normal(size=(10, 1)) * np.sqrt(1. / 784)
    return W1, b1, W2, b2
```
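One way to double-check is to push a batch of dummy data through `forward_prop` (using the helper definitions above) and look at the shapes:

```python
import numpy as np

W1, b1, W2, b2 = init_parameters()
X_dummy = np.random.rand(784, 32)   # 32 fake examples, one per column

Z1, A1, Z2, A2 = forward_prop(W1, b1, W2, b2, X_dummy)
print(A1.shape)  # (20, 32) -> 20 hidden activations per example
print(A2.shape)  # (10, 32) -> one probability per digit class
```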
Did I actually increase the number of neurons in the hidden layer, and is this the right approach for what I am trying to achieve?
Yes, you added 10 more hidden units (neurons), and there is nothing wrong with your approach. `W1` and `b1` contain the weights and biases of your first layer, and each row of those matrices represents a hidden unit. This matches the notation used in Andrew Ng's Deep Learning course (which is similar to yours): notice the correspondence between the number of hidden units and the number of rows in the matrices.
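If you want to make that correspondence explicit, one option is to pass the hidden-layer size as a parameter so that every shape depending on it changes together. A sketch of that idea (the argument names are illustrative, and the scaling here just uses each layer's fan-in rather than the exact constants in your snippet):

```python
import numpy as np

def init_parameters(n_hidden=20, n_input=784, n_output=10):
    # rows of W1/b1 = number of hidden units; rows of W2/b2 = number of outputs
    W1 = np.random.normal(size=(n_hidden, n_input)) * np.sqrt(1. / n_input)
    b1 = np.random.normal(size=(n_hidden, 1)) * np.sqrt(1. / n_input)
    W2 = np.random.normal(size=(n_output, n_hidden)) * np.sqrt(1. / n_hidden)
    b2 = np.random.normal(size=(n_output, 1)) * np.sqrt(1. / n_hidden)
    return W1, b1, W2, b2
```

With that, changing the hidden-layer width is a single argument, and the bias shapes can never drift out of sync with the weight matrices the way `b1` did in your first attempt.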