I've been struggling to get my neural network implementation to converge to meaningful values. I have black and white images. Each image is either 40% black and 60% white, or 60% black and 40% white. The task is to classify each image as mostly black or mostly white.
I break the images into arrays of pixel values and feed them through the network. The issue is that it converges to the same constant output for all images. I am using 1000 images to train, with 25*25 pixels as input and a hidden layer of 20 units.
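For reference, data of the kind described above could be generated roughly as follows. This is only a sketch: the helper names `make_image` and `make_dataset`, the encoding (0.0 = black, 1.0 = white) and the label convention (1.0 = mostly black) are assumptions for illustration, not the code actually used to build `inputs` and `exp_y`.

```python
import random

SIDE = 25                 # images are 25 x 25 pixels
N_PIXELS = SIDE * SIDE    # 625 values once flattened

def make_image(black_fraction, rng):
    """Return a flat list of 625 pixel values (0.0 = black, 1.0 = white)."""
    n_black = round(black_fraction * N_PIXELS)
    pixels = [0.0] * n_black + [1.0] * (N_PIXELS - n_black)
    rng.shuffle(pixels)   # scatter the black pixels at random positions
    return pixels

def make_dataset(n_images, seed=0):
    """Build (inputs, exp_y); label 1.0 = mostly black, 0.0 = mostly white."""
    rng = random.Random(seed)
    inputs, exp_y = [], []
    for _ in range(n_images):
        mostly_black = rng.random() < 0.5
        inputs.append(make_image(0.6 if mostly_black else 0.4, rng))
        exp_y.append(1.0 if mostly_black else 0.0)
    return inputs, exp_y

inputs, exp_y = make_dataset(1000)
print(len(inputs), len(inputs[0]))
```

Each entry of `inputs` is already a flat vector of 625 values, which matches the shape the network below expects for a single example.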
import numpy as np
import theano
import theano.tensor as T
from theano.tensor import nnet

def layer(x, w):
    # bias node
    b = np.array([1], dtype=theano.config.floatX)
    # concatenate the bias node onto the input
    new_x = T.concatenate([x, b])
    # evaluate the matrix multiplication
    m = T.dot(w.T, new_x)
    # run through the sigmoid
    h = nnet.sigmoid(m)
    return h

# one gradient descent step on theta for the cost function to minimize
def grad_desc(cost, theta):
    return theta - (.01 * T.grad(cost, wrt=theta))
# input x
x = T.dvector()
# target y
y = T.dscalar()
alpha = .1  # learning rate (unused: grad_desc hard-codes .01)
# first layer weights
theta1 = theano.shared(np.array(np.random.rand((25*25)+1, 20), dtype=theano.config.floatX))  # randomly initialize
# output layer weights
theta3 = theano.shared(np.array(np.random.rand(21, 1), dtype=theano.config.floatX))
hid1 = layer(x, theta1)  # hidden layer
out1 = T.sum(layer(hid1, theta3))  # output layer
fc = (out1 - y)**2  # cost expression to minimize
# each call computes the cost and applies one gradient update to the weights
cost = theano.function(inputs=[x, y], outputs=fc, updates=[
    (theta1, grad_desc(fc, theta1)),
    (theta3, grad_desc(fc, theta3))])
run_forward = theano.function(inputs=[x], outputs=out1)
inputs = np.array(inputs).reshape(1000, 25*25)  # training data X
exp_y = np.array(exp_y)  # training data Y
cur_cost = 0
for i in range(10000):
    for k in range(len(inputs)):
        cur_cost = cost(inputs[k], exp_y[k])
    if i % 10 == 0:
        print('Cost: %s' % (cur_cost,))
The cost converges to a single value, and every input produces the same output:
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
Cost: 0.160380273066
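A possible check for this symptom, as a sketch under stated assumptions rather than a confirmed diagnosis: if the pixel values are 0/1 and the weights are drawn uniformly from [0, 1) (which is what `np.random.rand` does), each hidden unit sums several hundred non-negative terms, so the sigmoid may saturate and its slope may be close to zero. The input vector and the single unit below are hypothetical and written in plain Python, not Theano.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)
n_inputs = 25 * 25

# Hypothetical input: 60% white (1.0) and 40% black (0.0), shuffled.
x = [1.0] * 375 + [0.0] * 250
rng.shuffle(x)

# Weights of one hidden unit, uniform in [0, 1); the last weight multiplies the bias.
w = [rng.random() for _ in range(n_inputs + 1)]

z = sum(wi * xi for wi, xi in zip(w, x + [1.0]))   # pre-activation
h = sigmoid(z)                                     # hidden unit output
slope = h * (1.0 - h)                              # derivative of the sigmoid at z

print('pre-activation: %s' % z)
print('sigmoid output: %s' % h)
print('sigmoid slope:  %s' % slope)
```

If the slope printed here is effectively zero, almost no gradient reaches the first-layer weights. A common remedy in that situation is to initialize the weights in a small range centered on zero.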
Just an idea:
I have seen examples where the entire image was presented to the network the same way you do it. However, those networks were designed for character recognition and similar image-processing tasks. If you feed the entire image to the network, it will try to find similar images. As I understand it, your images are random, and that may be why training fails: there may be no similarities between the training images, and so nothing to learn. I would present the picture this way if I wanted to distinguish between images of circles and squares. For deciding whether a picture is mostly dark or mostly light, however, I would simply feed the network the counts of black and white pixels. Some linear pre-processing can be very beneficial.
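As a rough sketch of that pre-processing idea, the code below reduces each image to two numbers (the black and white pixel fractions) and trains a single sigmoid unit on them with the same squared-error cost as in the question. It is plain Python rather than Theano, and the names `make_image`, `features` and `predict`, the 0.0 = black / 1.0 = white encoding, the learning rate and the epoch count are all assumptions chosen for illustration.

```python
import math
import random

N_PIXELS = 25 * 25

def make_image(black_fraction, rng):
    """Flat list of 625 pixels: 0.0 = black, 1.0 = white (assumed encoding)."""
    n_black = round(black_fraction * N_PIXELS)
    pixels = [0.0] * n_black + [1.0] * (N_PIXELS - n_black)
    rng.shuffle(pixels)
    return pixels

def features(pixels):
    """Pre-processing: reduce an image to (black fraction, white fraction)."""
    black = sum(1 for p in pixels if p == 0.0) / len(pixels)
    return (black, 1.0 - black)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(params, f):
    w_black, w_white, bias = params
    return sigmoid(w_black * f[0] + w_white * f[1] + bias)

rng = random.Random(0)
labels = [float(rng.random() < 0.5) for _ in range(200)]   # 1.0 = mostly black
data = [features(make_image(0.6 if y else 0.4, rng)) for y in labels]

params = [0.0, 0.0, 0.0]   # two weights and a bias, starting at zero
rate = 1.0                 # learning rate picked for this toy problem
for epoch in range(50):
    for f, y in zip(data, labels):
        out = predict(params, f)
        # gradient of (out - y)**2 with respect to the pre-activation
        g = 2.0 * (out - y) * out * (1.0 - out)
        params[0] -= rate * g * f[0]
        params[1] -= rate * g * f[1]
        params[2] -= rate * g

correct = sum((predict(params, f) > 0.5) == (y == 1.0)
              for f, y in zip(data, labels))
accuracy = correct / len(data)
print('Accuracy: %s' % accuracy)
```

With only two inputs the network no longer has to discover that pixel positions are irrelevant, which is the point of doing the counting up front.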