python, tensorflow, conv-neural-network, tflearn

CNN output regression in tflearn


I'm working on a self-driving car. I want to predict the steering angle from pictures using a CNN in tflearn. The problem is that it only outputs 0.1. What do you think the problem is? The pictures are 128x128, but I have resized them to 28x28 so I can use the code from the MNIST example. The labels are steering angles between 0 and 180. The loss also does not get any smaller during training.

Training.py

import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
import tflearn.datasets.mnist as mnist
import numpy
from scipy import misc
import csv

nrOfFiles = 0
csv_list = []

with open('/Users/gustavoskarsson/Desktop/car/csvfile.csv', 'r') as f:
    reader = csv.reader(f)
    csv_list = list(reader)

nrOfFiles = len(csv_list)

pics = []
face = misc.face()
for i in range(0, nrOfFiles):
    face = misc.imread('/Users/gustavoskarsson/Desktop/car/pics/' + str(i) + '.jpg')
    face = misc.imresize(face[:,:,0], (28, 28))
    pics.append(face)

X = numpy.array(pics)


steer = []
throt = []
for i in range(0, nrOfFiles):
    steer.append(csv_list[i][1])
    throt.append(csv_list[i][2])

#y__ = numpy.array([steer, throt])
Y = numpy.array(steer)
Y = Y.reshape(-1, 1)
# Ignore the throttle to begin with.


convnet = input_data(shape=[None, 28, 28, 1], name='input')

convnet = conv_2d(convnet, 32, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

convnet = conv_2d(convnet, 64, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

convnet = fully_connected(convnet, 1024, activation='relu')
convnet = dropout(convnet, 0.8)

convnet = fully_connected(convnet, 1, activation='softmax')
convnet = regression(convnet, optimizer='adam', learning_rate=0.01, loss='mean_square', name='targets')

model = tflearn.DNN(convnet)
model.fit(X, Y, n_epoch=6, batch_size=10, show_metric=True)
model.save('mod.model')

Predict.py

import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
import tflearn.datasets.mnist as mnist
import numpy
from scipy import misc


convnet = input_data(shape=[None, 28, 28, 1], name='input')
                           #[none, 28,  28, 1]

convnet = conv_2d(convnet, 32, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

convnet = conv_2d(convnet, 64, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

convnet = fully_connected(convnet, 1024, activation='relu')
convnet = dropout(convnet, 0.8)

convnet = fully_connected(convnet, 1, activation='softmax')
convnet = regression(convnet, optimizer='adam', learning_rate=0.01, loss='mean_square', name='targets')

model = tflearn.DNN(convnet)
model.load('mod.model')

#load test image
face = misc.face()
pics = []
for i in range(0, 3):
    face = misc.imread('/Users/gustavoskarsson/Desktop/car/pics/' + str(i) + '.jpg')
    face = misc.imresize(face[:,:,0], (28, 28))
    pics.append(face) 

test_x = numpy.array(pics)
test_x = test_x.reshape([-1, 28, 28, 1])
print(model.predict([test_x[0]]))

Solution

  • The problem is most likely your output layer. It uses a softmax activation function, whose outputs always lie between 0 and 1, so the network can never produce a steering angle between 0 and 180.

    If you look at the definition of the softmax function, you will see that each output is normalized by a sum over every output node of the layer. Since you have only one output node, that sum is the node's own exponentiated value, so the layer always returns 1 regardless of the input. A constant output also passes no gradient back to the weights, which would explain why your loss does not decrease. If you want to learn more about softmax layers, check out Michael Nielsen's great free book on neural networks.

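    The softmax function is also a poor choice when you are not trying to classify things: it turns scores into class probabilities, whereas predicting a steering angle is a regression problem.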

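    Try omitting activation='softmax' from your last fully connected layer, in both Training.py and Predict.py, so that the layer falls back to tflearn's default linear activation. Then retrain the model.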
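
To make the point about a single output node concrete, here is a minimal sketch in plain Python (no tflearn needed). The `softmax` helper and the `raw_outputs` values are made up for illustration and are not taken from the question's model. It compares what a one-node softmax layer and a one-node linear layer would emit for the same raw values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw values of a single output node before the activation.
raw_outputs = [-50.0, 0.0, 3.7, 90.0, 180.0]

# One output node: softmax divides exp(x) by itself, so the result is constant.
softmax_outputs = [softmax([x])[0] for x in raw_outputs]

# Linear (identity) activation: the raw value passes through unchanged,
# so the layer can cover the whole 0-180 range of steering angles.
linear_outputs = [x for x in raw_outputs]

for raw, soft, lin in zip(raw_outputs, softmax_outputs, linear_outputs):
    print(f"raw={raw:7.1f}  softmax={soft:.1f}  linear={lin:7.1f}")
```

Every softmax value is 1.0 whatever the raw value is, while the linear column simply repeats the raw value. That is why mean-square loss against angles between 0 and 180 cannot improve with a one-node softmax output.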