Tags: python, numpy, tensorflow, tflearn

Machine learning: converting images to grayscale with tflearn/TensorFlow


I am currently trying to develop a CNN with tflearn to detect objects. My data comes from a pickle file, so I do not have any .png files or similar. My images are stored as a numpy.array with the shape:

 (34799, 32, 32, 3)

34799 is the number of images, so each individual image has the shape (32, 32, 3).

My CNN is defined as follows:

    import tflearn
    from tflearn.layers.core import input_data, fully_connected, flatten, dropout
    from tflearn.layers.conv import conv_2d, max_pool_2d
    from tflearn.layers.estimator import regression
    from tflearn.metrics import Accuracy

    # Building convolutional network
    def neural_network(X, y, dropoutRate=0.8):
        network = input_data(shape=[None, 32, 32, 3], name='input')

        network = conv_2d(network, nb_filter=6, filter_size=5, strides=1, activation='relu', padding="VALID")

        network = conv_2d(network, 6, 4, activation='relu')
        network = max_pool_2d(network, 2)

        network = conv_2d(network, 16, 5, strides=1, activation="relu", padding="VALID")
        network = max_pool_2d(network, 2, padding="VALID")

        # Flatten the feature maps before the dense layers
        network = flatten(network)
        network = dropout(incoming=network, keep_prob=dropoutRate)
        network = fully_connected(network, 84, activation="relu")
        network = fully_connected(network, 43, activation='softmax')

        acc = Accuracy()
        network = regression(network, optimizer='adam', learning_rate=0.001,
                             loss='categorical_crossentropy', metric=acc, name='target')
        # Training on the data passed in as X, y;
        # X_valid and y_valid are assumed to be defined at module level
        model = tflearn.DNN(network, tensorboard_verbose=0)
        model.fit(X, y, n_epoch=7, batch_size=20, show_metric=True, snapshot_epoch=True, run_id="trafficSign", snapshot_step=500, validation_set=(X_valid, y_valid))
        return model
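
To see how the layers above shrink the 32x32 input, the spatial sizes can be traced by hand. The following is only a sketch: the helper `conv_out` is hypothetical, and it assumes that tflearn's `conv_2d` and `max_pool_2d` default to "same" padding and that the pooling stride defaults to the kernel size.

```python
import math

def conv_out(size, kernel, stride=1, padding="same"):
    """Output size along one spatial dimension of a conv or pooling layer."""
    if padding.lower() == "same":
        # "same" padding: output size depends only on the stride
        return math.ceil(size / stride)
    # "valid" padding: the kernel must fit entirely inside the input
    return math.ceil((size - kernel + 1) / stride)

size = 32
size = conv_out(size, 5, 1, "valid")  # conv_2d, 5x5, VALID       -> 28
size = conv_out(size, 4, 1, "same")   # conv_2d, 4x4, assumed SAME -> 28
size = conv_out(size, 2, 2, "same")   # max_pool_2d, 2x2           -> 14
size = conv_out(size, 5, 1, "valid")  # conv_2d, 5x5, VALID       -> 10
size = conv_out(size, 2, 2, "valid")  # max_pool_2d, 2x2, VALID    -> 5

# 16 filters in the last conv layer, so the flattened vector has 5 * 5 * 16 entries
flat_features = size * size * 16
print(size, flat_features)
```

None of these sizes depend on the number of input channels, so switching from RGB to grayscale should only require changing the last entry of the `input_data` shape.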

My problem occurs when I convert the images to grayscale with the built-in TensorFlow function:

    tf.image.rgb_to_grayscale(X_train)

This is the tensor the function returns:

    <tf.Tensor 'rgb_to_grayscale_6:0' shape=(34799, 32, 32, 1) dtype=float64>

But when I change the first part of my CNN, the input_data() call, to the shape [32,32,1], I get an error saying that the shape is wrong and that it cannot be filled because the data has the shape [32,32].

So my question is: is there an easy way to append the trailing 1 to my shape?

Thanks for all your help, and please tell me if you need any more information.


Solution

  • 1st solution: You can do the conversion inside the network:

    network = input_data(shape=[None, 32, 32, 3], name='input')
    network = tf.image.rgb_to_grayscale(network)
    network = conv_2d(network, nb_filter=6, filter_size=5, strides=1, activation='relu', padding="VALID")
    ...
    

  • 2nd solution: Alternatively, you can avoid the extra cost of converting the data in every epoch by converting it once, before training.

    Use PIL/OpenCV to convert your RGB images to gray:

      # now X_TRAIN has the shape (34799, 32, 32)
      # convert the input to 4D
      X_TRAIN = np.expand_dims(X_TRAIN, 3)

    Then use a slightly modified version of the first code:

    network = input_data(shape=[None, 32, 32, 1], name='input')
    network = conv_2d(network, nb_filter=6, filter_size=5, strides=1, activation='relu', padding="VALID")
    ...
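
As a rough illustration of the second solution, the conversion and the extra axis can also be done with NumPy alone. This sketch swaps PIL/OpenCV for a plain weighted sum over the channels using the widely used BT.601 luma weights, and it uses a small random array as a stand-in for the pickled data; both choices are assumptions, not part of the original answer.

```python
import numpy as np

# Hypothetical stand-in for the pickled data: 100 random RGB images, 32x32
X_train = np.random.randint(0, 256, size=(100, 32, 32, 3), dtype=np.uint8)

# BT.601 luma weights for the R, G and B channels (assumed choice of weights)
weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)

# Weighted sum over the channel axis: (100, 32, 32, 3) -> (100, 32, 32)
X_gray = X_train.astype(np.float32) @ weights

# Append the channel axis: (100, 32, 32) -> (100, 32, 32, 1)
X_gray = np.expand_dims(X_gray, axis=-1)

print(X_gray.shape)
```

`X_gray[..., np.newaxis]` or `X_gray.reshape(-1, 32, 32, 1)` would add the same trailing axis; the resulting array matches an `input_data` shape of `[None, 32, 32, 1]`.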