python-3.x, tensorflow, conv-neural-network, lstm, tflearn

CNN RNN integration for images


I'm trying to combine a CNN and an LSTM for MNIST images with the following code:

from __future__ import division, print_function, absolute_import
import tensorflow as tf
import tflearn
import numpy as np
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.estimator import regression

import tflearn.datasets.mnist as mnist
height = 128
width = 128
X, Y, testX, testY = mnist.load_data(one_hot=True)
X = X.reshape([-1, 28, 28, 1])
testX = testX.reshape([-1, 28, 28, 1])

# Building convolutional network
network = tflearn.input_data(shape=[None, 28, 28,1], name='input')
network = tflearn.conv_2d(network, 32, 3, activation='relu', regularizer="L2")
network = tflearn.max_pool_2d(network, 2)
network = tflearn.local_response_normalization(network)
network = tflearn.conv_2d(network, 64, 3, activation='relu', regularizer="L2")
network = tflearn.max_pool_2d(network, 2)
network = tflearn.local_response_normalization(network)
network = fully_connected(network, 128, activation='tanh')
network = dropout(network, 0.8)
network = fully_connected(network, 256, activation='tanh')
network = dropout(network, 0.8)
network = tflearn.reshape(network, [-1, 1, 28*28])
#lstm
network = tflearn.lstm(network, 128, return_seq=True)
network = tflearn.lstm(network, 128)
network = tflearn.fully_connected(network, 10, activation='softmax')
network = tflearn.regression(network, optimizer='adam',
                     loss='categorical_crossentropy', name='target')

#train
model = tflearn.DNN(network, tensorboard_verbose=0)
model.fit(X, Y, n_epoch=1, validation_set=0.1, show_metric=True, snapshot_step=100)

A CNN takes a 4D tensor and an LSTM a 3D one, hence I reshaped the network with: network = tflearn.reshape(network, [-1, 1, 28*28])

But when I run it, I get this error:

InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 16384 values, but the requested shape requires a multiple of 784 [[Node: Reshape/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](Dropout_1/cond/Merge, Reshape/Reshape/shape)]]

I'm not clear on why it needs a tensor with 16384 values, and even if I hard-code 128*128 it still doesn't work! I cannot proceed at all.


Solution

  • The error is in this line:

    network = tflearn.reshape(network, [-1, 1, 28*28])
    

    The previous fully_connected layer has n_units=256, so its output cannot be reshaped into chunks of 28*28 = 784 values. (With TFLearn's default batch size of 64, one batch of those features holds 64 * 256 = 16384 values, which is the number in the error message.) Change this line to:

    network = tflearn.reshape(network, [-1, 1, 256])
    

    Note that you're feeding the features produced by the CNN, not the raw MNIST images, into the LSTM.
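
    For reference, here is a minimal sketch of the corrected reshape-plus-LSTM portion; apart from that one reshape it is the question's own code, reusing its imports. The 256 features coming out of the FC/dropout stack are treated as a sequence of length 1 before the stacked LSTMs.

    network = fully_connected(network, 256, activation='tanh')
    network = dropout(network, 0.8)
    # dropout output has shape (batch, 256); reshape it to
    # (batch, timesteps=1, features=256) so the LSTM receives a 3D tensor
    network = tflearn.reshape(network, [-1, 1, 256])
    network = tflearn.lstm(network, 128, return_seq=True)
    network = tflearn.lstm(network, 128)
    network = tflearn.fully_connected(network, 10, activation='softmax')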