Tags: python, tensorflow, keras, lstm, multiclass-classification

LSTM input shape error: Input 0 is incompatible with layer sequential_1


I am new to machine learning and Keras. I was trying to create an LSTM model for my classification problem (I took a few samples from the internet and tried to modify them), but I received this error:

ValueError: Input 0 is incompatible with layer sequential_1: expected shape=(None, None, 30), found shape=[None, 3, 1]

This is what I need: I have a sequence like 1,2,3,4, where 1,2,3 are my X_train and 4 is the label (Y). So the timestep size is 3 and each timestep has only one feature.

I have 30 classes for my labels, so I expect the output to be one of these 30 classes. 64 is the number of memory units.

This is my code:

def get_lstm():
    model = Sequential()
    model.add(LSTM(64, input_shape=(3, 30), return_sequences=True))
    model.add(LSTM(64))
    model.add(Dropout(0.2))
    model.add(Dense(30, activation='softmax'))
    return model

X_train = user_data[:, 0:3]
X_train = np.asarray(X_train).astype(np.float32)  
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
Y_train = user_data[:, 3]    
Y_train = np.asarray(Y_train).astype(np.float32)
local_model = Mymodel.get_lstm()  
local_model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])
local_model.set_weights(global_weights)           
local_model.fit(X_train, Y_train, batch_size=32, epochs=1)

Please let me know if you need more information or if anything is unclear. I really need your help, thanks.


Solution

  • It is not clear why you set the input shape to (3, 30) for the first LSTM. As you mentioned:

    This is what I need: I have a sequence like 1,2,3,4, where 1,2,3 are my X_train and 4 is the label (Y). So the timestep size is 3 and each timestep has only one feature.

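    If you have 3 time steps and only a single feature, then each sequence should be defined with shape (3, 1). The input shape is (timesteps, features), so (3, 30) declares 30 features per time step. That is why the model expects shape=(None, None, 30) while your data has shape (None, 3, 1).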

    Also, the model always outputs a probability distribution of length 30, but each entry of your y_train is a single integer class id (one of 30 unique classes). categorical_crossentropy expects one-hot encoded labels, so with integer labels you need to use the loss sparse_categorical_crossentropy instead.

    import numpy as np
    from tensorflow.keras import layers, Model, utils

    # Dummy data and its shapes
    X = np.random.random((100, 3, 1))      # inputs: (100, 3, 1)
    y = np.random.randint(0, 30, (100,))   # integer labels 0..29 (upper bound is exclusive): (100,)
    
    #Design model
    inp = layers.Input((3,1))
    x = layers.LSTM(64, return_sequences=True)(inp)
    x = layers.LSTM(64)(x)
    x = layers.Dropout(0.2)(x)
    out = layers.Dense(30, activation='softmax')(x)
    model = Model(inp, out)
    
    #Compile and fit
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam", metrics=['accuracy'])
    model.fit(X, y, batch_size=32,epochs=3)
    
    Epoch 1/3
    4/4 [==============================] - 0s 4ms/step - loss: 3.4005 - accuracy: 0.0400
    Epoch 2/3
    4/4 [==============================] - 0s 5ms/step - loss: 3.3953 - accuracy: 0.0700
    Epoch 3/3
    4/4 [==============================] - 0s 8ms/step - loss: 3.3902 - accuracy: 0.0900
    
    utils.plot_model(model, show_layer_names=False, show_shapes=True)
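As a quick check of the shapes discussed above, here is a minimal NumPy-only sketch. The small `user_data` array is a hypothetical stand-in for the asker's data (its real contents are not shown in the question); the slicing and reshape follow the question's own code.

```python
import numpy as np

# Hypothetical stand-in for `user_data`: each row is a sequence of four
# class ids; the first three are the inputs and the last one is the label.
user_data = np.array([[1, 2, 3, 4],
                      [2, 3, 4, 5],
                      [3, 4, 5, 6]])

# Inputs: (samples, timesteps) -> (samples, timesteps, features)
X_train = user_data[:, 0:3].astype(np.float32)
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)

# Labels: one integer class id per sample
Y_train = user_data[:, 3].astype(np.int64)

print(X_train.shape)  # (3, 3, 1)
print(Y_train.shape)  # (3,)

# The per-sample shape is what the first layer's input shape has to match.
input_shape = X_train.shape[1:]
print(input_shape)    # (3, 1)
```

The per-sample shape comes out as (3, 1), not (3, 30). The 30 belongs to the size of the output layer, not to the input.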
    

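To illustrate why integer labels go with sparse_categorical_crossentropy, the sketch below computes both losses by hand in NumPy. The probabilities are made-up values rather than real model output, and the formulas are the textbook cross-entropy definitions, not a copy of the Keras internals.

```python
import numpy as np

num_classes = 30

# Integer labels, one class id per sample (the form used in the question).
y_sparse = np.array([4, 0, 29])

# One-hot version of the same labels, shape (3, 30).
y_onehot = np.eye(num_classes)[y_sparse]

# Hypothetical softmax outputs: strictly positive rows normalised to sum to 1.
rng = np.random.default_rng(0)
probs = rng.random((3, num_classes)) + 0.1
probs /= probs.sum(axis=1, keepdims=True)

# Sparse form: index the predicted probability of the true class directly.
sparse_loss = -np.log(probs[np.arange(len(y_sparse)), y_sparse])

# Categorical form: sum over all classes, weighted by the one-hot target.
categorical_loss = -(y_onehot * np.log(probs)).sum(axis=1)

print(np.allclose(sparse_loss, categorical_loss))  # True
```

Both forms give the same per-sample loss; they differ only in the label format they accept. So either keep the integer labels and use sparse_categorical_crossentropy, or one-hot encode them (for example with tensorflow.keras.utils.to_categorical) and keep categorical_crossentropy.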