Tags: keras, output, conv-neural-network, shapes, timestep

Keras loaded model output is different from the training model output


When I train my model it has a two-dimensional output, (None, 1), corresponding to the time series I'm trying to predict. But whenever I load the saved model in order to make predictions, it has a three-dimensional output, (None, 40, 1), where 40 corresponds to the n_steps required to fit the Conv1D network. What is wrong?
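
A quick way to confirm what the saved model actually outputs is to inspect its output_shape right after loading (a minimal check, reusing the file name from the code below):

    from tensorflow.keras.models import load_model

    # Load the saved model and print its output shape directly;
    # if this prints (None, 40, 1), the saved graph itself ends in a
    # per-timestep output, so the mismatch is in the model definition,
    # not in the prediction code.
    model = load_model('ModeloConv1D.h5')
    print(model.output_shape)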

Here is the code:

    import numpy as np
    import matplotlib.pyplot as plt

    # autoencoder_conv1D and f.separar_interface are helpers defined
    # elsewhere in my code.
    df = np.load('Principal.npy')

    # Conv1D
    # model = load_model('ModeloConv1D.h5')
    model = autoencoder_conv1D((2, 20, 17), n_steps=40)
    model.load_weights('weights_35067.hdf5')

    # summarize model
    model.summary()

    # split into input (X) and output (Y) variables
    X = f.separar_interface(df, n_steps=40)
    # the raw series has shape (59891, 17): length and attributes, respectively

    # reshape to the Conv1D input format
    X = X.reshape(X.shape[0], 2, 20, X.shape[2])

    # make predictions
    test_predictions = model.predict(X)
    # test_predictions.shape = (59891, 40, 1)

    test_predictions = model.predict(X).flatten()
    # test_predictions.shape = (2395640,)

    plt.figure(3)
    plt.plot(test_predictions)
    plt.legend(['Prediction'])
    plt.show()
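
Note that .flatten() on a (59891, 40, 1) array returns a 1-D array of 59891 × 40 = 2,395,640 values, which is why the plot stacks every timestep of every window. A tiny standalone sketch of what is happening (NumPy only; the indexing at the end is just one possible reduction, not the real fix):

    import numpy as np

    # Same shape as the model output: one 40-step window per sample.
    preds = np.zeros((59891, 40, 1))

    print(preds.flatten().shape)   # (2395640,) - every timestep of every window
    print(preds[:, -1, 0].shape)   # (59891,)   - e.g. keep only the last step per window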

In the resulting plot you can see that the output is plotted in the input format (one value per window timestep) rather than as a single predicted series.

Here is the network architecture:

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    time_distributed_70 (TimeDis (None, 1, 31, 24)         4104      
    _________________________________________________________________
    time_distributed_71 (TimeDis (None, 1, 4, 24)          0         
    _________________________________________________________________
    time_distributed_72 (TimeDis (None, 1, 4, 48)          9264      
    _________________________________________________________________
    time_distributed_73 (TimeDis (None, 1, 1, 48)          0         
    _________________________________________________________________
    time_distributed_74 (TimeDis (None, 1, 1, 64)          12352     
    _________________________________________________________________
    time_distributed_75 (TimeDis (None, 1, 1, 64)          0         
    _________________________________________________________________
    time_distributed_76 (TimeDis (None, 1, 64)             0         
    _________________________________________________________________
    lstm_17 (LSTM)               (None, 100)               66000     
    _________________________________________________________________
    repeat_vector_9 (RepeatVecto (None, 40, 100)           0         
    _________________________________________________________________
    lstm_18 (LSTM)               (None, 40, 100)           80400     
    _________________________________________________________________
    time_distributed_77 (TimeDis (None, 40, 1024)          103424    
    _________________________________________________________________
    dropout_9 (Dropout)          (None, 40, 1024)          0         
    _________________________________________________________________
    dense_18 (Dense)             (None, 40, 1)             1025      
    =================================================================
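
Reading the summary, it is the decoder tail that fixes the output at (None, 40, 1): RepeatVector(40) stretches the encoder state to 40 timesteps, and every layer after it keeps that time axis, so the final Dense(1) is applied once per timestep. Here is a hedged reconstruction of just that tail (layer sizes are read off the table above and match its parameter counts; the actual code in autoencoder_conv1D may differ, and the dropout rate is a guess):

    from tensorflow.keras import layers, models

    state = layers.Input(shape=(100,))                 # encoder output, as in lstm_17
    x = layers.RepeatVector(40)(state)                 # (None, 40, 100)
    x = layers.LSTM(100, return_sequences=True)(x)     # (None, 40, 100), 80400 params
    x = layers.TimeDistributed(layers.Dense(1024))(x)  # (None, 40, 1024), 103424 params
    x = layers.Dropout(0.5)(x)                         # rate not shown in the summary
    out = layers.Dense(1)(x)                           # (None, 40, 1): one value per timestep

    models.Model(state, out).summary()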

Solution

  • As I've found my mistake, and as I think it may be useful for someone else, I'll answer my own question: the network output has the same shape as the training dataset labels. The saved model generates an output of shape (None, 40, 1) because that is exactly the shape I gave the training labels.

    You only seem to see a different output during training because you are most probably using a method such as train_test_split, which shuffles the samples; what you inspect at the end of training is therefore the result of a shuffled batch, not a differently shaped one.

    In order to correct the problem, change the shape of the dataset labels from (None, 40, 1) to (None, 1), since this is a regression problem over a time series. To make the network above produce that shape, add a Flatten layer before the final Dense output layer; with that change it yields the result you are looking for, as sketched below.
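
A minimal sketch of that corrected tail (hedged: it mirrors the reconstruction of the summary above, and the label variable y is a stand-in for your own training labels):

    from tensorflow.keras import layers, models

    state = layers.Input(shape=(100,))                 # encoder output
    x = layers.RepeatVector(40)(state)                 # (None, 40, 100)
    x = layers.LSTM(100, return_sequences=True)(x)     # (None, 40, 100)
    x = layers.TimeDistributed(layers.Dense(1024))(x)  # (None, 40, 1024)
    x = layers.Dropout(0.5)(x)
    x = layers.Flatten()(x)                            # (None, 40960): time axis collapsed
    out = layers.Dense(1)(x)                           # (None, 1): one value per window

    model = models.Model(state, out)

    # The labels must match the new output shape:
    # y = y.reshape(-1, 1)                             # (None, 1) instead of (None, 40, 1)

With the Flatten in place, the model's output and the reshaped labels agree, so training and prediction both produce one value per input window.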