tensorflow keras lstm sequential

Keras maximize cross-sequence correlation


I am using Keras with the TensorFlow backend to do sequence prediction on multiple sequences simultaneously, one step at a time. That is, I have n sequences, each with m timesteps, and I load them all into Keras with one model.predict() command; the model returns an n-element array, with each element corresponding to the predicted next step for one of the n sequences.

My goal is to maximize Keras's ability to detect cross-correlations between the sequences. For example, say sequence a strongly correlates with sequence b, with some time delay and a scaling-factor difference; in theory, Keras should then be able to use sequence b to better predict the next step of sequence a. Right now I am testing with 4 sequences, each 30 elements long, and sending them to my model as exactly that: input data of shape (4, 30, 1) and output data of shape (4, 1):

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Activation

def build_model():
    model = Sequential()

    # First LSTM layer: 100 units over 30 timesteps of 1 feature each
    model.add(LSTM(
        100,
        input_shape=(30, 1),
        return_sequences=True))
    model.add(Dropout(0.2))

    # Second LSTM layer: 200 units, still returning the full sequence
    model.add(LSTM(
        200,
        return_sequences=True))
    model.add(Dropout(0.2))

    # Final LSTM layer collapses the sequence to one output value
    model.add(LSTM(
        1,
        return_sequences=False))
    model.add(Activation("linear"))

    model.compile(loss="mse", optimizer="rmsprop")
    return model
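
For concreteness, here is roughly how data with these shapes gets fed in; this is a minimal sketch, and x_train/y_train are hypothetical stand-ins for my real data:

import numpy as np

# Hypothetical data: 4 sequences, 30 timesteps each, 1 feature per timestep
x_train = np.random.rand(4, 30, 1)   # shape (n_sequences, timesteps, features)
y_train = np.random.rand(4, 1)       # one next-step target per sequence

model = build_model()
model.fit(x_train, y_train, epochs=10, batch_size=4, verbose=0)

# A single predict() call returns a (4, 1) array: the predicted
# next step for each of the 4 sequences
predictions = model.predict(x_train)
print(predictions.shape)  # (4, 1)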

What I'm wondering is whether Keras is already cross-correlating the sequences, or whether my model currently views them as independent and just predicts them all at the same time (i.e., the same as predicting the next step for sequence a, then b, then c, etc.). Would Keras be able to better cross-correlate the sequences if I treated them as one sequence with n features per timestep (that is, input data of shape (1, 30, 4) and output data of shape (1, 4))? And if so, would I still be able to output an n-element array with each element still corresponding to one of the input sequences (i.e., would they maintain distinguishability)?

Thanks!


Solution

  • I went ahead and tested both ways; to answer my own question:

    • Yes, Keras was originally treating the sequences individually, not cross-correlating them at all.
    • Yes, by inputting the data as a single sequence with multiple features, it was able to cross-correlate them better (in this case that was the only scenario that offered any cross-correlation); see the sketch after this list.
    • Yes, Keras was able to maintain distinguishability between the features.
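
    As a minimal sketch of what the multi-feature formulation looks like (layer sizes copied from the question's model; only the input shape and the final layer width change, so that one output is produced per sequence):

    from keras.models import Sequential
    from keras.layers import LSTM, Dropout, Activation

    def build_multifeature_model():
        model = Sequential()

        # Each of the 30 timesteps now carries 4 features, so all 4
        # sequences pass through the shared LSTM weights together
        model.add(LSTM(
            100,
            input_shape=(30, 4),
            return_sequences=True))
        model.add(Dropout(0.2))

        model.add(LSTM(
            200,
            return_sequences=True))
        model.add(Dropout(0.2))

        # 4 output units: one predicted next step per original sequence,
        # which is what preserves distinguishability
        model.add(LSTM(
            4,
            return_sequences=False))
        model.add(Activation("linear"))

        model.compile(loss="mse", optimizer="rmsprop")
        return model

    # Training data now has shape (1, 30, 4) and targets shape (1, 4):
    # one "sample" whose four features are the four sequences.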

    To determine this I ran multiple trials with each input format, and originally I calculated a chi^2 fit of the prediction vs. the actual test data to see whether inputting sequences A-D improved the prediction of A over just inputting A and B. While doing this, however, I quickly noticed that in my original format, inputs B-D had no effect on the prediction of A: using a pre-trained model for 4 inputs, switching sequence C with D (which, since distinguishability is maintained and required, should have significantly changed the output) caused absolutely no change in the prediction.

    When doing the same input switch with the new 4-feature model, there was a clear dependence of the prediction for A on the other inputs B-D. Using the multiple-features model rather than the multiple-inputs model also produced a far smoother output, which is better for my application and makes sense, since the prediction now depends on more variables that presumably act to control outliers.
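
    A rough sketch of that swap test for the multi-feature model (x_test and model are hypothetical stand-ins for my held-out data and trained model):

    import numpy as np

    # x_test has shape (1, 30, 4); the features are sequences A-D in order
    x_swapped = x_test.copy()
    x_swapped[:, :, [2, 3]] = x_test[:, :, [3, 2]]  # switch sequences C and D

    pred_original = model.predict(x_test)[:, 0]    # prediction for A, original input
    pred_swapped = model.predict(x_swapped)[:, 0]  # prediction for A, C/D switched

    # With the original (4, 30, 1) format this difference was exactly zero;
    # with the (1, 30, 4) format it is clearly nonzero
    print(np.abs(pred_original - pred_swapped))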