I'm currently working on time series forecasting with TensorFlow and Keras. I built a CNN which performs quite well, and a basic LSTM which also shows quite good results. Now I'm thinking about combining the strengths of both networks. My first idea was simply to stack the LSTM on top of the CNN, but apart from the weak results, I realized that I want both networks to see the input data independently, so the CNN can learn about features while the LSTM focuses on the time-related aspects. What would be a good starting point for building this kind of architecture?

I was also wondering whether it makes any sense to concatenate the outputs of both networks. I see this often, but I don't understand why it would be useful; I keep thinking of concatenating two different time series, which would not make sense at all. I already visited posts which seemed related to my question, but they were not what I was looking for.
I'm attaching a simple example model using two branches (CNN and LSTM):
import tensorflow as tf

class CNNLSTMTimeseries(tf.keras.Model):
    def __init__(self, n_classes):
        super(CNNLSTMTimeseries, self).__init__()
        # CNN branch: extracts local patterns along the time axis
        self.conv1 = tf.keras.layers.Conv1D(64, 7, padding='same',
                                            activation=None)
        self.bn1 = tf.keras.layers.BatchNormalization()
        self.conv2 = tf.keras.layers.Conv1D(64, 5, padding='same',
                                            activation=None)
        self.bn2 = tf.keras.layers.BatchNormalization()
        # LSTM branch: models temporal dependencies on the raw input
        self.lstm = tf.keras.layers.LSTM(64, return_sequences=True)
        self.classifier = tf.keras.layers.Dense(n_classes, activation='softmax')

    def call(self, x):
        # Both branches see the same raw input x
        conv_x = tf.nn.relu(self.bn1(self.conv1(x)))
        conv_x = tf.nn.relu(self.bn2(self.conv2(conv_x)))
        lstm_x = self.lstm(x)
        # Concatenate along the feature (channel) axis, not the time axis:
        # each timestep keeps both its CNN features and its LSTM features,
        # so this is not the same as joining two different time series
        x = tf.concat([conv_x, lstm_x], axis=-1)
        x = tf.reduce_mean(x, axis=1)  # Average over all timesteps
        return self.classifier(x)

TIMESTEPS = 16
FEATURES = 32

model = CNNLSTMTimeseries(3)
print(model(tf.random.uniform([1, TIMESTEPS, FEATURES])).shape)
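To train it, a minimal sketch could look like the following (the random tensors are just placeholders standing in for your real dataset):

# Minimal training sketch; X_train / y_train are dummy stand-ins
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
X_train = tf.random.uniform([8, TIMESTEPS, FEATURES])
y_train = tf.random.uniform([8], maxval=3, dtype=tf.int32)
model.fit(X_train, y_train, epochs=1, batch_size=4)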
The example is really simple and probably won't perform as well as a well-studied architecture. You should modify it and add max pooling, dropout, etc.; one possible variant is sketched below.
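For instance, here is a minimal sketch of the same two-branch idea with max pooling and dropout added (the pool size, dropout rate, and layer widths are arbitrary choices, not tuned values). Note that pooling shortens the CNN branch's time axis, so this version reduces each branch to a summary vector before concatenating:

class CNNLSTMTimeseriesV2(tf.keras.Model):
    def __init__(self, n_classes):
        super().__init__()
        # CNN branch with downsampling
        self.conv1 = tf.keras.layers.Conv1D(64, 7, padding='same',
                                            activation=None)
        self.bn1 = tf.keras.layers.BatchNormalization()
        self.pool1 = tf.keras.layers.MaxPooling1D(pool_size=2)
        self.conv2 = tf.keras.layers.Conv1D(64, 5, padding='same',
                                            activation=None)
        self.bn2 = tf.keras.layers.BatchNormalization()
        self.gap = tf.keras.layers.GlobalAveragePooling1D()
        # LSTM branch: only the last hidden state is kept
        self.lstm = tf.keras.layers.LSTM(64)
        self.dropout = tf.keras.layers.Dropout(0.3)
        self.classifier = tf.keras.layers.Dense(n_classes, activation='softmax')

    def call(self, x, training=False):
        conv_x = tf.nn.relu(self.bn1(self.conv1(x), training=training))
        conv_x = self.pool1(conv_x)
        conv_x = tf.nn.relu(self.bn2(self.conv2(conv_x), training=training))
        conv_x = self.gap(conv_x)  # [batch, 64]
        lstm_x = self.lstm(x)      # [batch, 64]
        # Pooling changed the CNN branch's timestep count, so concatenate
        # the per-branch summary vectors instead of per-timestep features
        x = tf.concat([conv_x, lstm_x], axis=-1)
        x = self.dropout(x, training=training)
        return self.classifier(x)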