I have 10-day sequences of sensor events, each with a true/false label specifying whether the sensor triggered an alert within those 10 days:
sensor_id | timestamp | feature_1 | feature_2 | 10_days_alert_label |
1 | 2020-12-20 01:00:34.565 | 0.23 | 0.1 | 1 |
1 | 2020-12-20 01:03:13.897 | 0.3 | 0.12 | 1 |
2 | 2020-12-20 01:00:34.565 | 0.13 | 0.4 | 0 |
2 | 2020-12-20 01:03:13.897 | 0.2 | 0.9 | 0 |
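A table like this has to be grouped per sensor into the `[samples, timesteps, features]` layout that an LSTM expects. Below is a minimal sketch of one way to do that, assuming the rows are available as plain tuples and that zero-padding at the end is acceptable for sensors with fewer events. The helper name `to_sequences` and the `n_steps` value are hypothetical and not part of the original data pipeline.

```python
import numpy as np

def to_sequences(rows, n_steps, n_features):
    """Group (sensor_id, timestamp, *features, label) rows into padded arrays."""
    by_sensor = {}
    for sensor_id, ts, *rest in rows:
        feats, label = rest[:-1], rest[-1]
        by_sensor.setdefault(sensor_id, []).append((ts, feats, label))

    ids = sorted(by_sensor)
    X = np.zeros((len(ids), n_steps, n_features), dtype="float32")
    y = np.zeros(len(ids), dtype="int64")
    for i, sid in enumerate(ids):
        # ISO-style timestamps in a single format sort correctly as strings;
        # keep only the n_steps most recent events (an assumption)
        events = sorted(by_sensor[sid], key=lambda e: e[0])[-n_steps:]
        for t, (_, feats, _) in enumerate(events):
            X[i, t] = feats
        y[i] = events[0][2]  # the label is constant per sensor
    return ids, X, y

rows = [
    (1, "2020-12-20 01:00:34.565", 0.23, 0.1, 1),
    (1, "2020-12-20 01:03:13.897", 0.3, 0.12, 1),
    (2, "2020-12-20 01:00:34.565", 0.13, 0.4, 0),
    (2, "2020-12-20 01:03:13.897", 0.2, 0.9, 0),
]
ids, X, y = to_sequences(rows, n_steps=3, n_features=2)
print(X.shape)  # (2, 3, 2)
```

The label array `y` is not needed to train an autoencoder, but it is useful afterwards for checking whether high reconstruction error lines up with alerts.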
95% of the sensors do not trigger an alert, so the data is imbalanced. I was thinking of using an autoencoder to detect the anomalies (sensors that triggered an alert). Since I'm not interested in decoding the entire sequence, only the context vector learned by the LSTM, I was thinking of something like the figure below, where the decoder reconstructs the encoder output:
I've googled around and found this simple LSTM autoencoder example:
# lstm autoencoder recreate sequence
from numpy import array
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import RepeatVector
from tensorflow.keras.layers import TimeDistributed
from tensorflow.keras.utils import plot_model
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
# define model
model = Sequential()
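model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))  # repeat the context vector once per output timestep
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))  # one reconstructed value per timestep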
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
plot_model(model, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = model.predict(sequence, verbose=0)
I would like to modify the above example so the first LSTM output is used as the decoder target. Something like:
# lstm autoencoder recreate sequence
from numpy import array
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import RepeatVector
from tensorflow.keras.layers import TimeDistributed
from tensorflow.keras.utils import plot_model
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(Dense(100, activation='relu')) # First LSTM output
model.add(Dense(32, activation='relu')) # Bottleneck
model.add(Dense(100, activation='sigmoid')) # Decoded vector
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, FIRST_LSTM_OUTPUT, epochs=300, verbose=0) # <--- ???
Q: Can I use the first LSTM output vector as a target?
You can do it using model.add_loss. In add_loss we specify the loss of interest (in our case, mse) and the tensors used to compute it (in our case, the LSTM output and the model predictions).
Below is a dummy example:
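import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

n_sample, timesteps = 100, 9
X = np.random.uniform(0, 1, (n_sample, timesteps, 1))

def mse(enc_output, pred):
    # mean squared error between the encoder output and the decoded vector
    return tf.reduce_mean(tf.square(enc_output - pred))

inp = Input((timesteps, 1))
enc = LSTM(100, activation='relu')(inp)  # first LSTM output, used as the target
x = Dense(100, activation='relu')(enc)
x = Dense(32, activation='relu')(x)  # bottleneck
out = Dense(100, activation='sigmoid')(x)  # decoded vector
model = Model(inp, out)
model.add_loss(mse(enc, out))  # loss is computed inside the graph, so no external target
model.compile(optimizer='adam', loss=None)
model.fit(X, y=None, epochs=3)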
Here is the running code.
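For the anomaly-detection goal in the question, one plausible next step is to score each sequence by the per-sample error between the encoder output and the decoded vector, then flag the highest scores. The sketch below is only an illustration of that idea under stated assumptions: it uses random stand-in arrays instead of real model outputs, the helper `anomaly_scores` is hypothetical, and the 95th-percentile cut-off is an assumption taken from the roughly 5% alert rate mentioned in the question.

```python
import numpy as np

def anomaly_scores(enc_vec, dec_vec):
    """Per-sample MSE between the encoder output and the decoded vector."""
    return np.mean(np.square(enc_vec - dec_vec), axis=1)

rng = np.random.default_rng(0)

# Stand-ins for real outputs. With the model above they could be obtained as
#   enc_vec = Model(inp, enc).predict(X)
#   dec_vec = model.predict(X)
enc_vec = rng.uniform(0, 1, size=(100, 100))
dec_vec = enc_vec + rng.normal(0, 0.01, size=(100, 100))
dec_vec[:5] += 0.5  # pretend the first five samples are poorly reconstructed

scores = anomaly_scores(enc_vec, dec_vec)
threshold = np.percentile(scores, 95)      # assumed cut-off, tune on validation data
flagged = np.where(scores > threshold)[0]  # indices of suspected anomalies
print(flagged)
```

Since the target here is itself produced by a trainable layer, it is worth comparing the flagged indices against the known alert labels before relying on these scores.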