I have a simple LSTM model built with the Keras Functional API that takes x_train and predicts the associated y_train. I've written a custom loss function that uses the y_true and y_pred values supplied by the model, but I want to extend it with an additional ground truth (essentially y_train_v2) that is used only to calculate the loss, i.e. it should not feed into the LSTM itself.
I've produced code that passes the model's input into the loss function (although it arrived with the wrong shape), but I can't work out how to bring in this additional data without making it part of the rest of the model, or how to ensure that y_train_v2 matches the original y_train in shape and batch once inside the loss function (essentially giving me a y_true and a y_true_v2). Below is my current code, which uses a wrapper to pass the LSTM input into the loss function. I'd like to change it so that an additional y_true value from y_train_v2 is available in the loss function only.
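from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, LSTM, Dense

def custom_loss_wrapper(i):
    # i (the model input) is captured by the closure but not used yet
    def custom_loss(y_true, y_pred):
        # plain RMSE between prediction and target
        return K.sqrt(K.mean(K.square(y_pred - y_true)))
    return custom_loss

def baseline_model():
    i = Input(shape=(14, 1))
    x = LSTM(64, return_sequences=False)(i)
    o = Dense(1)(x)
    model = keras.Model(i, o)
    print(i.shape)
    model.compile(optimizer='adam', loss=custom_loss_wrapper(i))
    return model

baseline_model().fit(x_train, y_train, epochs=10, batch_size=4, validation_data=(x_test, y_test))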
I've read several questions on similar topics (such as passing the LSTM's input data into the loss function as well), but I've struggled to apply them to my specific problem, particularly because I'm unsure how to keep y_true and y_true_2 consistent with each other in terms of batching, etc.
Apologies in advance for any errors or repetition in the above query and I'd be happy to elaborate further if there are any queries.
I've tried to bring in y_train_v2 as another input to the model, but my concern is how to ensure that it isn't used in the LSTM training while still being processed in the same batches as y_train.
You could just concatenate it to the y labels, like this:
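import tensorflow as tf
from tensorflow.keras import backend as K

# random dummy data
x_train = tf.random.uniform((100, 14, 1))
y_train_v1 = tf.random.uniform((100, 1))
y_train_v2 = tf.random.uniform((100, 1))
# column 0: original target, column 1: additional ground truth
y_train = tf.concat((y_train_v1, y_train_v2), axis=-1)

def custom_loss_wrapper(i):
    def custom_loss(y_true, y_pred):
        # slice both columns under new names so y_true is not overwritten
        y_true_v1 = y_true[:, :1]
        y_true_v2 = y_true[:, 1:]  # do here what you want with it
        return K.sqrt(K.mean(K.square(y_pred - y_true_v1)))
    return custom_loss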
Here, y_train consists of your ground truth together with the additional information. Beware that if you want to use other metrics, you need an additional wrapper for them that deals with the two pieces of information in y_true.
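As a minimal sketch of such a metric wrapper, assuming the combined label layout from the code above (column 0 holds the original target, column 1 the additional ground truth), something like the following could work. The name rmse_first_column is hypothetical, and plain tf ops are used here instead of K purely for illustration.

```python
import tensorflow as tf

def rmse_first_column(y_true, y_pred):
    """RMSE against the original target only (hypothetical helper name).

    Assumes y_true has shape (batch, 2): column 0 is the original target,
    column 1 is the additional ground truth, which this metric ignores.
    """
    y_true_v1 = y_true[:, :1]
    return tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true_v1)))

# Tiny check with hand-made values: the per-sample errors are 0 and 2.
y_true = tf.constant([[1.0, 10.0], [2.0, 20.0]])
y_pred = tf.constant([[1.0], [4.0]])
metric_value = float(rmse_first_column(y_true, y_pred))
```

It could then be passed alongside the loss, for example model.compile(optimizer='adam', loss=custom_loss_wrapper(i), metrics=[rmse_first_column]). Because both columns travel in a single label tensor, they are shuffled and split into batches together, so each row of y_true_v1 and y_true_v2 always belongs to the same sample.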
An alternative would be to give your network another input and another output that simply passes this information through. You can then specify different metrics for different outputs, but this is a bit more complicated to set up.
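One possible reading of this alternative, offered as a sketch under assumptions rather than the exact design meant above: feed y_train_v2 in through a second input and concatenate it, untouched, onto the prediction, so that the loss receives both in y_pred. The names build_passthrough_model, loss_with_passthrough, seq_in and v2_in are hypothetical, and the way the two error terms are combined is only a placeholder.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, LSTM, Dense, Concatenate

def loss_with_passthrough(y_true, y_pred_and_v2):
    # Column 0 is the model's prediction; column 1 is y_train_v2,
    # which reached the output without passing through any weights.
    y_pred = y_pred_and_v2[:, :1]
    y_true_v2 = y_pred_and_v2[:, 1:]
    rmse_v1 = tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true)))
    rmse_v2 = tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true_v2)))
    return rmse_v1 + rmse_v2  # placeholder: combine however you need

def build_passthrough_model():
    seq_in = Input(shape=(14, 1))  # same sequence input as the question
    v2_in = Input(shape=(1,))      # carries y_train_v2, never fed to the LSTM
    x = LSTM(64)(seq_in)
    pred = Dense(1)(x)
    out = Concatenate(axis=-1)([pred, v2_in])
    model = keras.Model([seq_in, v2_in], out)
    model.compile(optimizer="adam", loss=loss_with_passthrough)
    return model

# Assumed usage: model.fit([x_train, y_train_v2], y_train_v1, batch_size=4)
```

Since v2_in only goes through the Concatenate layer, which has no trainable weights, it can influence training only via the loss. The trade-off is that the model now expects the second input whenever it is called, so a separate keras.Model built from seq_in to pred would likely be needed for prediction.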