deep-learning loss-function tf.keras

Why is my model's loss always hovering around 1 in every epoch?


During training, my model's loss hovers around 1 and does not converge. I have tried various optimizers, but the loss still shows the same pattern. I am using Keras with the TensorFlow backend. What could be the possible reasons? Any help or reference link would be appreciated.

(Screenshot of the training loss hovering around 1 across epochs.)

Here is my model:

from keras import backend as K
from keras.applications.vgg19 import VGG19
from keras.layers import Dense, Dropout, GlobalAveragePooling2D, Lambda, concatenate
from keras.models import Model

def model_vgg19():
  # VGG19 backbone without its classifier head, on 128x128 RGB inputs
  vgg_model = VGG19(weights="imagenet", include_top=False, input_shape=(128,128,3))

  # freeze the first 10 layers; only the deeper layers are fine-tuned
  for layer in vgg_model.layers[:10]:
    layer.trainable = False

  # pool intermediate feature maps and concatenate them with the final output
  # (get_layers_output_by_name is not shown in the question; see the sketch below)
  intermediate_layer_outputs = get_layers_output_by_name(vgg_model, ["block1_pool", "block2_pool", "block3_pool", "block4_pool"])
  convnet_output = GlobalAveragePooling2D()(vgg_model.output)
  for layer_name, output in intermediate_layer_outputs.items():
    output = GlobalAveragePooling2D()(output)
    convnet_output = concatenate([convnet_output, output])

  convnet_output = Dense(2048, activation='relu')(convnet_output)
  convnet_output = Dropout(0.6)(convnet_output)
  convnet_output = Dense(2048, activation='relu')(convnet_output)
  # L2-normalize the embedding vector
  convnet_output = Lambda(lambda x: K.l2_normalize(x, axis=1))(convnet_output)

  final_model = Model(inputs=[vgg_model.input], outputs=convnet_output)

  return final_model


model = model_vgg19()
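The helper get_layers_output_by_name is not defined in the question; a minimal sketch of what it presumably does, collecting the named layers' output tensors into a dict (as the loop over .items() above expects):

def get_layers_output_by_name(model, layer_names):
    # map each requested layer name to that layer's output tensor
    return {name: model.get_layer(name).output for name in layer_names}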

Here is my loss function:

import tensorflow as tf
from keras import backend as K

_EPSILON = K.epsilon()  # placeholder; _EPSILON is defined elsewhere in the original code
batch_size = 30         # placeholder; must be a multiple of 3 and match the training batch size

def hinge_loss(y_true, y_pred):
    y_pred = K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    loss = tf.convert_to_tensor(0, dtype=tf.float32)
    g = tf.constant(1.0, shape=[1], dtype=tf.float32)  # margin of 1

    # each batch is laid out as consecutive (query, positive, negative) triplets
    for i in range(0, batch_size, 3):
        try:
            q_embedding = y_pred[i + 0]
            p_embedding = y_pred[i + 1]
            n_embedding = y_pred[i + 2]
            D_q_p = K.sqrt(K.sum((q_embedding - p_embedding) ** 2))  # query-positive distance
            D_q_n = K.sqrt(K.sum((q_embedding - n_embedding) ** 2))  # query-negative distance
            loss = loss + g + D_q_p - D_q_n
        except:  # skip an incomplete trailing triplet
            continue
    loss = loss / (batch_size / 3)
    zero = tf.constant(0.0, shape=[1], dtype=tf.float32)
    return tf.maximum(loss, zero)
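A custom loss like this is passed to Keras through compile. A minimal sketch of the wiring; Adam is a placeholder choice, since the question does not say which optimizers were tried:

from keras.optimizers import Adam

model.compile(optimizer=Adam(lr=1e-4), loss=hinge_loss)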

Solution

  • What is definitely a problem is that you shuffle your data and then try to learn triplets out of it.

    As you can see in the Keras documentation (https://keras.io/models/model/), model.fit shuffles your data in each epoch, which breaks your triplet layout: the loss assumes every three consecutive samples form a (query, positive, negative) triplet. Try setting the shuffle parameter to False and see what happens; there may be other errors as well (see the sketch below).
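    A minimal sketch of the suggested fix, reusing the model and hinge_loss defined above; x_train, y_train, and epochs are placeholders:

    model.fit(x_train, y_train,
              batch_size=batch_size,  # must be a multiple of 3
              epochs=epochs,
              shuffle=False)  # keep each (query, positive, negative) triplet together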