Tags: python, tensorflow, tensorflow2.0, huggingface-transformers

Hugging Face Transformers with TensorFlow saves two files as model weights


This is how I build the model for a classification task:

    import tensorflow as tf
    from transformers import ElectraConfig, TFElectraForSequenceClassification

    def bert_for_classification(transformer_model_name, max_sequence_length, num_labels):
        config = ElectraConfig.from_pretrained(
            transformer_model_name,
            num_labels=num_labels,
            output_hidden_states=False,
            output_attentions=False
        )
        model = TFElectraForSequenceClassification.from_pretrained(transformer_model_name, config=config)
        # Input for the token ids (words from the dataset after encoding):
        input_ids = tf.keras.layers.Input(shape=(max_sequence_length,), dtype=tf.int32, name='input_ids')

        # attention_mask is a binary mask that tells the model which tokens to attend to.
        # The encoder pads sequences shorter than max_sequence_length with 0 tokens;
        # the mask marks which positions hold original tokens and which hold padding:
        attention_mask = tf.keras.layers.Input(shape=(max_sequence_length,), dtype=tf.int32, name='attention_mask')

        # Feed both inputs to the transformer and take its first output:
        output = model([input_ids, attention_mask])[0]
        output = tf.keras.layers.Dense(num_labels, activation='softmax')(output)
        model = tf.keras.models.Model(inputs=[input_ids, attention_mask], outputs=output)

        # Use tf.keras throughout; a bare `keras` name is never imported here.
        model.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
                      optimizer=tf.keras.optimizers.Adam(learning_rate=3e-05, epsilon=1e-08),
                      metrics=['accuracy'])

        return model

After training this model, I save it with model.save_weights('model.hd5'), but two files are written instead of one: model.hd5.index and model.hd5.data-00000-of-00001.

How should I load this model from disk?


Solution

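  • save_weights can write the weights in one of two formats: the Keras HDF5 format ("h5") or the TensorFlow checkpoint format ("tf"), which is the same variable format that SavedModel uses.

    You can choose the format explicitly by passing the save_format argument, set to either "h5" or "tf". If you omit it, the format is inferred from the file name: a name ending in .h5 is saved as HDF5, and anything else is saved as a TensorFlow checkpoint.

    Because you used the suffix .hd5 rather than .h5, the weights were saved as a TensorFlow checkpoint. The two files, model.hd5.index and model.hd5.data-00000-of-00001, together form a single checkpoint whose prefix is model.hd5.

    To load the weights, rebuild the model with the same architecture and pass load_weights the same name you passed to save_weights:

    model.save_weights("model.h5")   # HDF5 format, a single file
    model.load_weights("model.h5")
    # or
    model.save_weights("mymodel")    # TensorFlow checkpoint format (mymodel.index, mymodel.data-*)
    model.load_weights("mymodel")    # pass the prefix, not the .index or .data file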
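
The suffix rule and the two-file layout from the question can be sketched in plain Python, without importing TensorFlow. Both helpers below, infer_weights_format and checkpoint_prefixes, are hypothetical names written for illustration and are not part of Keras. The first only mirrors the simplified ".h5 or not" rule described above (the real library may recognize further HDF5 suffixes), and the second assumes that each saved name leaves one <name>.index file on disk.

```python
from pathlib import Path
import tempfile


def infer_weights_format(filepath, save_format=None):
    """Illustrative only: mimic the suffix rule described in the answer."""
    if save_format is not None:
        return save_format  # an explicit save_format always wins
    return "h5" if str(filepath).endswith(".h5") else "tf"


def checkpoint_prefixes(directory):
    """Return the names to pass to load_weights for the files in `directory`.

    Assumption: every saved name has exactly one `<name>.index` file next to
    its `<name>.data-XXXXX-of-YYYYY` shard(s), so stripping `.index` recovers it.
    """
    suffix = ".index"
    return sorted(str(p)[: -len(suffix)] for p in Path(directory).glob("*" + suffix))


if __name__ == "__main__":
    print(infer_weights_format("model.h5"))   # h5
    print(infer_weights_format("model.hd5"))  # tf
    with tempfile.TemporaryDirectory() as tmp:
        # Empty stand-ins for the two files from the question.
        (Path(tmp) / "model.hd5.index").touch()
        (Path(tmp) / "model.hd5.data-00000-of-00001").touch()
        print([Path(p).name for p in checkpoint_prefixes(tmp)])  # ['model.hd5']
```

For the files in the question, the recovered name is model.hd5, so model.load_weights('model.hd5') is the call to try after rebuilding the model with the same architecture.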