
AllenNLP: How do I load a pretrained ELMo model as the embedding of an AllenNLP model?


I am new to AllenNLP. I trained an ELMo model and tried to use it as the embedding in other AllenNLP models, but failed. It seems that my model is not compatible with the interface the config expects. What can I do?

I trained my ELMo model with AllenNLP using the command:

allennlp train config/elmo.jsonnet --serialization-dir /xxx

The elmo.jsonnet is almost the same as https://github.com/allenai/allennlp-models/blob/main/training_config/lm/bidirectional_language_model.jsonnet except for the dataset and vocabulary.

After that, I got an ELMo model consisting of:

config.json
weights.th
vocabulary/
vocabulary/.lock
vocabulary/non_padded_namespaces.txt
vocabulary/tokens.txt
meta.json

When I try to load the model into other models, such as bidaf-elmo in https://github.com/allenai/allennlp-models/blob/main/training_config/rc/bidaf_elmo.jsonnet, I find that it requires an options file and a weight file:

"elmo": {
    "type": "elmo_token_embedder",
    "do_layer_norm": false,
    "dropout": 0,
    "options_file": "xxx/options.json",
    "weight_file": "xxx/weights.hdf5"
}

Neither file is included in my model. I tried to convert model.state_dict() to weights.hdf5, but I received an error:

KeyError: "Unable to open object (object 'char_embed' doesn't exist)"

That object is required in:

File "/home/xxx/anaconda3/envs/thesis_torch1.8/lib/python3.8/site-packages/allennlp/modules/elmo.py", line 393, in _load_char_embedding
    char_embed_weights = fin["char_embed"][...]

It seems that the model I trained with AllenNLP is not compatible with this interface. How can I use my ELMo model as the embedding of other models?


Solution

  • You are right, those two formats don't align.

    I'm afraid there is no easy way out. I think you'll have to write a TokenEmbedder that can read and apply the output from bidirectional_language_model.jsonnet.

    If you do, we'd love to have it as a contribution to AllenNLP!
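
Before writing a custom TokenEmbedder, it can help to confirm which of the two formats a given directory actually holds. The following is a minimal sketch using only the Python standard library. The file names are the ones listed in the question; the helper describe_model_dir is a hypothetical name introduced here and is not part of AllenNLP. It only checks file names and does not inspect or convert any weights.

```python
from pathlib import Path

# File names taken from the question: what `allennlp train` wrote,
# and what the "elmo_token_embedder" config asks for.
TRAIN_OUTPUT = ("config.json", "weights.th")
ELMO_INPUT = ("options.json", "weights.hdf5")


def describe_model_dir(model_dir):
    """Hypothetical helper: report which expected files exist in model_dir."""
    root = Path(model_dir)
    # Collect the names of regular files directly inside the directory.
    present = {p.name for p in root.iterdir() if p.is_file()}
    return {
        # True when both files written by `allennlp train` are there.
        "looks_like_train_output": all(n in present for n in TRAIN_OUTPUT),
        # Files the ELMo embedder config points at that are absent.
        "missing_for_elmo_embedder": [n for n in ELMO_INPUT if n not in present],
    }
```

Calling describe_model_dir on the serialization directory from the question should report that it looks like training output and that both options.json and weights.hdf5 are missing, which matches the mismatch described above.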