python numpy tensorflow feature-extraction tensorflow-hub

Extracting ELMo features with TensorFlow and converting them to NumPy

I want to extract sentence embeddings using the ELMo model.

I tried this at first:

import tensorflow as tf
import tensorflow_hub as hub
import numpy as np

elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

x = ["Hi my friend"]

embeddings = elmo_model(x, signature="default", as_dict=True)["elmo"]


print(embeddings.shape)
print(embeddings.numpy())

It works until the last line, where I cannot convert the result to a NumPy array.

I searched a little and found that putting the following line at the beginning of my code should solve the problem:

tf.enable_eager_execution()

However, after putting this at the beginning of my code, I could no longer create the module:

elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

I received this error:

Exporting/importing meta graphs is not supported when eager execution is enabled. No graph exists when eager execution is enabled.

How can I solve this? My goal is to obtain sentence features and use them as a NumPy array.

Thanks in advance

Solution

TF 2.x

TF 2.x behaves more like ordinary Python because it executes eagerly by default. However, in TF 2.x you should load the model with hub.load instead of hub.Module.

import tensorflow as tf
import tensorflow_hub as hub

elmo = hub.load("https://tfhub.dev/google/elmo/2").signatures["default"]
x = ["Hi my friend"]
embeddings = elmo(tf.constant(x))["elmo"]

Then you can access the results and convert them to a NumPy array using the tensor's numpy() method.

>>> embeddings.numpy()
array([[[-0.7205108 , -0.27990735, -0.7735629 , ..., -0.24703965,
         -0.8358178 , -0.1974785 ],
        [ 0.18500198, -0.12270843, -0.35163105, ...,  0.14234722,
          0.08479916, -0.11709933],
        [-0.49985904, -0.88964033, -0.30124515, ...,  0.15846594,
          0.05210422,  0.25386307]]], dtype=float32)
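The "elmo" output holds one vector per token, so its shape is [batch_size, max_length, 1024]. Since the goal is sentence features, one common option is to average over the token axis. Below is a minimal sketch of that idea in plain NumPy. It uses a small random array as a stand-in for embeddings.numpy(), and the helper name mean_pool and the lengths list are my own illustrations, not part of the module. It assumes that positions past a sentence's length are padding, and that you know each sentence's token count (for the default signature, len(sentence.split()) should be a reasonable guess).

```python
import numpy as np

def mean_pool(token_embeddings, lengths):
    """Average token vectors per sentence, ignoring padded positions."""
    token_embeddings = np.asarray(token_embeddings, dtype=np.float32)
    lengths = np.asarray(lengths, dtype=np.float32)
    max_len = token_embeddings.shape[1]
    # mask[i, t] is 1.0 for real tokens of sentence i and 0.0 for padding
    mask = (np.arange(max_len)[None, :] < lengths[:, None]).astype(np.float32)
    summed = (token_embeddings * mask[:, :, None]).sum(axis=1)
    return summed / lengths[:, None]

# Stand-in for embeddings.numpy(): 2 sentences, up to 3 tokens, 4 dimensions
# (hypothetical data; the real ELMo output has 1024 dimensions).
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(2, 3, 4)).astype(np.float32)
lengths = [3, 2]  # the second sentence has one padded position

sentence_vectors = mean_pool(fake_embeddings, lengths)
print(sentence_vectors.shape)  # (2, 4)
```

If I recall the module documentation correctly, the output dictionary also has a "default" key that already contains a fixed mean-pooling of the token vectors, with shape [batch_size, 1024], so it may be worth checking that before pooling by hand.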

TF 1.x

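In TF 1.x, you should run the operation inside a tf.Session. TF 1.x does not execute eagerly by default: you first build the graph and then evaluate the results inside a session. This is also why calling tf.enable_eager_execution() fails here: hub.Module imports a graph, which, as the error message says, is not supported in eager mode.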

import tensorflow as tf
import tensorflow_hub as hub

elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
x = ["Hi my friend"]
embeddings_op = elmo_model(x, signature="default", as_dict=True)["elmo"]
# Op that initializes the module's variables (loads the weights into the graph)
init_op = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init_op)
    # sess.run returns the evaluated tensor as a NumPy array
    embeddings = sess.run(embeddings_op)

In that case, the result is already a NumPy array:

>>> embeddings
array([[[-0.72051036, -0.27990723, -0.773563  , ..., -0.24703972,
         -0.83581805, -0.19747877],
        [ 0.18500218, -0.12270836, -0.35163072, ...,  0.14234722,
          0.08479934, -0.11709933],
        [-0.49985906, -0.8896401 , -0.3012453 , ...,  0.15846589,
          0.05210405,  0.2538631 ]]], dtype=float32)
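The TF 1.x and TF 2.x outputs shown here agree only up to the last printed digits, presumably because of float32 rounding along the two execution paths. If you want to check that two such arrays match, an exact comparison is too strict; a tolerance-based one is the usual choice. The sketch below uses only the six values displayed in the first row of each output (the elided entries are not used), and the tolerance of 1e-6 is my own assumption.

```python
import numpy as np

# Values copied from the first row of the two printed outputs
# (only the displayed entries; the elided ones are left out).
tf2_row = np.array([-0.7205108, -0.27990735, -0.7735629,
                    -0.24703965, -0.8358178, -0.1974785], dtype=np.float32)
tf1_row = np.array([-0.72051036, -0.27990723, -0.773563,
                    -0.24703972, -0.83581805, -0.19747877], dtype=np.float32)

# Exact equality fails on the last digits; a small tolerance accepts them.
exactly_equal = np.array_equal(tf2_row, tf1_row)
close_enough = np.allclose(tf2_row, tf1_row, atol=1e-6)
max_abs_diff = float(np.max(np.abs(tf2_row - tf1_row)))

print(exactly_equal, close_enough, max_abs_diff)
```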