Tags: python, tensorflow, tensorflow-hub, elmo

Strongly increasing memory consumption when using ELMo from TensorFlow Hub


I am currently trying to compare the similarity of millions of documents. For a first test on a CPU, I truncated each of them to around 50 characters and try to get the ELMo embeddings for 10 of them at a time, like this:

import tensorflow_hub as hub

ELMO = "https://tfhub.dev/google/elmo/2"
texts = []
i = 0
for row in file:
    split = row.split(";", 1)
    if len(split) > 1:
        text = split[1].replace("\n", "")
        texts.append(text[:50])
    if i == 300:
        break
    if i % 10 == 0:
        elmo = hub.Module(ELMO, trainable=False)
        executable = elmo(
            texts,
            signature="default",
            as_dict=True)["elmo"]

    vectors = execute(executable)  # helper (not shown) that runs `executable` in a session
    texts = []
    i += 1

However, even with this small example, after around 300 sentences (and without even saving the vectors) the program consumes up to 12 GB of RAM. Is this a known issue (the other issues I found suggest something similar, but not quite that extreme), or did I make a mistake?


Solution

  • This is for TensorFlow 1.x without Eager mode, I suppose (or else the use of hub.Module would likely hit bigger problems).

    In that programming model, you need to first express your computation in a TensorFlow graph, and then execute that graph repeatedly for each batch of data.

    • Constructing the module with hub.Module() and applying it to map an input tensor to an output tensor are both parts of graph building and should happen only once.

    • The loop over the input data should merely call session.run() to feed input and fetch output data from the fixed graph.

    Fortunately, there is already a utility function to do all this for you:

    import numpy as np
    import tensorflow_hub as hub
    
    # For demo use only. Extend to your actual I/O needs as you see fit.
    inputs = (x for x in ["hello world", "quick brown fox"])
    
    with hub.eval_function_for_module("https://tfhub.dev/google/elmo/2") as f:
      for pystr in inputs:
        batch_in = np.array([pystr])
        batch_out = f(batch_in)
        print(pystr, "--->", batch_out[0])
    
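    Note that the function yielded by hub.eval_function_for_module accepts a whole batch of strings at once, since the underlying placeholder has shape [None] (see the explicit example below). Since you already process 10 documents at a time, you can feed each batch in a single call. A minimal sketch, where the texts list is a stand-in for your actual data:

    import numpy as np
    import tensorflow_hub as hub

    texts = ["first doc", "second doc", "third doc"]  # stand-in for your data
    batch_size = 10

    with hub.eval_function_for_module("https://tfhub.dev/google/elmo/2") as f:
      for start in range(0, len(texts), batch_size):
        batch_in = np.array(texts[start:start + batch_size])
        batch_out = f(batch_in)  # one output row per input string
        print(batch_out.shape)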

    What hub.eval_function_for_module does for you in terms of raw TensorFlow is roughly this:

    import numpy as np
    import tensorflow as tf
    import tensorflow_hub as hub

    module = hub.Module(ELMO_OR_WHATEVER)
    tensor_in = tf.placeholder(tf.string, shape=[None])  # As befits `module`.
    tensor_out = module(tensor_in)

    # This kind of session handles init ops for you.
    with tf.train.SingularMonitoredSession() as sess:
      for pystr in inputs:
        batch_in = np.array([pystr])
        batch_out = sess.run(tensor_out, feed_dict={tensor_in: batch_in})
        print(pystr, "--->", batch_out[0])
    

    If your needs are too complex for the with hub.eval_function_for_module(...) utility, you can build on this more explicit example.

    Notice how the hub.Module is neither constructed nor called in the loop.
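
    For instance, applying that structure to the loop from the question: construct the module and the output tensor once, then call sess.run() for every batch of 10. A sketch, reusing the file iteration and parsing from the question:

    import numpy as np
    import tensorflow as tf
    import tensorflow_hub as hub

    ELMO = "https://tfhub.dev/google/elmo/2"

    # Graph building: happens exactly once, before the loop.
    elmo = hub.Module(ELMO, trainable=False)
    tensor_in = tf.placeholder(tf.string, shape=[None])
    tensor_out = elmo(tensor_in, signature="default", as_dict=True)["elmo"]

    with tf.train.SingularMonitoredSession() as sess:
      texts = []
      for row in file:  # `file` as in the question
        split = row.split(";", 1)
        if len(split) > 1:
          texts.append(split[1].replace("\n", "")[:50])
        if len(texts) == 10:
          # Graph execution: once per batch, against the fixed graph.
          vectors = sess.run(tensor_out, feed_dict={tensor_in: np.array(texts)})
          texts = []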

    PS: Tired of worrying about building graphs vs running sessions? Then TF2 and eager execution are for you. Check out https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/tf2_text_classification.ipynb