Tags: android, python, tensorflow, kivy, python-multithreading

Keep Tensorflow session open in a Kivy app


I am trying to run an app built with Kivy alongside a TensorFlow session, without reloading the model every time I make a prediction. More precisely, I want to know how I can call the function that is defined inside the session.

Here is the code for the session:

def decode():
    # Only allocate part of the gpu memory when predicting.
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
    config = tf.ConfigProto(gpu_options=gpu_options)

    with tf.Session(config=config) as sess:
        # Create model and load parameters.
        model = create_model(sess, True)
        model.batch_size = 1

        enc_vocab_path = os.path.join(gConfig['working_directory'],"vocab%d.enc" % gConfig['enc_vocab_size'])
        dec_vocab_path = os.path.join(gConfig['working_directory'],"vocab%d.dec" % gConfig['dec_vocab_size'])

        enc_vocab, _ = data_utils.initialize_vocabulary(enc_vocab_path)
        _, rev_dec_vocab = data_utils.initialize_vocabulary(dec_vocab_path)

        # !!! This is the function that I'm trying to call. !!!
        def answersqs(sentence):
            token_ids = data_utils.sentence_to_token_ids(tf.compat.as_bytes(sentence), enc_vocab)
            bucket_id = min([b for b in xrange(len(_buckets))
                            if _buckets[b][0] > len(token_ids)])
            encoder_inputs, decoder_inputs, target_weights = model.get_batch(
                {bucket_id: [(token_ids, [])]}, bucket_id)
            _, _, output_logits = model.step(sess, encoder_inputs, decoder_inputs,
                                            target_weights, bucket_id, True)
            outputs = [int(np.argmax(logit, axis=1)) for logit in output_logits]
            if data_utils.EOS_ID in outputs:
                outputs = outputs[:outputs.index(data_utils.EOS_ID)]

            return " ".join([tf.compat.as_str(rev_dec_vocab[output]) for output in outputs])

Here is where I'm calling the function:

def resp(self, msg):
    def p():
        if len(msg) > 0:
            # If I try to do decode().answersqs(msg), it starts a new session.
            ansr = answersqs(msg)
            ansrbox = Message()
            ansrbox.ids.mlab.text = str(ansr)
            ansrbox.ids.mlab.color = (1, 1, 1)
            ansrbox.pos_hint = {'x': 0}
            ansrbox.source = './icons/ansr_box.png'
            self.root.ids.chatbox.add_widget(ansrbox)
            self.root.ids.scrlv.scroll_to(ansrbox)

    threading.Thread(target=p).start()

And here is the last part:

if __name__ == "__main__":
    if len(sys.argv) - 1:
        gConfig = brain.get_config(sys.argv[1])
    else:
        # get configuration from seq2seq.ini
        gConfig = brain.get_config()

    threading.Thread(target=decode).start()

    KatApp().run()

Also, should I change the session from GPU to CPU before I port it on Android?


Solution

  • You should have two variables graph and session that you keep around.

    When you load the model you do something like:

    graph = tf.Graph()
    session = tf.Session(config=config)
    with graph.as_default(), session.as_default():
      # The rest of your model loading code.
    

    When you need to make a prediction:

    with graph.as_default(), session.as_default():
      return session.run([your_result_tensor])
    

    What happens is that the session is loaded and kept in memory, and you just tell the system that this is the context in which you want to run.

    In your code, move def answersqs outside of the with block. It will bind automatically to graph and session from the enclosing function (but you need to make them available outside the with).
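    The same load-once pattern can be sketched without TensorFlow at all: keep the expensive resource alive, and have the prediction function close over it. Here FakeSession, load_model, and answer are hypothetical stand-ins for the session, decode, and answersqs from the question:

    ```python
    class FakeSession:
        """Stand-in for an expensive resource like a tf.Session."""
        load_count = 0

        def __init__(self):
            FakeSession.load_count += 1  # track how often the load cost is paid

        def run(self, sentence):
            return sentence.upper()      # pretend "inference"


    def load_model():
        """Do the heavyweight setup once; return a closure that reuses it."""
        session = FakeSession()          # created a single time, at startup

        def answer(sentence):
            # Bound to `session` from the enclosing scope, just like
            # answersqs should bind to graph/session after the refactor.
            return session.run(sentence)

        return answer


    answersqs = load_model()             # loaded once

    print(answersqs("hello"))            # HELLO
    print(answersqs("again"))            # AGAIN
    print(FakeSession.load_count)        # 1 -- the session was created only once
    ```

    Every call to answersqs reuses the same live session instead of creating a new one, which is exactly what keeping graph and session around buys you.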

    For the second part: normally, if you follow the guides, the exported model should be free of hardware-binding information, and when you load it TensorFlow will figure out a good placement (which might be the GPU, if one is available and sufficiently capable).
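    If you do want to force CPU-only execution before porting (for example, to mimic the Android environment on your desktop), a minimal sketch using the TF 1.x ConfigProto API from the question would be:

    ```python
    import tensorflow as tf

    # Hide all GPUs from this session so every op is placed on the CPU.
    config = tf.ConfigProto(device_count={'GPU': 0})
    sess = tf.Session(config=config)
    ```

    This only affects where ops run at inference time; the exported graph itself stays hardware-agnostic.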