Is it possible to run multiples of the same model in parallel on the Coral dev board?

I'm running MobileNet SSD and getting around 14 ms per input image. Is it possible to run two of these models at the same time on the same dev board's TPU? For example, I have a backlog of 100 images, and the only thing that matters to me is how long it takes to get through all 100, so running 2 or 4 at a time would be amazing. I read through the docs and looked at pipelining, but the Edge TPU compiler tells me "~$ Warning: For the given model, you're creating more segments than is necessary". Everything else I've read about running in parallel is about using two physical Edge TPUs. If it's not possible that's fine, I just want to know for sure :)

Thank you

Solution

You can run multiple models, but the Edge TPU has limited memory and will swap your models in and out, so you may not see a performance improvement from splitting your task across multiple models. However, you can co-compile your models. Co-compilation gives each model the same identifier (a caching token), which lets them stay on the TPU together without being swapped in and out.

Co-compiling is done by passing all of the models to edgetpu_compiler in a single command:

edgetpu_compiler someModel.tflite someOtherModel.tflite

Or, with two copies of the same model:

edgetpu_compiler someModelA.tflite someModelA_duplicate.tflite
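If you want to script this step, here is a minimal Python sketch that builds the same command line. The function names cocompile_command and cocompile are my own (hypothetical), and I'm assuming edgetpu_compiler is on your PATH and that you want the compiler's -o (output directory) option; adjust to your setup.

```python
import shutil
import subprocess


def cocompile_command(model_paths, out_dir=None):
    """Build the argument list for co-compiling several .tflite models.

    The models are passed in the order given, since the answer notes
    that ordering can affect performance.
    """
    if len(model_paths) < 2:
        raise ValueError("co-compilation needs at least two model files")
    cmd = ["edgetpu_compiler"]
    if out_dir is not None:
        # -o selects the output directory for the compiled models.
        cmd += ["-o", out_dir]
    cmd += list(model_paths)
    return cmd


def cocompile(model_paths, out_dir=None):
    """Run the compiler; raises if it is missing or exits with an error."""
    if shutil.which("edgetpu_compiler") is None:
        raise RuntimeError("edgetpu_compiler not found on PATH")
    return subprocess.run(cocompile_command(model_paths, out_dir), check=True)


if __name__ == "__main__":
    # Only prints the command, so it is safe to run without the compiler.
    print(cocompile_command(["someModel.tflite", "someOtherModel.tflite"]))
```

Keeping the command builder separate from the function that runs it lets you inspect or log the exact command before invoking the compiler.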

There are some nuances to the process: the order in which you pass the models to edgetpu_compiler can affect performance, as can the case where your combined models are too big to fit into the TPU's RAM. I suggest starting with the Coral documentation about multiple models.
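To explore those two nuances before compiling, a rough Python sketch like the one below may help. It enumerates the possible model orderings and sums the .tflite file sizes against a budget you supply. The helper names are hypothetical, and file size is only a crude proxy for what actually lands in TPU memory, so treat the result as a hint and rely on the compiler's own output for the real answer. The budget is a required argument because I don't want to assert a specific figure here; take it from the docs.

```python
import itertools
import os


def candidate_orderings(model_paths):
    """Return every ordering of the models, one list per ordering.

    Each ordering can be passed to edgetpu_compiler in turn to compare
    the resulting performance.
    """
    return [list(p) for p in itertools.permutations(model_paths)]


def combined_size_bytes(model_paths):
    """Sum the on-disk sizes of the model files, in bytes."""
    return sum(os.path.getsize(p) for p in model_paths)


def fits_budget(model_paths, budget_bytes):
    """Crude check: do the files together fit within budget_bytes?

    budget_bytes is an assumption supplied by the caller, not a value
    read from the device.
    """
    return combined_size_bytes(model_paths) <= budget_bytes


if __name__ == "__main__":
    models = ["someModel.tflite", "someOtherModel.tflite"]
    for ordering in candidate_orderings(models):
        print("edgetpu_compiler " + " ".join(ordering))
```

With two models there are only two orderings to try; the number grows factorially, so for more than a handful of models you would pick a few orderings by hand instead.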