I trained several MobileNet v2 variants with the TensorFlow Object Detection API, then converted them to tfjs and ran them in the browser. It seems that these models can only be run with the executeAsync() method. I suspect that being able to use execute() would reduce the inference time, which is currently ~100 ms. However, when I try execute(), I get errors about some dynamic ops. Since I prefer speed over accuracy, is there anything I can do to speed up inference? Alternatively, are there other recommended object detection models that run in real time in the browser? Or anything else that I should try?
Why would `execute` be faster than `executeAsync`? There is minimal time wasted on the overhead of async functions; you'd gain on the order of 0.1-0.3 ms at best.
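If you want to check the async overhead yourself, here is a rough sketch that times a plain call against an awaited call of a trivial function. It does not touch tfjs at all, the function names (`work`, `workAsync`, `measureAwaitOverheadMs`) are made up for illustration, and the numbers will vary by browser and machine, so treat it as a sanity check rather than a benchmark of `executeAsync` itself.

```javascript
// Trivial sync and async versions of the same "work" (hypothetical helpers).
function work() {
  return 1;
}

async function workAsync() {
  return 1;
}

// Times `iterations` plain calls and `iterations` awaited calls, then reports
// the average extra cost per awaited call in milliseconds.
async function measureAwaitOverheadMs(iterations = 100000) {
  let sink = 0; // accumulate results so the loops are not optimized away

  let start = performance.now();
  for (let i = 0; i < iterations; i++) {
    sink += work();
  }
  const syncMs = performance.now() - start;

  start = performance.now();
  for (let i = 0; i < iterations; i++) {
    sink += await workAsync();
  }
  const asyncMs = performance.now() - start;

  return {
    syncMs,
    asyncMs,
    perCallOverheadMs: (asyncMs - syncMs) / iterations,
    sink,
  };
}

// Usage:
// measureAwaitOverheadMs().then((r) => console.log(r.perCallOverheadMs));
```

Compare the reported per-call figure with your ~100 ms inference time to see how much switching methods could save at most.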
A better question is: which tfjs backend are you using? `cpu` is the slowest; `wasm` is fast to start, but overall still runs on the CPU; and `webgl` is slow to warm up (since it has to compile GLSL shader programs and upload the weights to the GPU as textures), but has the fastest inference overall when a GPU is available.
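A minimal sketch of picking a backend and paying the warm-up cost up front, assuming `tf` is the imported `@tensorflow/tfjs` namespace and `model` is an already loaded graph model. The helper names (`pickBackend`, `warmUp`), the preference order, and the input shape and dtype are assumptions for illustration; use whatever your converted model actually expects. The `wasm` backend also needs the separate `@tensorflow/tfjs-backend-wasm` package to be loaded first.

```javascript
// Try backends in order of preference and return the name of the first one
// that initializes. `tf` is passed in so the helper has no hidden imports.
async function pickBackend(tf, preferred = ["webgl", "wasm", "cpu"]) {
  for (const name of preferred) {
    try {
      // tf.setBackend resolves to false when the backend fails to initialize.
      if (await tf.setBackend(name)) {
        await tf.ready();
        return tf.getBackend();
      }
    } catch (err) {
      // Backend not registered or not supported here; try the next one.
    }
  }
  throw new Error("no tfjs backend could be initialized");
}

// Run one throwaway inference so shader compilation and weight upload happen
// before the first real frame. The shape and dtype below are assumptions.
async function warmUp(tf, model, inputShape = [1, 320, 320, 3]) {
  const dummy = tf.zeros(inputShape, "int32");
  const outputs = await model.executeAsync(dummy);
  tf.dispose([dummy, outputs]); // free the dummy input and all output tensors
}

// Usage (in the browser, after loading the model):
// const backend = await pickBackend(tf);
// await warmUp(tf, model);
```

Time your inference only after the warm-up call; the first run on `webgl` is not representative of steady-state speed.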
Do keep in mind that while MobileNet v2 is a lightweight model, it is quite old and its performance does not match that of the latest lightweight SOTA models.
For fast and lightweight object detection, my favorite is a MobileNet v3 backbone combined with CenterNet.