
Using the Same Model with Multiple TensorFlow Serving Instances


I am currently running two TensorFlow Serving Docker containers (one for production and one for testing) that use the exact same model. If my testing instance receives a lot of traffic, will it affect the production instance's performance because both are using the exact same model files?

Will I need to copy the model to a different location and have my testing instance use that copy in order to not negatively impact the performance of my production instance? Thanks!

I do want to note that the two instances are running on different Kubernetes pods, so they won't be using the same CPU and memory resources, just the same files.


Solution

  • I ran a test myself: I overwrote the model files while a TF Serving instance was running, and the new model was not loaded until I restarted the instance. So it seems that TF Serving loads the model into memory once and does not keep reading the model files while serving requests. If that holds, two instances pointing at the same model files should not affect each other's performance.
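If you want to repeat this kind of check, one option is to ask each instance which model versions it currently has loaded through the REST model status endpoint (`GET /v1/models/<model_name>`). The following is a minimal sketch, not a definitive tool: it assumes the REST API is enabled on port 8501 (the usual default in the Docker image), and the host and model names mentioned afterwards are hypothetical placeholders.

```python
import json
from urllib.request import urlopen


def status_url(host, model_name, port=8501):
    """Build the REST model-status URL for one TF Serving instance.

    Port 8501 is an assumption; adjust it to match your deployment.
    """
    return f"http://{host}:{port}/v1/models/{model_name}"


def available_versions(status):
    """Return the versions reported as AVAILABLE in a model-status response."""
    return sorted(
        int(entry["version"])
        for entry in status.get("model_version_status", [])
        if entry.get("state") == "AVAILABLE"
    )


def fetch_status(host, model_name, port=8501, timeout=5.0):
    """Query one instance; requires network access to a running server."""
    with urlopen(status_url(host, model_name, port), timeout=timeout) as resp:
        return json.load(resp)
```

For example, calling `available_versions(fetch_status("tf-serving-prod", "my_model"))` and the same for a hypothetical `tf-serving-test` host, before and after touching the model files, shows whether either instance changed what it is serving.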