Tags: tensorflow, model, tensorflow-serving, kubeflow

What's the difference between TFServing and KFServing on Kubeflow?


TFServing and KFServing both deploy models on Kubeflow and expose them as a service, so users can consume a model without needing to know the details of Kubernetes; both hide the infrastructure layers.

  • TFServing comes from TensorFlow; it can run on Kubeflow or standalone. TFServing on Kubeflow

  • KFServing comes from Kubeflow and supports multiple frameworks such as PyTorch, TensorFlow, MXNet, etc. KFServing (see the deployment sketch after this list)
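
For illustration, a minimal sketch of what a KFServing deployment looks like from Python, assuming a cluster with KFServing's v1alpha2 CRD installed and local kubectl access; the service name, namespace, and gs:// model path are placeholders, not values from the question:

```python
# A minimal sketch, assuming a cluster with KFServing's v1alpha2 CRD installed
# and a kubeconfig available locally; name, namespace, and storageUri are
# illustrative placeholders.
from kubernetes import client, config

config.load_kube_config()  # read the local kubeconfig

inference_service = {
    "apiVersion": "serving.kubeflow.org/v1alpha2",
    "kind": "InferenceService",
    "metadata": {"name": "flowers-sample", "namespace": "default"},
    "spec": {
        "default": {
            "predictor": {
                # Swapping "tensorflow" for another predictor (e.g. "triton",
                # "sklearn") changes the backing server, not the client.
                "tensorflow": {"storageUri": "gs://my-bucket/flowers/model"}
            }
        }
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="serving.kubeflow.org",
    version="v1alpha2",
    namespace="default",
    plural="inferenceservices",
    body=inference_service,
)
```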

My question is: what is the main difference between these two projects?

If I want to launch my model in production, which should I use? Which has better performance?


Solution

  • KFServing is an abstraction on top of inferencing rather than a replacement. It seeks to simplify deployment and make inferencing clients agnostic to which inference server is doing the actual work behind the scenes (be it TF Serving, Triton (formerly TRT-IS), Seldon, etc.). It does this by seeking agreement among inference server vendors on an inferencing dataplane specification, which allows extra components (such as transformations and explainers) to be more pluggable. See the client sketch below.
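
To make the dataplane point concrete, a hedged client-side sketch: KFServing's V1 protocol follows the TF Serving REST predict API shape, so the same request works regardless of which inference server answers behind it. The hostname and model name below are illustrative placeholders:

```python
# A minimal client sketch of the KFServing V1 data plane (the TF Serving REST
# predict API shape); the host and model name are illustrative placeholders.
import requests

host = "flowers-sample.default.example.com"  # assumed ingress hostname
url = f"http://{host}/v1/models/flowers-sample:predict"

# The same {"instances": [...]} payload is accepted whether TF Serving,
# Triton, or another supported server is running behind KFServing.
payload = {"instances": [[1.0, 2.0, 5.0]]}

resp = requests.post(url, json=payload, timeout=10)
resp.raise_for_status()
print(resp.json()["predictions"])
```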