continuous-integration, artificial-intelligence, kubeflow, mleap, seldon

ML model deployment CI/CD


I am training models using MLflow on Databricks and outputting the final models to S3. Then I use Seldon-Core to package and deploy the models to AWS EKS.

I am looking for a tool that bridges the gap: one that takes the model from S3, packages it into a Docker container, and uses the Seldon-Core K8s template to push it to AWS EKS.
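To make the gap concrete, here is a minimal Python sketch of the steps such a tool would have to automate. It only builds the commands and the manifest; it does not run anything. The bucket path, image name, and file names are hypothetical, and it assumes a Dockerfile in the working directory that copies the downloaded model into the image. The manifest layout follows the SeldonDeployment custom resource as I understand it, and the `apiVersion` varies between Seldon-Core releases, so check both against the version installed on your cluster.

```python
import json
import shlex


def seldon_manifest(name, image, api_version="machinelearning.seldon.io/v1alpha2"):
    """Build a minimal SeldonDeployment manifest as a dict.

    Assumption: the field layout mirrors the SeldonDeployment CRD; verify it
    (and api_version) against the Seldon-Core release on the cluster.
    """
    return {
        "apiVersion": api_version,
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "name": name,
            "predictors": [
                {
                    "name": "default",
                    "replicas": 1,
                    # The container that serves the packaged model.
                    "componentSpecs": [
                        {"spec": {"containers": [{"name": "model", "image": image}]}}
                    ],
                    # A single-node inference graph pointing at that container.
                    "graph": {
                        "name": "model",
                        "type": "MODEL",
                        "endpoint": {"type": "REST"},
                        "children": [],
                    },
                }
            ],
        },
    }


def build_pipeline_commands(model_uri, image, manifest_path, workdir="model"):
    """Return the CLI command for each step as an argument list; nothing is executed."""
    return [
        ["aws", "s3", "cp", model_uri, workdir, "--recursive"],  # fetch model from S3
        ["docker", "build", "-t", image, "."],                   # package into an image
        ["docker", "push", image],                               # publish the image
        ["kubectl", "apply", "-f", manifest_path],               # deploy to the cluster
    ]


if __name__ == "__main__":
    image = "registry.example.com/my-model:1"  # hypothetical image name
    print(json.dumps(seldon_manifest("my-model", image), indent=2))
    commands = build_pipeline_commands(
        "s3://my-bucket/models/latest",  # hypothetical S3 location
        image,
        "deployment.json",               # hypothetical manifest file name
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Whichever CI system is chosen, each of the four commands would become one step of a job, with the manifest written to a file before the `kubectl apply` step.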

I believe the tool that seems to fit the job is Kubeflow Pipelines. Other contenders are Jenkins, GitLab, and Travis CI.

Is Kubeflow really the right tool for the job, and what are the pros and cons of Kubeflow versus the alternatives? Has anyone already done the research, or maybe even built the pipeline?


Solution

  • GitLab actually does exactly what Kubeflow Pipelines does out of the box, and its YAML is similar to that of CircleCI or Travis CI. I ended up using it as an alternative to Kubeflow Pipelines.

    Regarding Kubeflow: after experimenting with versions 0.5 and 0.6, our feeling was that it is still quite unstable. Installation never went smoothly, neither on Minikube (local K8s) nor on AWS EKS. For Minikube, the install scripts from the documentation are broken, and you will see many people having issues and editing the install scripts by hand (which is what I had to do to get it to install properly). On EKS we were not able to install 0.5 and had to install a much older version. Kubeflow wants to manage worker nodes in a particular manner and our security policies do not allow that; only in an older version can you override that option.

    Kubeflow is also switching to Kustomize, which is not stable yet, so if you use Kubeflow now you will be using Ksonnet, which is no longer supported, and you will learn a tool that you will throw out the window sooner or later.

    All in all, you should wait for version 1.0, but GitLab does an awesome job as an alternative to Kubeflow Pipelines.

    Hope this helps others who have the same thoughts.