Tags: java, spring-boot, apache-spark, apache-spark-mllib

How can I deploy a Spark ALSModel in a Spring Boot microservice?


I want to create a Spring Boot microservice that serves recommendations from a previously trained ALSModel (collaborative filtering with Apache Spark MLlib). The model is trained in a completely separate environment that is not suitable for production use. We have ways to transfer files or data stored in HDFS into our service layer, either as a file or by loading the data into a SQL database. I know I could save the user and item feature DataFrames, transfer them, and compute the predictions myself, but I would prefer a simpler solution that handles regular model updates well. The process I have in mind is the following:

  • Train the model inside the Spark cluster
  • Save the model to a file (PMML format?)
  • Transfer the file into the service layer
  • The Spring Boot microservice loads the file with the help of some framework (one that doesn't pull in Spark dependencies)
  • Enjoy Spring Boot doing its magic and making everything easy ;-)
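For reference, the manual fallback mentioned above (transferring the user and item factors and scoring them without Spark) can be sketched in plain Java. This is only a minimal sketch under assumptions: the class name `AlsScorer` and its methods are hypothetical, and the two maps are assumed to have been filled from an export of the `id` and `features` columns of `ALSModel.userFactors()` and `ALSModel.itemFactors()`. The score of a user/item pair is taken to be the dot product of the two factor vectors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: scores items from exported ALS factors, no Spark classes needed. */
final class AlsScorer {

    // Assumed layout: entity id -> latent factor vector (the "features" column).
    private final Map<Integer, float[]> userFactors;
    private final Map<Integer, float[]> itemFactors;

    AlsScorer(Map<Integer, float[]> userFactors, Map<Integer, float[]> itemFactors) {
        this.userFactors = userFactors;
        this.itemFactors = itemFactors;
    }

    /** Dot product of two factor vectors of the same rank. */
    static float dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("factor vectors differ in length");
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Returns the ids of the n highest-scoring items for the user, best first. */
    List<Integer> recommend(int userId, int n) {
        float[] user = userFactors.get(userId);
        if (user == null) {
            return List.of(); // unknown user: no factors, so nothing to score
        }
        List<Map.Entry<Integer, Float>> scored = new ArrayList<>();
        for (Map.Entry<Integer, float[]> item : itemFactors.entrySet()) {
            scored.add(Map.entry(item.getKey(), dot(user, item.getValue())));
        }
        // Sort by score, highest first.
        scored.sort(Map.Entry.<Integer, Float>comparingByValue().reversed());
        List<Integer> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, scored.size()); i++) {
            top.add(scored.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Toy factors for illustration only; real ones would come from the exported model.
        Map<Integer, float[]> users = Map.of(1, new float[] {1f, 0f});
        Map<Integer, float[]> items = Map.of(
                10, new float[] {0.5f, 9f},
                20, new float[] {2f, 0f},
                30, new float[] {1f, 1f});
        System.out.println(new AlsScorer(users, items).recommend(1, 2));
    }
}
```

In a Spring Boot service such a class could be exposed as a bean and rebuilt whenever a new factor export arrives, which is exactly the update handling that the process above hopes a framework would take care of.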

Solution

  • If you don't want to include the Spark libraries in your Spring Boot application, you can try MLeap.

    Deploying machine learning data pipelines and algorithms should not be a time-consuming or difficult task. MLeap allows data scientists and engineers to deploy machine learning pipelines from Spark and Scikit-learn to a portable format and execution engine.

    So you can use MLeap to load the Spark model and serve it from your Spring Boot application.

    For more use cases, see the sagemaker-sparkml-serving-container project. Amazon SageMaker has also developed a fully Java-based serving setup powered by mleap-runtime.