How can I operationalize a SparkML model as a real-time webservice?

Once a SparkML model has been trained on a Spark cluster, how can I take the trained model and make it available for scoring through a RESTful API?

The problem is that loading the model requires a SparkContext. Is there a way to 'fake it', since it does not really seem necessary, or what is the minimum required to create a SparkContext?

Solution

In some cases, yes: the model can be served without a SparkContext.

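Many Spark models can be exported to PMML, a standardized format for ML models, for example with the JPMML converter at https://github.com/jpmml/jpmml-sparkml. The exported model can then be scored from a plain Java application with a PMML library such as JPMML-Evaluator, with no SparkContext involved.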

For how to do the export, see the question "Spark ml and PMML export".
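To illustrate why a PMML file removes the need for a SparkContext at scoring time, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the PMML document is hand-written rather than exported from Spark, the feature names x1 and x2 are made up, and the parser only understands a single linear RegressionTable. A real service would use a complete evaluator such as JPMML-Evaluator instead. The handle_request function stands in for the body of a hypothetical POST /score handler.

```python
import json
import xml.etree.ElementTree as ET

# Hand-written PMML for illustration only; a real file would come from the Spark export.
PMML_TEXT = """<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <Header/>
  <DataDictionary numberOfFields="3">
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="x2" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="y" usageType="target"/>
      <MiningField name="x1"/>
      <MiningField name="x2"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x1" coefficient="2.0"/>
      <NumericPredictor name="x2" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"pmml": "http://www.dmg.org/PMML-4_3"}


def load_linear_model(pmml_text):
    """Return (intercept, {feature: coefficient}) from the first RegressionTable."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, features):
    """Linear prediction: intercept + sum(coefficient * feature value)."""
    intercept, coefficients = model
    return intercept + sum(c * features[name] for name, c in coefficients.items())


def handle_request(model, body):
    """Body of a hypothetical POST /score handler: JSON features in, JSON prediction out."""
    features = json.loads(body)
    return json.dumps({"prediction": score(model, features)})


# Load once at service start-up; no Spark classes are involved.
MODEL = load_linear_model(PMML_TEXT)
```

Calling handle_request(MODEL, '{"x1": 2.0, "x2": 4.0}') returns a JSON object with the prediction, and the function can be wired into whatever HTTP framework the service already uses.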

You can also use Spark Streaming to compute the values, but it will have higher latency until Continuous Processing mode becomes available.

For very time-consuming calculations, such as recommendation algorithms, I think it is quite normal to precompute the values and store them in a database such as Cassandra.
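The precompute-and-look-up pattern could be sketched roughly as follows. This is an illustration under stated assumptions, not a Cassandra example: an in-memory sqlite3 database stands in for the key-value store so the snippet runs on its own, and the table name, user ids and item ids are all made up. In practice the batch Spark job would write the rows and the web service would only read them.

```python
import json
import sqlite3


def build_store(precomputed):
    """Write precomputed results to a table (sqlite3 here as a stand-in for Cassandra)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recommendations (user_id TEXT PRIMARY KEY, items TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO recommendations (user_id, items) VALUES (?, ?)",
        [(user_id, json.dumps(items)) for user_id, items in precomputed.items()],
    )
    conn.commit()
    return conn


def recommend(conn, user_id, fallback=()):
    """Serve a request with a single key lookup; no model is evaluated at request time."""
    row = conn.execute(
        "SELECT items FROM recommendations WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else list(fallback)


# Hypothetical output of the offline job, keyed by user id.
store = build_store({"alice": ["item-3", "item-7"], "bob": ["item-1"]})
```

The fallback argument covers users who had no precomputed row when the batch job last ran, for example by returning a list of generally popular items.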