I have trained a model for text classification using huggingface/transformers, then I exported it using the built-in ONNX functionality.
Now, I'd like to use it for inference on a large number of texts (around 100 million sentences). My idea is to put all the texts in a Spark DataFrame, then bundle the .onnx model into a Spark UDF, and run inference that way on a Spark cluster.
Is there a better way of doing this? Am I doing things "the right way"?
I am not sure whether you are aware of SynapseML, or allowed to use it given its requirements ("SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+," per the landing page as of today), but SynapseML does support ONNX inference on Spark. This is probably the cleanest solution for you.
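A minimal sketch of the SynapseML route could look like the following. It assumes the DataFrame already contains tokenized, fixed-length input_ids and attention_mask columns (arrays of longs), and that the exported graph uses the input names input_ids and attention_mask and the output name logits. Those names are common for transformers exports, but check them against your own model. The helper names, column names, and mini-batch size are illustrative and not part of SynapseML.

```python
# ONNX graph input name -> DataFrame column name (names are assumptions).
FEED_DICT = {"input_ids": "input_ids", "attention_mask": "attention_mask"}
# Output DataFrame column name -> ONNX graph output name (names are assumptions).
FETCH_DICT = {"logits": "logits"}


def build_onnx_transformer(model_bytes: bytes, mini_batch_size: int = 64):
    """Wrap serialized ONNX model bytes in a SynapseML ONNXModel transformer."""
    # Imported lazily so this module loads even where SynapseML is not installed.
    from synapse.ml.onnx import ONNXModel

    return (
        ONNXModel()
        .setModelPayload(model_bytes)
        .setDeviceType("CPU")
        .setFeedDict(FEED_DICT)
        .setFetchDict(FETCH_DICT)
        .setMiniBatchSize(mini_batch_size)
    )


def run_inference(df, onnx_path: str):
    """Return df with an added 'logits' column; df must hold the tokenized columns."""
    with open(onnx_path, "rb") as f:
        model_bytes = f.read()
    return build_onnx_transformer(model_bytes).transform(df)
```

Tokenization is left out here. Rows in the same mini-batch need the same sequence length, so padding to a fixed length before calling run_inference is the simplest option.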
EDIT. Also, MLflow has support for exporting a python_function model as an Apache Spark UDF. With MLflow, you save your model in, say, the ONNX format, log/register the model via mlflow.onnx.log_model, and later retrieve it in the mlflow.pyfunc.spark_udf call via its path, i.e., models:/<model-name>/<model-version>.
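The MLflow flow described above can be sketched roughly as follows. This assumes an MLflow tracking server with a model registry is configured, and that the DataFrame columns passed to the UDF already match what the ONNX model expects (the tokenization step is not shown). The helper names registry_uri, register_onnx_model, and score are hypothetical, and the right result_type depends on your model's output, so treat the default here as a placeholder.

```python
def registry_uri(model_name: str, model_version: int) -> str:
    """Build a model registry URI of the form models:/<model-name>/<model-version>."""
    return f"models:/{model_name}/{model_version}"


def register_onnx_model(onnx_path: str, model_name: str) -> None:
    """Log an exported .onnx file to MLflow and register it under model_name."""
    # Imported lazily so this module loads even where MLflow/onnx are not installed.
    import mlflow
    import mlflow.onnx
    import onnx

    onnx_model = onnx.load(onnx_path)
    with mlflow.start_run():
        # "model" is the artifact location inside the run (an arbitrary choice).
        mlflow.onnx.log_model(onnx_model, "model", registered_model_name=model_name)


def score(spark, df, model_name, model_version, input_cols, result_type="double"):
    """Apply the registered model to df as a Spark UDF over input_cols."""
    import mlflow.pyfunc

    predict = mlflow.pyfunc.spark_udf(
        spark,
        model_uri=registry_uri(model_name, model_version),
        result_type=result_type,  # placeholder; adjust to the model's output
    )
    return df.withColumn("prediction", predict(*input_cols))
```

For example, registry_uri("text-clf", 3) gives "models:/text-clf/3", which is the string that mlflow.pyfunc.spark_udf receives as model_uri.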