I find it straightforward to deploy simple machine learning models on AzureML after training, since the trained model can be serialized into a single .pkl file, for example. However, Neural Machine Translation models, especially fine-tuned pre-trained ones, produce checkpoints of up to eleven files. These files can be uploaded to Hugging Face for testing.
I have been attempting to deploy these checkpoints on Azure as endpoints but have been unsuccessful, despite reading and following the Azure documentation. If anyone knows of a clear tutorial or has experience deploying such models on Azure, I would greatly appreciate the guidance. Additionally, I would love to know how to deploy these models as microservices on Azure. Thank you.
I have read Microsoft's documentation on deployment, but I couldn't find specific information on deploying checkpoints for Neural Machine Translation models.
Here are the files from the checkpoints after training and fine-tuning the NMT models.
Currently, foundation models of this kind can be deployed in Azure ML from the Hugging Face registry.
So, register your model on Hugging Face first, then try deploying it in Azure ML.
There is a component in the Azure ML registry that converts these models to MLflow format. Check this notebook for more information.
After converting, you can deploy it to an endpoint.
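As a rough sketch of that deployment step with the Azure ML Python SDK v2 (`azure-ai-ml`): the code below creates a managed online endpoint and attaches one deployment to it. The asset ID format built by `registry_model_id`, the registry name `HuggingFace`, the deployment name `default`, and all function and parameter names are my assumptions for illustration, so copy the real asset ID from the model's page in Azure ML Studio. It also assumes the model is in MLflow format, which is why no scoring script or environment is passed.

```python
def registry_model_id(model_name, registry="HuggingFace", label="latest"):
    """Build a registry asset ID (assumed format; verify it in Azure ML Studio)."""
    return f"azureml://registries/{registry}/models/{model_name}/labels/{label}"


def deploy_model(subscription_id, resource_group, workspace, model_id,
                 endpoint_name, instance_type):
    """Create a managed online endpoint with one deployment; return its scoring URI."""
    # Imported lazily so the helper above works without the Azure SDK installed.
    # pip install azure-ai-ml azure-identity
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(DefaultAzureCredential(), subscription_id,
                         resource_group, workspace)

    # 1. Create the endpoint (the stable URL clients will call).
    endpoint = ManagedOnlineEndpoint(name=endpoint_name, auth_mode="key")
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()

    # 2. Attach a deployment that serves the model on the chosen VM size.
    deployment = ManagedOnlineDeployment(
        name="default",  # hypothetical deployment name
        endpoint_name=endpoint_name,
        model=model_id,
        instance_type=instance_type,
        instance_count=1,
    )
    ml_client.online_deployments.begin_create_or_update(deployment).result()

    # 3. Route all traffic to the new deployment.
    endpoint.traffic = {"default": 100}
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()

    return ml_client.online_endpoints.get(endpoint_name).scoring_uri
```

You would call `deploy_model(...)` with your own subscription, resource group, workspace, and a VM size your quota allows. The returned scoring URI is the HTTP address other services can call, which covers the microservice use case.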
But before all of this, you need to register a new model on Hugging Face with these files.
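A minimal sketch of that upload step, using the `huggingface_hub` library: it checks that the checkpoint folder has the essentials and then pushes the whole folder to a model repo. The list of expected files assumes a typical single-file Hugging Face checkpoint (sharded weights or other layouts will differ), and the function names, folder path, and repo ID are placeholders of mine, not something from your setup.

```python
from pathlib import Path

# Assumption: a typical single-file Hugging Face checkpoint layout.
# Tokenizer files vary by model, so they are not checked here.
WEIGHT_FILES = ("model.safetensors", "pytorch_model.bin")


def missing_checkpoint_files(checkpoint_dir):
    """Return the names of essential files that are absent from the folder."""
    folder = Path(checkpoint_dir)
    missing = []
    if not (folder / "config.json").is_file():
        missing.append("config.json")
    if not any((folder / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing


def push_checkpoint(checkpoint_dir, repo_id):
    """Create the model repo if needed and upload every file in the folder."""
    # Imported lazily so the check above works without the library installed.
    from huggingface_hub import HfApi  # pip install huggingface_hub

    missing = missing_checkpoint_files(checkpoint_dir)
    if missing:
        raise FileNotFoundError(f"Checkpoint is missing: {missing}")

    api = HfApi()  # picks up the token saved by `huggingface-cli login`
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    api.upload_folder(folder_path=str(checkpoint_dir), repo_id=repo_id,
                      repo_type="model")


# Hypothetical usage (placeholder path and repo ID):
# push_checkpoint("./checkpoint-final", "your-username/your-nmt-model")
```

Uploading the folder as a whole keeps all the checkpoint files together, so the tokenizer and config travel with the weights.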