.net, machine-learning, ml.net

Distributed training of a model in ML.NET?


Is it possible to distribute the training/fitting of a model in ML.NET across multiple workers/servers? I have a huge number of records, each with 10 or more features, that I would like to retrain with, but that would take ages on a single computer.


Solution

  • There is currently no built-in way to distribute training in ML.NET. If it's something you'd like to have in the framework, you can open an issue in the dotnet/machinelearning repo.

    Have you tried training on a single PC? I ask because ML.NET works well with large datasets, so training on a single machine may turn out to be good enough for your scenario.

    Depending on the type of model you're looking to train, another option is to split your data and train a separate model on each split. Then take the weights / model parameters of the individual models and build a single model that averages them. I don't think every model exposes its weights / model parameters, but the page below lists the ones that do and shows how to extract them.

    https://learn.microsoft.com/dotnet/machine-learning/how-to-guides/retrain-model-ml-net
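
    The averaging step itself can be sketched as follows. This is a minimal illustration, not an ML.NET API: LinearParams and ParameterAveraging are hypothetical names, and the weights are plain arrays that you would fill from whatever parameters your trained models expose. It assumes linear models trained on the same features, in the same order and with the same preprocessing, and it weights each model by the number of rows in its split, which is one reasonable choice among several. Loading the averaged values back into a predictor is left out.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical container: the parameters of one linear model trained on one data split.
    public sealed record LinearParams(float[] Weights, float Bias, long RowCount);

    public static class ParameterAveraging
    {
        // Returns the average of the weights and biases, weighted by each split's row count.
        public static (float[] Weights, float Bias) Average(IReadOnlyList<LinearParams> models)
        {
            if (models.Count == 0)
                throw new ArgumentException("At least one model is required.", nameof(models));

            int featureCount = models[0].Weights.Length;
            if (models.Any(m => m.Weights.Length != featureCount))
                throw new ArgumentException("All models must have the same number of weights.", nameof(models));

            double totalRows = models.Sum(m => (double)m.RowCount);
            if (totalRows <= 0)
                throw new ArgumentException("The total row count must be positive.", nameof(models));

            // Accumulate in double to limit rounding error, then convert back to float.
            var weights = new double[featureCount];
            double bias = 0;
            foreach (var m in models)
            {
                double share = m.RowCount / totalRows; // this split's fraction of all rows
                for (int i = 0; i < featureCount; i++)
                    weights[i] += share * m.Weights[i];
                bias += share * m.Bias;
            }

            return (weights.Select(w => (float)w).ToArray(), (float)bias);
        }
    }
    ```

    For example, averaging a model with weights (1, 3) and bias 0 trained on 100 rows with a model with weights (3, 5) and bias 2 trained on 300 rows gives weights (2.5, 4.5) and bias 1.5. Check the averaged model against held-out data, since averaged parameters are not guaranteed to match a model trained on all the data at once.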