azure azure-machine-learning-service automl azure-auto-ml

K-fold cross-validation in Azure ML

I am currently training a model using an Azure ML pipeline that I built with the SDK. I am trying to add cross-validation to my ML step. I have noticed that you can set this in the parameters when you configure AutoML. My dataset consists of 30% label 0 and 70% label 1.

My question is: does Azure AutoML stratify the data when performing cross-validation? If not, I would have to do the split/stratification myself before passing the data to AutoML.

Solution

AutoML can stratify the data when performing cross-validation. To set up cross-validation in the Studio, follow the steps below.

Create the Azure Machine Learning workspace resource.

After filling in all the details, click Create.

Launch the Studio, go to Automated ML, and click New Automated ML job.

Upload the dataset there and fill in the basic details required.

The dataset is now uploaded with some basic categories.

After uploading the dataset, use it as the training data for the prediction model.

For the prediction task, choose k-fold cross-validation as the validation type and set the number of cross-validations to 5. No manual split is needed; AutoML evaluates the model according to these validation settings.
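If you would rather control the folds yourself, as the question suggests, here is a minimal sketch of what a stratified 5-fold assignment looks like on data with the same 30% / 70% label split. It uses only the Python standard library. The function name `stratified_fold_ids` and the toy `labels` list are made up for illustration and are not part of any Azure SDK.

```python
import random
from collections import defaultdict


def stratified_fold_ids(labels, n_folds=5, seed=0):
    """Give each row a fold id in [0, n_folds) so that every fold keeps
    roughly the same class ratio as the full dataset."""
    rng = random.Random(seed)

    # Group row indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    fold_ids = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the rows of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            fold_ids[idx] = position % n_folds
    return fold_ids


# Toy labels with the 30% / 70% imbalance described in the question.
labels = [0] * 30 + [1] * 70
fold_ids = stratified_fold_ids(labels, n_folds=5)

for fold in range(5):
    fold_labels = [lab for lab, f in zip(labels, fold_ids) if f == fold]
    share_of_zeros = fold_labels.count(0) / len(fold_labels)
    print(f"fold {fold}: {len(fold_labels)} rows, {share_of_zeros:.0%} label 0")
```

With 30 rows of label 0 and 70 rows of label 1, each of the 5 folds ends up with 6 and 14 rows respectively, so every fold keeps the 30/70 ratio. On the SDK side, as far as I know the v1 `AutoMLConfig` takes `n_cross_validations` for the number of folds (the equivalent of the Studio setting above) and `cv_split_column_names` if you want to pass your own fold columns, but please check the current documentation for the exact column format before relying on it.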