Tags: machine-learning, azure-machine-learning-service, azure-auto-ml

How to adjust feature importance in Azure AutoML


I am hoping to build a low-code model using Azure AutoML, which really just means going to the AutoML tab, running a classification experiment with my dataset, and, after it's done, deploying the best selected model.

The model kind of works (meaning I publish the endpoint, do some manual validation, and it seems accurate). However, I am not confident enough, because when I look at the explanation, I see something like this:

[Screenshot: model explanation chart showing aggregate feature importance for the top features]

The top 4 features are not really the ones I consider important. The most "important" one is really not the one I would prefer the model to use; I am hoping it will rely on the Title feature more.

Is there a way to adjust the importance of individual features, like ranking all features before the experiment starts?

I would love to do more reading, but I only found this:

Increase feature importance

The only answer there seems to be about how to measure whether a feature is important.

Hence, does this mean that if I want to customize the experiment, such as selecting which features to "focus" on, I should learn how to use the "designer" part of Azure ML? Or is it something I can't do even with the designer? I guess my confusion is that, with ML being such a big topic, I am looking for a direction of learning for the situation I have, so I can improve my current model.


Solution

  • Here is a link to the documentation for featurization customization.

    Using the SDK, you can specify featurization as 'auto' / 'off' / a FeaturizationConfig object in your AutoMLConfig object. Learn more about enabling featurization. (A minimal SDK sketch is shown below, after this list.)

    Automated ML tries out different ML models that have different settings which control for overfitting. Automated ML will pick the configuration that scores best (e.g. on accuracy) on hold-out data. The kinds of overfitting controls these models use include the following (a small standalone illustration is also shown below):

    • Explicitly penalizing overly-complex models in the loss function that the ML model is optimizing
    • Limiting model complexity before training, for example by limiting the size of trees in an ensemble tree learning model (e.g. gradient boosting trees or random forest)

    https://learn.microsoft.com/en-us/azure/machine-learning/concept-manage-ml-pitfalls
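
A minimal sketch of the featurization option above, using the Azure ML Python SDK v1. The dataset name and the Title, RowId, and Label column names are hypothetical placeholders for your own data:

```python
from azureml.core import Workspace, Dataset
from azureml.train.automl import AutoMLConfig
from azureml.automl.core.featurization import FeaturizationConfig

ws = Workspace.from_config()                          # assumes a local config.json
train_data = Dataset.get_by_name(ws, "my-dataset")    # hypothetical registered dataset

# Customize featurization instead of leaving it on 'auto'.
featurization_config = FeaturizationConfig()

# Force the Title column to be treated as free text so it gets text featurizers.
featurization_config.add_column_purpose("Title", "Text")

# Drop columns the models should not rely on at all (hypothetical column name).
featurization_config.add_drop_columns(["RowId"])

automl_config = AutoMLConfig(
    task="classification",
    primary_metric="accuracy",
    training_data=train_data,
    label_column_name="Label",             # hypothetical label column
    featurization=featurization_config,    # or 'auto' / 'off'
    experiment_timeout_hours=1,
)
```

As far as I know, this steers how features are built and which columns are used; it does not set an explicit importance weight per feature.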
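
The overfitting controls listed above are applied inside the models AutoML tries; you do not set them yourself. Purely as a standalone illustration (using scikit-learn, not the AutoML SDK), this is the kind of complexity-limiting configuration that gets swept and compared on hold-out data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data stands in for the real training set.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.2, random_state=0)

# Limit model complexity up front: shallow trees, a capped ensemble size,
# and shrinkage (learning_rate) acting as implicit regularization.
model = GradientBoostingClassifier(
    max_depth=3,
    n_estimators=100,
    learning_rate=0.1,
    random_state=0,
)
model.fit(X_train, y_train)

# AutoML compares configurations like this one by their hold-out score.
print("hold-out accuracy:", model.score(X_hold, y_hold))
```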