Tags: azure, azure-cognitive-services, microsoft-custom-vision

Azure Cognitive Services: Where are the missing Custom Vision Performance Statistics?


I am querying the Azure Custom Vision V3.0 Training API (see https://westeurope.dev.cognitive.microsoft.com/docs/services/Custom_Vision_Training_3.0/operations/5c771cdcbf6a2b18a0c3b809) through its GetIterationPerformance operation so that I can generate per-tag ROC curves myself. Part of the operation's output is:

{u'averagePrecision': 0.92868346,
 u'perTagPerformance': [{u'averagePrecision': 0.4887446,
                         u'id': u'uuid1',
                         u'name': u'tag_name_1',
                         u'precision': 0.0,
                         u'precisionStdDeviation': 0.0,
                         u'recall': 0.0,
                         u'recallStdDeviation': 0.0},
                        {u'averagePrecision': 1.0,
                         u'id': u'uuid2',
                         u'name': u'tag_name_2',
                         u'precision': 0.0,
                         u'precisionStdDeviation': 0.0,
                         u'recall': 0.0,
                         u'recallStdDeviation': 0.0},
                        {u'averagePrecision': 0.9828302,
                         u'id': u'uuid3',
                         u'name': u'tag_name_3',
                         u'precision': 1.0,
                         u'precisionStdDeviation': 0.0,
                         u'recall': 0.5555556,
                         u'recallStdDeviation': 0.0}],
 u'precision': 0.9859485,
 u'precisionStdDeviation': 0.0,
 u'recall': 0.3752228,
 u'recallStdDeviation': 0.0}

The precision and recall uncertainties, precisionStdDeviation and recallStdDeviation respectively, always seem to be 0.0. Is this user error, and if not, are there any plans to activate these statistics?
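
One way to confirm the observation across a whole response, rather than by eye, is to scan every standard-deviation field. The following is a minimal sketch; the helper name `stddev_fields_all_zero` is hypothetical, and the dict is an abbreviated copy of the output shown above, not a fresh API call.

```python
# Hypothetical helper: checks the overall entry and every per-tag entry.
STDDEV_KEYS = ("precisionStdDeviation", "recallStdDeviation")


def stddev_fields_all_zero(performance):
    """Return True if every std-deviation field in the response is 0.0."""
    entries = [performance] + list(performance.get("perTagPerformance", []))
    return all(
        entry.get(key, 0.0) == 0.0 for entry in entries for key in STDDEV_KEYS
    )


# Abbreviated copy of the GetIterationPerformance output quoted above.
performance = {
    "averagePrecision": 0.92868346,
    "precision": 0.9859485,
    "precisionStdDeviation": 0.0,
    "recall": 0.3752228,
    "recallStdDeviation": 0.0,
    "perTagPerformance": [
        {"name": "tag_name_1", "precision": 0.0, "precisionStdDeviation": 0.0,
         "recall": 0.0, "recallStdDeviation": 0.0},
        {"name": "tag_name_3", "precision": 1.0, "precisionStdDeviation": 0.0,
         "recall": 0.5555556, "recallStdDeviation": 0.0},
    ],
}

print(stddev_fields_all_zero(performance))
```

Running the same check over several iterations would show whether the zeros are specific to one iteration or appear everywhere.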


Solution

  • Currently, precisionStdDeviation and recallStdDeviation are not used, so they will always be zero; this is not user error. These two metrics exist because we previously ran cross-validation on the user's dataset: each fold produced its own precision and recall, and the standard deviation measured how much those values varied across folds. Now, instead of cross-validation, we hold out a proportion of the user's data as a validation set and report IterationPerformance on that. Because there is a single split rather than multiple folds, the standard deviation is always zero. We plan to retire these two fields, and they will very likely be removed in the next major version. Sorry for the confusion.
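
The arithmetic behind this explanation can be sketched in a few lines. The per-fold values below are made up for illustration, and the use of a population standard deviation is an assumption: the answer does not say which estimator the service used.

```python
from statistics import pstdev

# Hypothetical precision values from a 5-fold cross-validation (illustrative only).
fold_precisions = [0.91, 0.95, 0.89, 0.93, 0.92]

# A single hold-out validation split yields exactly one precision value.
holdout_precisions = [0.9859485]


def spread(values):
    """Population standard deviation of a metric across evaluation runs."""
    return pstdev(values)


print(spread(fold_precisions) > 0.0)  # several folds: values differ, spread is positive
print(spread(holdout_precisions))     # one split: nothing to vary against
```

With only one measurement there is no variation to report, which is consistent with both fields being 0.0 in every response.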