Tags: python, python-3.x, azure, azure-machine-learning-service

Comparing Brier scores for Azure ML classifiers


I'm trying to compare the Brier scores of two classifiers in Azure ML Studio:

import pandas as pd
import numpy as np
from sklearn.metrics import brier_score_loss

def azureml_main(dataframe1, dataframe2):
    colnames_1 = dataframe1.columns
    y_true_1 = np.array(dataframe1[colnames_1[1]])
    y_prob_1 = np.array(dataframe1[colnames_1[-1]])
    brier_score_1 = brier_score_loss(y_true_1, y_prob_1)

    colnames_2 = dataframe2.columns
    y_true_2 = np.array(dataframe2[colnames_2[1]])
    y_prob_2 = np.array(dataframe2[colnames_2[-1]])
    brier_score_2 = brier_score_loss(y_true_2, y_prob_2)

    data = {'brier_score': [brier_score_1, brier_score_2]}
    result = pd.DataFrame(data, columns=['brier_score'])

    return result

My problem is that the script outputs a value only in the first row, which holds the Brier score of the first dataset. The second row is empty. This is how I have connected the script: [image: azure]


Solution

  • It turned out that the problem was caused by a few NaN values in the second dataframe. Adding dataframe2 = dataframe2.dropna() at the top of azureml_main, before any columns are read, solved the problem.
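
A minimal sketch of that fix, assuming the same column layout as in the question (true label in the second column, predicted probability in the last). The helper name brier_for is hypothetical. Dropping rows only where those two columns are missing is a variation on the plain dropna() call, so that NaN values in unrelated columns do not discard usable rows:

```python
import numpy as np
import pandas as pd
from sklearn.metrics import brier_score_loss


def brier_for(dataframe):
    # Hypothetical helper, not part of the original script.
    # Assumes: second column = true label, last column = predicted probability.
    cols = dataframe.columns
    # Drop rows where either of the two columns we actually use is NaN.
    clean = dataframe.dropna(subset=[cols[1], cols[-1]])
    y_true = clean[cols[1]].to_numpy()
    y_prob = clean[cols[-1]].to_numpy()
    return brier_score_loss(y_true, y_prob)


def azureml_main(dataframe1, dataframe2):
    # Apply the same cleaning to both inputs, not only the second one,
    # so the two scores are computed under the same rule.
    scores = [brier_for(dataframe1), brier_for(dataframe2)]
    return pd.DataFrame({'brier_score': scores})
```

Cleaning both inputs the same way is a design choice rather than a requirement. If only one dataset has missing values, the two scores are then computed on different numbers of rows, which is worth keeping in mind when comparing them.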