Tags: python, scikit-learn, data-science, auc

How to get ROC curve for decision tree?


I am trying to compute the ROC curve and the AUROC for a decision tree. My code was something like this:

clf.fit(x,y)
y_score = clf.fit(x,y).decision_function(test[col])
pred = clf.predict_proba(test[col])
print(sklearn.metrics.roc_auc_score(actual,y_score))
fpr,tpr,thre = sklearn.metrics.roc_curve(actual,y_score)

output:

 Error()
'DecisionTreeClassifier' object has no attribute 'decision_function'

Basically, the error comes up while computing y_score. Please explain what y_score is and how to solve this problem.


Solution

  • First of all, the DecisionTreeClassifier has no attribute decision_function.

    Judging from the structure of your code, you probably saw this example.

    In that example the classifier is not a decision tree but a OneVsRestClassifier, which does support the decision_function method.

    You can see the available attributes of DecisionTreeClassifier here

    One possible way to get the ROC curve is to binarize the classes and then compute the AUC for each class:

    Example:

    from sklearn import datasets
    from sklearn.metrics import roc_curve, auc
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import label_binarize
    from sklearn.tree import DecisionTreeClassifier
    
    
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
    
    y = label_binarize(y, classes=[0, 1, 2])
    n_classes = y.shape[1]
    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)
    
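    classifier = DecisionTreeClassifier()
    
    # With binarized (multilabel) targets, predict returns one 0/1 column per class.
    # These hard labels are used as the scores here, in place of decision_function.
    y_score = classifier.fit(X_train, y_train).predict(X_test)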
    
    fpr = dict()
    tpr = dict()
    roc_auc = dict()
    for i in range(n_classes):
        fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
        roc_auc[i] = auc(fpr[i], tpr[i])
    
    # Compute micro-average ROC curve and ROC area
    fpr["micro"], tpr["micro"], _ = roc_curve(y_test.ravel(), y_score.ravel())
    roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])
    
    # ROC AUC for a specific class, here class 2
    print(roc_auc[2])
    

    Result (the exact value can vary between runs, since the tree is built without a fixed random_state)

    0.94852941176470573
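
On the "what is y_score" part of the question: roc_curve and roc_auc_score expect a continuous score per sample for the positive class, not a hard label. For a plain binary problem, a minimal sketch is to take that score from predict_proba instead of decision_function. The dataset (load_breast_cancer) and max_depth=3 below are assumptions chosen only for illustration and stand in for the asker's x, y and test[col]; this is not the code from the question.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Binary toy dataset standing in for the asker's data (an assumption).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)

# max_depth=3 is an arbitrary choice: a shallow tree has impure leaves,
# so predict_proba can return values other than exactly 0.0 and 1.0.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Decision trees expose predict_proba but not decision_function.
has_decision_function = hasattr(clf, "decision_function")

# Column 1 is the estimated probability of the positive class, clf.classes_[1].
y_score = clf.predict_proba(X_test)[:, 1]

fpr, tpr, thresholds = roc_curve(y_test, y_score)
auroc = roc_auc_score(y_test, y_score)
print(f"AUROC: {auroc:.3f}")
```

A tree can only output as many distinct probabilities as it has leaves, so the resulting ROC curve has few points. A fully grown tree usually has pure leaves, which collapses the curve to a single operating point, much like using predict.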