Tags: python, pandas, dataframe, curve, roc

How to obtain an ROC Curve?


I am new to Python. I need to plot an ROC curve from two columns of my pandas DataFrame. Any solution or recommendation? I need to use this formula:

    x = (1-dfpercentiles['acum_0%'])
    y = (1-dfpercentiles['acum_1%'])

I tried using the sklearn libraries and matplotlib, but I didn't find a solution. This is my DataFrame:

    In [109]: dfpercentiles['acum_0%']
    Out[110]: 
    0     10.89
    1     22.93
    2     33.40
    3     44.83
    4     55.97
    5     67.31
    6     78.15
    7     87.52
    8     95.61
    9    100.00
    Name: acum_0%, dtype: float64

and

    In [111]: dfpercentiles['acum_1%']
    Out[112]: 
    0      2.06
    1      5.36
    2      8.30
    3     13.49
    4     18.98
    5     23.89
    6     29.72
    7     42.87
    8     62.31
    9    100.00
    Name: acum_1%, dtype: float64

Solution

  • This seems to be a matplotlib question.

    Before anything else, note that your percentiles are in the range 0-100, but your adjustment is 1 - percentile_value, so you need to rescale the values to 0-1 first by dividing them by 100.

    I just used pyplot.plot to generate the ROC curve:

    import matplotlib.pyplot as plt
    
    plt.plot([1-(x/100) for x in [10.89, 22.93, 33.40, 44.83, 55.97, 67.31, 78.15, 87.52, 95.61, 100.00]],
             [1-(x/100) for x in [2.06, 5.36, 8.30, 13.49, 18.98, 23.89, 29.72, 42.87, 62.31, 100.0]])
    

    Using your DataFrame, it would be:

    plt.plot(1 - dfpercentiles['acum_0%'] / 100, 1 - dfpercentiles['acum_1%'] / 100)
    

    [Figure: the resulting ROC curve]
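
The rescaling and plotting steps above can also be put together as a self-contained sketch that rebuilds the DataFrame from the values in the question, rescales both columns to 0-1, and estimates the area under the curve with the trapezoid rule. Several things here are assumptions rather than part of the question: the extra starting point (1, 1), which corresponds to a cumulative percentage of 0 in both columns; the helper names roc_points, trapezoid_auc and plot_roc; and the idea that an area estimate is wanted at all.

```python
import numpy as np
import pandas as pd

# Values copied from the question.
dfpercentiles = pd.DataFrame({
    "acum_0%": [10.89, 22.93, 33.40, 44.83, 55.97, 67.31, 78.15, 87.52, 95.61, 100.00],
    "acum_1%": [2.06, 5.36, 8.30, 13.49, 18.98, 23.89, 29.72, 42.87, 62.31, 100.00],
})


def roc_points(df):
    """Return (x, y) as arrays on a 0-1 scale, sorted by ascending x."""
    x = 1 - df["acum_0%"].to_numpy() / 100
    y = 1 - df["acum_1%"].to_numpy() / 100
    # Assumption: add (1, 1), the point where both cumulative percentages are 0,
    # so that the curve spans the whole unit square.
    x = np.concatenate(([1.0], x))
    y = np.concatenate(([1.0], y))
    order = np.argsort(x)
    return x[order], y[order]


def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through (x, y); x must be ascending."""
    return float(np.sum(np.diff(x) * (y[:-1] + y[1:]) / 2))


def plot_roc(x, y):
    """Draw the curve and the diagonal reference line; return the figure."""
    # Imported here so the numeric part above runs without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(x, y, marker="o", label="ROC")
    ax.plot([0, 1], [0, 1], linestyle="--", label="diagonal")
    ax.set_xlabel("1 - acum_0% / 100")
    ax.set_ylabel("1 - acum_1% / 100")
    ax.legend()
    return fig


x, y = roc_points(dfpercentiles)
auc = trapezoid_auc(x, y)
print(f"Trapezoid estimate of the area under the curve: {auc:.3f}")
```

Calling plot_roc(x, y) returns a matplotlib figure that can be shown or saved with savefig. For these numbers the trapezoid estimate comes out at roughly 0.79.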