python pandas numpy dataframe sklearn-pandas

Python dataframe manipulation


I am trying to convert the input dataframe below into the output dataframe:

import pandas as pd

data = {'Model1': [86,23,32,13,45,12],
        'Model2': [96,98,34,12,22,19], 
        'Model3': [56,23,44,12,32,33]
       }

Input = pd.DataFrame(data, 
                     columns=['Model1','Model2','Model3'], 
                     index=['I1', 'I2','I3','I4','I5','I6'])

Output = pd.DataFrame(data={'Best Model': ['Model2','Model2', 'Model3','Model1', 'Model1', 'Model3'],
                            'Best Model Accuracy': [96,98,44,13,45,33]}, 
                      columns=['Best Model','Best Model Accuracy'], 
                      index=['I1', 'I2','I3','I4','I5','I6'])

Logic: I have accuracy results from 3 models for 6 customers, and I want to pick the best model, along with its accuracy, for each customer. The best model is the one with the maximum accuracy for that customer.

I am able to pivot each of them, but I am stuck on the logic for finding the best accuracy for each customer.


Solution

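  • You can use idxmax and lookup (note that DataFrame.lookup was deprecated in pandas 1.2.0 and removed in pandas 2.0, so this works only on older versions):

    # Column label of the highest accuracy in each row
    idx = Input.idxmax(axis=1)
    # Look up the value at each (row label, best column label) pair
    output = pd.DataFrame({'Best Model': idx,
                           'Best Acc': Input.lookup(Input.index, idx)
                         })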
    

    Output:

       Best Model  Best Acc
    I1     Model2        96
    I2     Model2        98
    I3     Model3        44
    I4     Model1        13
    I5     Model1        45
    I6     Model3        33
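
The same result can be sketched without lookup, which may help on pandas 2.0 and later. This is a minimal sketch, not the answer's original method: it assumes every column of Input is a numeric accuracy column, and it relies on the fact that the value at the idxmax column is simply the row-wise maximum. The column name 'Best Model Accuracy' is taken from the desired output in the question.

```python
import pandas as pd

# Same input as in the question.
data = {'Model1': [86, 23, 32, 13, 45, 12],
        'Model2': [96, 98, 34, 12, 22, 19],
        'Model3': [56, 23, 44, 12, 32, 33]}
Input = pd.DataFrame(data, index=['I1', 'I2', 'I3', 'I4', 'I5', 'I6'])

# Column label of the row-wise maximum (the first label wins on ties).
best_model = Input.idxmax(axis=1)
# The row-wise maximum itself, i.e. the value found in that column.
best_acc = Input.max(axis=1)

output = pd.DataFrame({'Best Model': best_model,
                       'Best Model Accuracy': best_acc})
print(output)
```

Because idxmax and max are both computed row by row on the same frame, the two resulting Series share the index I1..I6 and line up without any extra join.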