
How to get the index of the mode value of a specific column in a pandas data frame


I have a sorted data frame as below:

            x_test         test_label     x_train             train_label  \
37  [[6.3, 3.3, 4.7, 1.6]]        [1]  [[6.4, 3.2, 4.5, 1.5]]         [1]   
63  [[6.3, 3.3, 4.7, 1.6]]        [1]  [[6.0, 3.4, 4.5, 1.6]]         [1]   
67  [[6.3, 3.3, 4.7, 1.6]]        [1]  [[6.1, 3.0, 4.6, 1.4]]         [1]   
96  [[6.3, 3.3, 4.7, 1.6]]        [1]  [[6.1, 3.0, 4.9, 1.8]]         [2]   
51  [[6.3, 3.3, 4.7, 1.6]]        [1]  [[5.9, 3.2, 4.8, 1.8]]         [1]   

    dist  
37  0.26  
63  0.37  
67  0.42  
96  0.46  
51  0.47  

I'd like to find the mode of the 'train_label' column and get the index of a row holding that value (any one of them will do). Then I'd like to look up the 'test_label' value at that index. How do I do it?

I've tried using df.mode() but didn't succeed.


Solution

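  • First, unwrap the single-element lists, then find the index labels of the rows whose train_label equals the mode:

     df['train_label'] = df['train_label'].apply(lambda x: x[0])
     df['test_label'] = df['test_label'].apply(lambda x: x[0])

     # mode() returns a Series of the modal values; take the first one
     mode_value = df['train_label'].mode().iloc[0]
     tr_mode_idx = df.index[df['train_label'] == mode_value]

    Then look up test_label at those index labels (use tr_mode_idx[0] if any single row will do):

     df.loc[tr_mode_idx, 'test_label']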
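
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The frame below is a hypothetical stand-in that only mirrors the label and dist columns from the question (the feature columns are left out), and names such as test_at_mode and first_idx are made up for illustration.

```python
import pandas as pd

# Hypothetical data shaped like the question's frame: labels stored as
# single-element lists, with a non-default index.
df = pd.DataFrame(
    {
        "test_label": [[1], [1], [1], [1], [1]],
        "train_label": [[1], [1], [1], [2], [1]],
        "dist": [0.26, 0.37, 0.42, 0.46, 0.47],
    },
    index=[37, 63, 67, 96, 51],
)

# Unwrap the lists so the labels are plain scalars.
for col in ("train_label", "test_label"):
    df[col] = df[col].apply(lambda x: x[0])

# mode() returns a Series of modal values (sorted); take the first on ties.
mode_value = df["train_label"].mode().iloc[0]

# Index labels of every row whose train_label equals the mode.
tr_mode_idx = df.index[df["train_label"] == mode_value]

# test_label at all of those rows, and at just the first matching row.
test_at_mode = df.loc[tr_mode_idx, "test_label"]
first_idx = tr_mode_idx[0]
first_test_label = df.loc[first_idx, "test_label"]
```

Comparing the column against the mode value is what gives back the frame's own row labels (37, 63, ...). The index of the Series returned by mode() is just 0, 1, ... and says nothing about where the value occurs in the frame.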