I have a sorted data frame as below:
    x_test                  test_label  x_train                 train_label  dist
37  [[6.3, 3.3, 4.7, 1.6]]  [1]         [[6.4, 3.2, 4.5, 1.5]]  [1]          0.26
63  [[6.3, 3.3, 4.7, 1.6]]  [1]         [[6.0, 3.4, 4.5, 1.6]]  [1]          0.37
67  [[6.3, 3.3, 4.7, 1.6]]  [1]         [[6.1, 3.0, 4.6, 1.4]]  [1]          0.42
96  [[6.3, 3.3, 4.7, 1.6]]  [1]         [[6.1, 3.0, 4.9, 1.8]]  [2]          0.46
51  [[6.3, 3.3, 4.7, 1.6]]  [1]         [[5.9, 3.2, 4.8, 1.8]]  [1]          0.47
I'd like to find the mode of the 'train_label' column and get the index of a row that holds it (any one will do). Next, I'd like to look up the 'test_label' value at that index. How do I do it?
I've tried using df.mode() but didn't succeed.
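First, unwrap the single-element lists, then find the index labels of the rows whose train_label equals the mode:
df['train_label'] = df['train_label'].apply(lambda x: x[0])
df['test_label'] = df['test_label'].apply(lambda x: x[0])
# mode() returns a Series of the most frequent value(s); take the first one
mode_val = df['train_label'].mode().iloc[0]
# index labels of the rows holding that value (37, 63, 67 and 51 for the data above)
tr_mode_idx = df.index[df['train_label'] == mode_val]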
Then, to find the value of test_label based on that index:
df.loc[tr_mode_idx, 'test_label']
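Putting it together, here is a minimal self-contained sketch of the approach. It rebuilds only the label and dist columns from the question (x_test and x_train are left out for brevity), and the names mode_val, first_idx and test_value are just illustrative. Note that the index of the Series returned by mode() is only a position counter (0, 1, ...), so the row labels have to be looked up by comparing the column against the mode value.

```python
import pandas as pd

# Rebuild the relevant columns from the question's frame
# (x_test / x_train omitted; they are not needed for the lookup).
df = pd.DataFrame(
    {
        "test_label": [[1], [1], [1], [1], [1]],
        "train_label": [[1], [1], [1], [2], [1]],
        "dist": [0.26, 0.37, 0.42, 0.46, 0.47],
    },
    index=[37, 63, 67, 96, 51],
)

# Unwrap the single-element lists into scalars.
df["train_label"] = df["train_label"].apply(lambda x: x[0])
df["test_label"] = df["test_label"].apply(lambda x: x[0])

# mode() returns a Series (there can be ties); take the first value.
mode_val = df["train_label"].mode().iloc[0]

# Index labels of every row whose train_label equals the mode.
tr_mode_idx = df.index[df["train_label"] == mode_val]

# "Any one" of them: here, the first, which is also the closest
# match because the frame is sorted by dist.
first_idx = tr_mode_idx[0]
test_value = df.loc[first_idx, "test_label"]

print(mode_val, list(tr_mode_idx), first_idx, test_value)
```

If you want the test_label for every matching row instead of just one, df.loc[tr_mode_idx, 'test_label'] returns them as a Series.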