I have an array like this:
['A 100', 'A 200', 'A 300', 'A 400', 'A 500', 'B 100', 'B 200', 'B 300', 'B 400']
I also have a dataframe like this:
BIN  CA  SUM
100  B   B 100
300  A   A 300
300  B   B 300
400  B   B 400
400  A   A 400
200  B   B 200
100  A   A 100
200  A   A 200
I want to use pd.Categorical to sort the dataframe by the SUM column, following the order of the values in the array.
The expected output is:
BIN  CA  SUM
100  A   A 100
200  A   A 200
300  A   A 300
400  A   A 400
100  B   B 100
200  B   B 200
300  B   B 300
400  B   B 400
You can use pd.Categorical to convert the SUM column to an ordered categorical column whose category order is taken from arr, then sort the values:
df['SUM'] = pd.Categorical(df['SUM'], categories=arr, ordered=True)
df.sort_values('SUM')
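For anyone who wants to run this end to end, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the question's table, so the dtypes (integers for BIN, strings for CA and SUM) are an assumption, and the name result is only for illustration.

```python
import pandas as pd

# The ordering array from the question.
arr = ['A 100', 'A 200', 'A 300', 'A 400', 'A 500',
       'B 100', 'B 200', 'B 300', 'B 400']

# Dataframe rebuilt from the question's table (dtypes are assumed).
df = pd.DataFrame({
    'BIN': [100, 300, 300, 400, 400, 200, 100, 200],
    'CA':  ['B', 'A', 'B', 'B', 'A', 'B', 'A', 'A'],
    'SUM': ['B 100', 'A 300', 'B 300', 'B 400',
            'A 400', 'B 200', 'A 100', 'A 200'],
})

# Make SUM an ordered categorical whose category order follows arr.
df['SUM'] = pd.Categorical(df['SUM'], categories=arr, ordered=True)

# sort_values returns a new dataframe; df itself keeps its row order.
result = df.sort_values('SUM')
print(result)
```

Two things to keep in mind: sort_values does not sort in place unless you assign the result back, and any SUM value that is missing from arr becomes NaN when converted to categorical.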
Alternatively, you can create a dictionary that maps each item in arr to its position, .map this dictionary onto the SUM column, and use np.argsort to get the indices that would sort the dataframe:
dct = {v: i for i, v in enumerate(arr)}
df.iloc[np.argsort(df['SUM'].map(dct))]
   BIN CA    SUM
6  100  A  A 100
7  200  A  A 200
1  300  A  A 300
4  400  A  A 400
0  100  B  B 100
5  200  B  B 200
2  300  B  B 300
3  400  B  B 400
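The dictionary approach can be sketched as a runnable script in the same way. As before, the dataframe is rebuilt by hand from the question's table (dtypes assumed), and the names order and result are illustrative only. The mapped ranks are converted to a plain NumPy array before np.argsort so that the result is a positional index array suitable for .iloc.

```python
import numpy as np
import pandas as pd

# The ordering array from the question.
arr = ['A 100', 'A 200', 'A 300', 'A 400', 'A 500',
       'B 100', 'B 200', 'B 300', 'B 400']

# Dataframe rebuilt from the question's table (dtypes are assumed).
df = pd.DataFrame({
    'BIN': [100, 300, 300, 400, 400, 200, 100, 200],
    'CA':  ['B', 'A', 'B', 'B', 'A', 'B', 'A', 'A'],
    'SUM': ['B 100', 'A 300', 'B 300', 'B 400',
            'A 400', 'B 200', 'A 100', 'A 200'],
})

# Map each value in arr to its position, i.e. its sort rank.
dct = {v: i for i, v in enumerate(arr)}

# Positions that would sort the rows by the rank of their SUM value.
order = np.argsort(df['SUM'].map(dct).to_numpy())

# Reorder the rows positionally; SUM keeps its original string dtype.
result = df.iloc[order]
print(result)
```

Unlike the pd.Categorical version, this leaves the SUM column as ordinary strings, which may be preferable if the column is used elsewhere.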