Assume a pandas DataFrame (for the sake of simplicity, let's say with three columns). The columns are titled A, B and d.
$ import pandas as pd
$ df = pd.DataFrame([[1, 2, "a"], [1, "b", 3], ["c", 4, 6]], columns=['A', 'B', 'd'])
$ df
   A  B  d
0  1  2  a
1  1  b  3
2  c  4  6
Further assume that I wish to reorder the data frame so that the columns have exactly the following order: d, A, B. The rows of the data frame shall not be rearranged in any way. The desired output is:
$ col_target_order = ['d', 'A', 'B']
$ df_desired
   d  A  B
0  a  1  2
1  3  1  b
2  6  c  4
I know that this can be done via the sort_index function of pandas. However, the following won't work, as the input list (col_target_order) is not callable:
$ df.sort_index(axis=1, key=col_target_order)
What key specification do I have to use?
Don't sort, just index:
out = df[col_target_order]
For the sake of argument, you could use sort_index with a crafted Series as the key:
df.sort_index(axis=1, key=pd.Series(range(len(col_target_order)), index=col_target_order).get)
Or an Index indexer:
df.sort_index(axis=1, key=pd.Index(col_target_order).get_indexer)
Output:
   d  A  B
0  a  1  2
1  3  1  b
2  6  c  4
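To make the key requirement concrete, here is a minimal, self-contained sketch. It assumes that every column label appears in col_target_order. The names position and order_key are hypothetical helpers introduced only for illustration. The key callable receives the column Index and has to return something of the same length, here each label's position in the target list.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, "a"], [1, "b", 3], ["c", 4, 6]], columns=["A", "B", "d"])
col_target_order = ["d", "A", "B"]

# Plain indexing: usually the simplest way to reorder columns.
by_indexing = df[col_target_order]

# Hypothetical helper: map each column label to its target position.
position = {name: i for i, name in enumerate(col_target_order)}

def order_key(labels):
    # `labels` is the column Index; return an Index of positions, same length.
    return labels.map(position)

by_sorting = df.sort_index(axis=1, key=order_key)

print(by_indexing.columns.tolist())
print(by_sorting.columns.tolist())
```

Both calls return a new frame with the columns in the target order and leave df and its row order untouched. The sort_index route mainly shows what a valid key looks like; plain indexing is the shorter choice.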