I am working on an automation task, and my dataframe columns are as shown below:
Defined Discharge Bin  Apr-20  Jan-20  Mar-20  May-20  Grand Total
2-4 min                                             1            1
4-6 min                             5               1            6
6-8 min                             5       7       2           14
I want to sort the columns chronologically, starting from Jan-20. The problem is that the columns are automatically sorted in alphabetical order. I could reorder them manually, but since this is an automation task, the columns need to be sorted by month of the year automatically each time new monthly data is fed in.
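Try this. It parses the month labels as dates, sorts on the parsed values, and keeps the non-month columns at either end: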
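import pandas as pd
# Sample frame reproducing the layout from the question
df = pd.DataFrame(data={'Defined Discharge Bin':['2-4 min', '4-6 min','6-8 min'], 'Apr-20':['', '', ''], 'Jan-20':['', 5, 5], 'Mar-20':['', '', 7], 'May-20':[1, 1, 2], 'Grand Total':[1, 6, 14]})
# Columns that are not month labels and must keep their positions (first and last)
cols_exclude = ['Defined Discharge Bin', 'Grand Total']
# Everything else is a month label such as 'Jan-20'
cols_date = [c for c in df.columns.tolist() if c not in cols_exclude]
# Sort on the parsed date, not on the label text
cols_sorted = sorted(cols_date, key=lambda x: pd.to_datetime(x, format='%b-%y'))
# Reassemble: label column, months in calendar order, grand total
df = df[cols_exclude[0:1] + cols_sorted + cols_exclude[-1:]]
print(df)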
Output:
   Defined Discharge Bin  Jan-20  Mar-20  Apr-20  May-20  Grand Total
0                2-4 min                               1            1
1                4-6 min       5                       1            6
2                6-8 min       5       7               2           14
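
Since this runs every month, it may help to wrap the same idea in a function so the non-month columns do not have to be listed by hand. The following is only a sketch under some assumptions: `sort_month_columns` and `_parse_month` are hypothetical helper names (not part of pandas), column labels are assumed to be unique strings, and month labels are assumed to follow the `%b-%y` pattern with English month abbreviations. Any column that does not parse as a month keeps its side of the frame: before the first month column or after the last one.

```python
from datetime import datetime

import pandas as pd


def _parse_month(label, fmt):
    """Return a datetime if the label matches fmt, otherwise None."""
    try:
        return datetime.strptime(str(label), fmt)
    except ValueError:
        return None


def sort_month_columns(df, fmt="%b-%y"):
    """Hypothetical helper: reorder month-like columns chronologically."""
    parsed = {c: _parse_month(c, fmt) for c in df.columns}
    month_cols = [c for c in df.columns if parsed[c] is not None]
    if not month_cols:
        return df  # nothing to reorder

    cols = list(df.columns)
    first = cols.index(month_cols[0])
    # Non-month columns before the first month column stay in front,
    # every other non-month column goes to the end.
    leading = cols[:first]
    trailing = [c for c in cols[first:] if parsed[c] is None]
    ordered = sorted(month_cols, key=parsed.get)
    return df[leading + ordered + trailing]


# Example that spans a year boundary (the values are made up for illustration)
example = pd.DataFrame(
    {
        "Defined Discharge Bin": ["2-4 min"],
        "Apr-20": [0],
        "Jan-20": [5],
        "Dec-19": [3],
        "May-20": [1],
        "Grand Total": [9],
    }
)
result_columns = list(sort_month_columns(example).columns)
print(result_columns)
```

Because the sort key is a full date rather than just the month name, `Dec-19` lands before `Jan-20` when the data crosses into a new year.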