pandas, dataframe, columnsorting

Sorting columns in a pandas DataFrame while automating reports


I am working on an automation task, and my dataframe columns are as shown below:

Defined Discharge Bin   Apr-20  Jan-20  Mar-20  May-20  Grand Total
2-4 min                                         1       1
4-6 min                         5               1       6
6-8 min                         5       7       2       14

I want the month columns in chronological order, starting from Jan-20. The problem is that the columns are automatically ordered alphabetically (Apr-20, Jan-20, Mar-20, May-20). I could reorder them by hand, but because this is an automated report, the columns need to be put in calendar order automatically each month when new data is fed in.


Solution

  • Try this:

    import pandas as pd

    df = pd.DataFrame(data={'Defined Discharge Bin':['2-4 min', '4-6 min','6-8 min'], 'Apr-20':['', '', ''], 'Jan-20':['', 5, 5], 'Mar-20':['', '', 7], 'May-20':[1, 1, 2], 'Grand Total':[1, 6, 14]})
    # Columns that are not months and must stay at the start and end
    cols_exclude = ['Defined Discharge Bin', 'Grand Total']
    cols_date = [c for c in df.columns if c not in cols_exclude]
    # Parse each label as a date ('Jan-20' -> 2020-01-01) so the sort is chronological, not alphabetical
    cols_sorted = sorted(cols_date, key=lambda x: pd.to_datetime(x, format='%b-%y'))
    # Reassemble: first excluded column, sorted month columns, last excluded column
    df = df[cols_exclude[:1] + cols_sorted + cols_exclude[-1:]]
    print(df)
    

    Output:

      Defined Discharge Bin Jan-20 Mar-20 Apr-20  May-20  Grand Total
    0               2-4 min                            1            1
    1               4-6 min      5                     1            6
    2               6-8 min      5      7              2           14
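
Since the report is regenerated every month, it may help to wrap the same idea in a function that detects the month columns itself instead of relying on a hard-coded exclude list. The following is only a sketch under a few assumptions: the function name `sort_month_columns` and the sample `report` frame (including its `Dec-19` column and values) are made up for illustration, column labels are unique, and month abbreviations are in English (`%b` is locale-dependent).

```python
from datetime import datetime

import pandas as pd


def sort_month_columns(df, fmt="%b-%y"):
    """Return df with the columns that match `fmt` in chronological order.

    Columns that do not parse as dates keep their relative order: those
    before the first month column stay in front, the rest go to the end.
    Hypothetical helper; assumes unique column labels.
    """
    def parse(label):
        try:
            return datetime.strptime(str(label), fmt)
        except ValueError:
            return None  # not a month label, e.g. 'Grand Total'

    parsed = {c: parse(c) for c in df.columns}
    month_cols = [c for c in df.columns if parsed[c] is not None]
    if not month_cols:
        return df  # nothing to reorder

    first_month_pos = list(df.columns).index(month_cols[0])
    leading = list(df.columns[:first_month_pos])
    trailing = [c for c in df.columns[first_month_pos:] if parsed[c] is None]
    return df[leading + sorted(month_cols, key=parsed.get) + trailing]


# Illustrative data only (not from the question); note the year boundary.
report = pd.DataFrame({
    "Defined Discharge Bin": ["2-4 min"],
    "Apr-20": [0],
    "Dec-19": [3],
    "Jan-20": [5],
    "Grand Total": [8],
})
result = sort_month_columns(report)
print(list(result.columns))
# ['Defined Discharge Bin', 'Dec-19', 'Jan-20', 'Apr-20', 'Grand Total']
```

Because the sort key is the parsed date rather than the label text, columns that span a year boundary (Dec-19 before Jan-20) also come out in the right order.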