Search code examples
pythonpandassortingreindex

Sorting Dataframe Columns By Alphabetical & Numerical Order


I would like to re-index the columns of a dataframe by sorting them by both alphabetical order and numerical order.

In my dataframe df, my columns are named:

Round 1 Position, Round 2 Position, Round 1 Score, Round 2 Score, Round 1 Price, Round 2 Price, ..., Round 10 Position, Round 10 Score, Round 10 Price

I would like them to be ordered as follows:

Round 1 Position, Round 1 Price, Round 1 Score, Round 2 Position, Round 2 Price, Round 2 Score, Round 3 Position, Round 3 Price, Round 3 Score, ..., Round 10 Position, Round 10 Price, Round 10 Score

I have tried the following:

df.reindex(sorted(df.columns, key=lambda x: float(x[1:])), axis=1) 

but had no luck as the column name was a string.

Thank you for taking the time to read through my question and help me. Much appreciated!


Solution

  • Once you sort the column names into their natural order, you can use [] indexing to re-order the columns.

    import pandas as pd
    import natsort
    
    df = pd.DataFrame({'A2': [1, 2, 3],
                   'B1': [4, 5, 6],
                   'A1': [7, 8, 9]})
    
    # sort the column names
    col_names = df.columns
    sorted_col_names = natsort.natsorted(col_names)
    
    # re-order the columns
    df = df[sorted_col_names]
    print(df)
    

    Output:

       A1  A2  B1
    0   7   1   4
    1   8   2   5
    2   9   3   6
    

    There are several options for sorting alpha-numeric values in the answers to How to sort alpha numeric set in python. I tested the natsort library, and it works with your input.

    columns = ['Round 1 Position', 'Round 2 Position', 'Round 1 Score', 'Round 2 Score', 'Round 1 Price',
               'Round 2 Price', 'Round 10 Position', 'Round 10 Score', 'Round 10 Price']
    columns_s = natsort.natsorted(columns)
    print(columns_s)
    

    Output:

    ['Round 1 Position', 'Round 1 Price', 'Round 1 Score', 'Round 2 Position', 'Round 2 Price', 'Round 2 Score', 'Round 10 Position', 'Round 10 Price', 'Round 10 Score']