Search code examples
pythonsortingpivot-tablecategorical-data

Sorting pivoted columns in pd.pivot_table


I have a dataframe of the kind: original dataframe

I intend to pivot it so that the 3 types of medals become the columns and their respective counts and mean scores become the value. Like this: pivot_table

Although, I want the columns of the pivot table in the order : 'Gold', 'Silver', 'Bronze' instead of the alphabetic order that it is currently getting displayed in.

I have tried this:

df['medals']=pd.Categorical(values = df['medals'], ordered = True, categories =['Gold','Silver','Bronze'])

df = df.sort_values(by =[ 'name','medals'])

This resulted in my original dataframe's rows being in the expected order : original dataframe after custom sort

However, my medal names go back to being in alphabetic order when pivoted into column names. Sorting the columns of pivot_table (.sort_values(by = 'medals', axis = 1) doesn't help either. Any clues on how to carry this customised order all the way to the pivot_table?


Solution

  • You didn't provide sample data, so I just made this up.

    import pandas as pd
    
    # Sample DataFrame
    data = {
        'Year': [2010, 2010, 2011, 2011, 2012, 2012],
        'Month': ['Jan', 'Feb', 'Jan', 'Feb', 'Jan', 'Feb'],
        'Value': [10, 20, 15, 25, 30, 35]
    }
    
    df = pd.DataFrame(data)
    
    # Pivot the DataFrame
    pivot_df = df.pivot(index='Year', columns='Month', values='Value')
    
    # Reindex columns to ensure all 12 months are present
    pivot_df = pivot_df.reindex(columns=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
    
    # Display the pivoted DataFrame before sorting
    print("DataFrame before sorting:")
    print(pivot_df)
    
    # Sort columns alphabetically
    pivot_df_sorted = pivot_df.sort_index(axis=1)
    
    # Display the pivoted DataFrame after sorting
    print("\nDataFrame after sorting columns:")
    print(pivot_df_sorted)
    

    Result:

    DataFrame before sorting:
    Month  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
    Year                                                             
    2010    10   20  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
    2011    15   25  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
    2012    30   35  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
    
    DataFrame after sorting columns:
    Month  Apr  Aug  Dec  Feb  Jan  Jul  Jun  Mar  May  Nov  Oct  Sep
    Year                                                             
    2010   NaN  NaN  NaN   20   10  NaN  NaN  NaN  NaN  NaN  NaN  NaN
    2011   NaN  NaN  NaN   25   15  NaN  NaN  NaN  NaN  NaN  NaN  NaN
    2012   NaN  NaN  NaN   35   30  NaN  NaN  NaN  NaN  NaN  NaN  NaN