Search code examples
pythonpandasmulti-index

Sort outer multi-index


I want to sort a dataframe highest to lowest based on column B. I can't find an answer on how to sort the outer (i.e. first) index column.

I have this example data:

                  A         B
Item    Type
0       X         'rtr'     2
        Tier      'sfg'     104
1       X         'zad'     7
        Tier      'asd'     132
2       X         'frs'     4
        Tier      'plg'     140
3       X         'gfq'     9
        Tier      'bcd'     100

Each multi-index row contains a "Tier" row. I want to sort the outer index "Item" based on the "B" column value relating to each "Tier". The "A" column can be ignored for sorting purposes but needs to be included in the dataframe.

                  A         B
Item    Type
2       X         'frs'     4
        Tier      'plg'     140
1       X         'zad'     7
        Tier      'asd'     132
0       X         'rtr'     2
        Tier      'sfg'     104
3       X         'gfq'     9
        Tier      'bcd'     100

Solution

  • New Response #2

    Based on all the inputs received, here's the solution. Hope this works for you.

    import pandas as pd
    
    df = pd.read_csv("xyz.txt")
    
    df1 = df.copy()
    #capture the original index of each row. This will be used for sorting later
    df1['idx'] = df1.index
    
    #create a dataframe with only items that match 'Tier'
    #assumption is each Index has a row with 'Tier'
    tier = df1.loc[df1['Type']=='Tier']
    
    #sort Total for only the Tier rows
    tier = tier.sort_values('Total')
    
    #Create a list of the indexes in sorted order
    #this will be the order to print the rows
    tier_list = tier['Index'].tolist()
    
    # Create the dictionary that defines the order for sorting
    sorterIndex = dict(zip(tier_list, range(len(tier_list))))
    
    # Generate a rank column that will be used to sort the dataframe numerically
    df1['Tier_Rank'] = df1['Index'].map(sorterIndex)
    
    #Now sort the dataframe based on rank column and original index
    df1.sort_values(['Tier_Rank','idx'],ascending = [True, True],inplace = True)
    
    #drop the temporary column we created
    df1.drop(['Tier_Rank','idx'], 1, inplace = True)
    
    #print the dataframe
    print (df1)
    

    Based on the source data, here's the final output. Let me know if this is in line with what you were looking for.

        Index         Type                     Id  ...  Intellect  Strength  Total
    12      2  Chest Armor  "6917529202229928161"  ...         17         8     62
    13      2    Gauntlets  "6917529202229927889"  ...         16        14     60
    14      2       Helmet  "6917529202223945870"  ...         10         9     66
    15      2    Leg Armor  "6917529202802011569"  ...         15         2     61
    16      2          Set                    NaN  ...         58        33    249
    17      2         Tier                    NaN  ...          5         3     22
    24      4  Chest Armor  "6917529202229928161"  ...         17         8     62
    25      4    Gauntlets  "6917529202802009244"  ...          7         9     63
    26      4       Helmet  "6917529202223945870"  ...         10         9     66
    27      4    Leg Armor  "6917529202802011569"  ...         15         2     61
    28      4          Set                    NaN  ...         49        28    252
    29      4         Tier                    NaN  ...          4         2     22
    42      7  Chest Armor  "6917529202229928161"  ...         17         8     62
    43      7    Gauntlets  "6917529202791088503"  ...          7        14     61
    44      7       Helmet  "6917529202223945870"  ...         10         9     66
    45      7    Leg Armor  "6917529202229923870"  ...          7        19     57
    46      7          Set                    NaN  ...         41        50    246
    47      7         Tier                    NaN  ...          4         5     22
    0       0  Chest Armor  "6917529202229928161"  ...         17         8     62
    1       0    Gauntlets  "6917529202778947311"  ...         10        15     62
    2       0       Helmet  "6917529202223945870"  ...         10         9     66
    3       0    Leg Armor  "6917529202802011569"  ...         15         2     61
    4       0          Set                    NaN  ...         52        34    251
    5       0         Tier                    NaN  ...          5         3     23
    6       1  Chest Armor  "6917529202229928161"  ...         17         8     62
    7       1    Gauntlets  "6917529202778947311"  ...         10        15     62
    8       1       Helmet  "6917529202223945870"  ...         10         9     66
    9       1    Leg Armor  "6917529202229923870"  ...          7        19     57
    10      1          Set                    NaN  ...         44        51    247
    11      1         Tier                    NaN  ...          4         5     23
    18      3  Chest Armor  "6917529202229928161"  ...         17         8     62
    19      3    Gauntlets  "6917529202229927889"  ...         16        14     60
    20      3       Helmet  "6917529202223945870"  ...         10         9     66
    21      3    Leg Armor  "6917529202229923870"  ...          7        19     57
    22      3          Set                    NaN  ...         50        50    245
    23      3         Tier                    NaN  ...          5         5     23
    30      5  Chest Armor  "6917529202229928161"  ...         17         8     62
    31      5    Gauntlets  "6917529202802009244"  ...          7         9     63
    32      5       Helmet  "6917529202223945870"  ...         10         9     66
    33      5    Leg Armor  "6917529202229923870"  ...          7        19     57
    34      5          Set                    NaN  ...         41        45    248
    35      5         Tier                    NaN  ...          4         4     23
    36      6  Chest Armor  "6917529202229928161"  ...         17         8     62
    37      6    Gauntlets  "6917529202791088503"  ...          7        14     61
    38      6       Helmet  "6917529202223945870"  ...         10         9     66
    39      6    Leg Armor  "6917529202802011569"  ...         15         2     61
    40      6          Set                    NaN  ...         49        33    250
    41      6         Tier                    NaN  ...          4         3     23
    
    [48 rows x 11 columns]
    

    New Response:

    Based on the source data file shared, here's the group by and sort. Let me know how you want the values to be sorted. I have assumed that you want it sorted by Index, then Total.

    df = df.groupby(['Index','Type',])\
           .agg({'Total':'mean'})\
           .sort_values(['Index','Total'])
    

    The output of this will be as follows:

                       Total
    Index Type              
    0     Tier            23
          Leg Armor       61
          Chest Armor     62
          Gauntlets       62
          Helmet          66
          Set            251
    1     Tier            23
          Leg Armor       57
          Chest Armor     62
          Gauntlets       62
          Helmet          66
          Set            247
    2     Tier            22
          Gauntlets       60
          Leg Armor       61
          Chest Armor     62
          Helmet          66
          Set            249
    3     Tier            23
          Leg Armor       57
          Gauntlets       60
          Chest Armor     62
          Helmet          66
          Set            245
    4     Tier            22
          Leg Armor       61
          Chest Armor     62
          Gauntlets       63
          Helmet          66
          Set            252
    

    Initial Response:

    I dont have your raw data. Created some data to show you how sorting would work on groupby data. See if this is what you are looking for.

    import pandas as pd

    df = pd.DataFrame({'Animal': ['Falcon', 'Falcon','Parrot', 'Parrot'],
                       'Type':['Wild', 'Captive', 'Wild', 'Captive'],
                       'Air': ['Good','Bad', 'Bad', 'Good'],
                       'Max Speed': [380., 370., 24., 26.]})
    
    df = df.groupby(['Animal','Type','Air'])\
           .agg({'Max Speed':'mean'})\
           .sort_values('Max Speed')
    print(df)
    

    The output will be as follows:

                         Max Speed
    Animal Type    Air            
    Parrot Wild    Bad        24.0
           Captive Good       26.0
    Falcon Captive Bad       370.0
           Wild    Good      380.0
    

    Without the sort command, the output will be a bit different.

    df = df.groupby(['Animal','Type','Air'])\
           .agg({'Max Speed':'mean'})
    

    This will result in below. The Max Speed is not sorted. Instead it is using the group by sort of Animal then Type:

                         Max Speed
    Animal Type    Air            
    Falcon Captive Bad       370.0
           Wild    Good      380.0
    Parrot Captive Good       26.0
           Wild    Bad        24.0