
Sort Rows Based on Tuple Index


Overview:

I have a pandas DataFrame whose index labels are tuples (plus one plain string), with a corresponding 'Num' column:

Index                               Num
('Total', 'A')                      23
('Total', 'A', 'Pandas')            3
('Total', 'A', 'Row')               7
('Total', 'A', 'Tuple')             13
('Total', 'B')                      35
('Total', 'B', 'Rows')              12
('Total', 'B', 'Two')               23
('Total', 'C')                      54
('Total', 'C', 'Row')               54
Total                               112

The rows are already sorted with a lambda key that orders the tuple labels lexicographically (so each shorter tuple comes directly before the longer tuples that extend it) and places non-tuple labels such as 'Total' last:

dataTable = dataTable.reindex(sorted(dataTable.index, key=lambda x: (not isinstance(x, tuple), x)))
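To see what this sort key does on its own, it can be applied to a plain list of the labels, without pandas. This is a minimal sketch: the names `labels` and `ordered` are illustrative, and the input list is deliberately shuffled.

```python
# Labels from the example above, in arbitrary order.
labels = [
    ("Total", "B", "Two"),
    "Total",
    ("Total", "A", "Row"),
    ("Total", "C"),
    ("Total", "A"),
    ("Total", "B", "Rows"),
    ("Total", "A", "Tuple"),
    ("Total", "C", "Row"),
    ("Total", "B"),
    ("Total", "A", "Pandas"),
]

# The key is a pair: (is_not_a_tuple, label).
# False sorts before True, so every tuple precedes the plain string, and
# the comparison is decided before a tuple is ever compared with a str.
# Among tuples, Python compares element by element, and a shorter tuple
# sorts before a longer one that starts with the same elements.
ordered = sorted(labels, key=lambda x: (not isinstance(x, tuple), x))

for label in ordered:
    print(label)
```

Because the key only looks at the labels, it cannot take the 'Num' values into account; any ordering by 'Num' has to bring those values into the key or be done as a separate step.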

Problem:

Now I want to reorder only the rows whose index is a three-element tuple, sorting them within each group (same first two elements) by their corresponding 'Num' value in descending order. The expected result is:

Index                               Num
('Total', 'A')                      23
('Total', 'A', 'Tuple')             13
('Total', 'A', 'Row')               7
('Total', 'A', 'Pandas')            3
('Total', 'B')                      35
('Total', 'B', 'Two')               23
('Total', 'B', 'Rows')              12
('Total', 'C')                      54
('Total', 'C', 'Row')               54
Total                               112

Question:

What lambda function can achieve this?


Solution

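  • You can try the following, which assumes the tuples are stored in a regular column named 'Index' (see the initial df at the end):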

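    def fn(x):
        # x holds the three-element rows of one group; sort them by Num,
        # largest first
        vals = x.sort_values(by='Num', ascending=False)
        # write the sorted rows back into the same positions of df (in place)
        df.loc[x.index] = vals.values
    
    # mask of rows whose 'Index' entry has exactly three elements
    m = df['Index'].apply(len).eq(3)
    # group those rows by the second tuple element ('A', 'B', 'C');
    # fn is called only for its side effect on df
    df[m].groupby(df.loc[m, 'Index'].str[1], group_keys=False).apply(fn)
    
    print(df)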
    

    Prints:

                    Index  Num
    0          (Total, A)   23
    1   (Total, A, Tuple)   13
    2     (Total, A, Row)    7
    3  (Total, A, Pandas)    3
    4          (Total, B)   35
    5     (Total, B, Two)   23
    6    (Total, B, Rows)   12
    7          (Total, C)   54
    8     (Total, C, Row)   54
    9               Total  112
    

    Initial df:

                    Index  Num
    0          (Total, A)   23
    1  (Total, A, Pandas)    3
    2     (Total, A, Row)    7
    3   (Total, A, Tuple)   13
    4          (Total, B)   35
    5    (Total, B, Rows)   12
    6     (Total, B, Two)   23
    7          (Total, C)   54
    8     (Total, C, Row)   54
    9               Total  112
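
Since the question asks for a sort key, the same ordering can also be expressed that way while keeping the tuples as the index. This is a minimal sketch, not part of the answer above: the names `data_table`, `num_by_label`, `sort_key`, and `result` are illustrative, and it assumes the index labels are unique and that every tuple has two or three elements, as in the example.

```python
import pandas as pd

labels = [
    ("Total", "A"), ("Total", "A", "Pandas"), ("Total", "A", "Row"),
    ("Total", "A", "Tuple"), ("Total", "B"), ("Total", "B", "Rows"),
    ("Total", "B", "Two"), ("Total", "C"), ("Total", "C", "Row"), "Total",
]
nums = [23, 3, 7, 13, 35, 12, 23, 54, 54, 112]

# tupleize_cols=False keeps the tuples as plain labels in an object Index
# instead of letting pandas try to build a MultiIndex from them.
data_table = pd.DataFrame(
    {"Num": nums}, index=pd.Index(labels, tupleize_cols=False)
)

# Plain dict lookup of Num by label, so the key function needs no .loc/.at.
num_by_label = dict(zip(data_table.index, data_table["Num"]))


def sort_key(label):
    """Order: tuples first, grouped by their first two elements; within a
    group the two-element parent row comes first, then the three-element
    rows by descending Num. Non-tuple labels go last."""
    if not isinstance(label, tuple):
        return (1, label)
    if len(label) < 3:
        return (0, label[:2], 0, 0)
    return (0, label[:2], 1, -num_by_label[label])


ordered = sorted(data_table.index, key=sort_key)
result = data_table.reindex(pd.Index(ordered, tupleize_cols=False))
print(result)
```

Unlike the groupby version, this leaves the original frame unchanged and returns a reordered copy. The key can be written as a lambda with a conditional expression, but a named function is easier to read here.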