python pandas spyder

String Into Integer while sorting


Curious if there is a way to convert a string column to integers only during the sort_values() call, or if it's easier to convert the column to integer before sorting and then convert it back to string afterwards.

My current code runs, but the result is not correct: I believe D_Index is actually stored as a string, so it was sorted as 11, 12, 2, 21, 22, 3 instead of 2, 3, 11, 12, 21, 22. See the example table and code below.

Model   D_Index
First   11
Second  12
Third   2
Fourth  21
Fifth   22
Sixth   3

df_New = df_Old.sort_values(['Model','D_Index'])

Solution

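  • You can pass a sorting key to sort_values (the key parameter requires pandas 1.1.0 or later). The key function receives the column as a Series and must return a Series of the same shape; rows are ordered by the returned values, while the column itself stays a string: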

    out = df.sort_values(by='D_Index', key=lambda x: x.astype(int))
    

    Output:

        Model D_Index
    2   Third       2
    5   Sixth       3
    0   First      11
    1  Second      12
    3  Fourth      21
    4   Fifth      22
    

    If you also want to sort by Model, note that the key function is applied to each column in by separately, so you can check the Series' name and convert only D_Index:

    df.sort_values(by=['Model','D_Index'], key=lambda x: x.astype(int) if x.name=='D_Index' else x)
    

    Output (in your example, D_Index has no effect on the order, since every Model value is unique and there are no ties to break):

        Model  D_Index
    4   Fifth       22
    0   First       11
    3  Fourth       21
    1  Second       12
    5   Sixth        3
    2   Third        2
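
To see the numeric ordering actually matter, the Model values need to repeat. Here is a minimal sketch of both options raised in the question, assuming made-up data: the Model labels "A" and "B" are hypothetical, and only the D_Index strings come from the original table. It also assumes every D_Index value is a valid integer string.

```python
import pandas as pd

# Hypothetical data: the question's D_Index strings, but with repeated
# Model values so that D_Index has ties to break within each Model.
df = pd.DataFrame({
    "Model": ["A", "A", "A", "B", "B", "B"],
    "D_Index": ["11", "12", "2", "21", "22", "3"],
})

# Option 1: convert only inside the sort. The key function is called once
# per column in `by`, so only D_Index is converted; Model is left as-is.
by_key = df.sort_values(
    by=["Model", "D_Index"],
    key=lambda col: col.astype(int) if col.name == "D_Index" else col,
)

# Option 2: convert to int, sort, then convert back to string.
round_trip = (
    df.assign(D_Index=df["D_Index"].astype(int))
      .sort_values(["Model", "D_Index"])
      .astype({"D_Index": str})
)

print(by_key)
print(round_trip)
```

Both variants should give the same row order. The key version never changes the stored column, so there is nothing to convert back. If some values might not be numeric, astype(int) raises a ValueError; pd.to_numeric(col, errors="coerce") inside the key is one possible alternative, at the cost of turning those values into NaN.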