Search code examples
pythonpandaspython-3.12

Why does x[::-1] not reverse a grouped pandas dataframe column in Python 3.12.3?


I have a large pandas DataFrame (160k records, 60 columns of mostly text) but for this example, I have the following DataFrame:

df1 = pd.DataFrame([{"GROUP": "A", "COL2": "1", "COL3": "P"},
                {"GROUP": "A", "COL2": "2", "COL3": "Q"},
                {"GROUP": "A", "COL2": "3", "COL3": "R"},
                {"GROUP": "B", "COL2": "4", "COL3": "S"},
                {"GROUP": "B", "COL2": "5", "COL3": "T"},
                {"GROUP": "B", "COL2": "6", "COL3": "U"},
                {"GROUP": "B", "COL2": "7", "COL3": "V"}])

I am trying to create another column that reverses the order of COL2 but only within groups of A and B...so I would expect from top to bottom, the values to be 3,2,1,7,6,5,4. This is achieved with this line of code:

df1['REVERSED_COL'] = df1.groupby("GROUP")["COL2"].transform(lambda x: x[::-1])

Instead, I get this: enter image description here

This line of code worked when I was using Python 3.11.7. However, I recently upgraded to 3.12.3 (and all other modules including pandas to 2.2.1) and removed 3.11.7 interpreter from my machine so I can't go back and test it again. I also have another machine using Python 3.7.4 where the same line of code still works as expected.

I tried using the x[::-1] on a list, and that reversed them as expected: This code:

x = [1,2,3,4,5,6,7,8,9]
print(x)
y = x[::-1]
print(y)

Results in:

[1, 2, 3, 4, 5, 6, 7, 8, 9]
[9, 8, 7, 6, 5, 4, 3, 2, 1]

My question is, am I doing something wrong? Is there a different way I can do this? I checked the pandas and python documentation on Python changes and pandas 2.2.1 changes (respectively) but I couldn't find anything relevant enough.


Solution

  • Because the output of transform is aligned back to the original index when it is a Series, which in your case reverts it back to its original order.

    To avoid this you must convert to array/list (with values/array/to_numpy):

    df1['REVERSED_COL'] = (df1.groupby("GROUP")["COL2"]
                           .transform(lambda x: x[::-1].array)
                          )
    

    Output:

      GROUP COL2 COL3 REVERSED_COL
    0     A    1    P            3
    1     A    2    Q            2
    2     A    3    R            1
    3     B    4    S            7
    4     B    5    T            6
    5     B    6    U            5
    6     B    7    V            4