python pandas typeerror

Why is there a "TypeError: incompatible index of inserted column with frame index" error in Pandas version 2.0.0 only


Complete newbie here, both at asking questions on this platform and at Python development.

I was given a script and there is a line of code that works with Pandas 1.4.4, but not Pandas 2.0.0.

The line of code is:

CLEANCPM.groupby(['UNIT_NUM'])['COST'].apply(lambda x:x.cumsum())

When I force install pandas v1.4.4, the script completes without error.

When using Pandas 2.0.0, I get:

"TypeError: incompatible index of inserted column with frame index"

I created a virtual environment and installed Pandas v1.4.4. This is the version that ships with the Anaconda distribution I was originally using to run this script in a Jupyter notebook.

When I tried using VSCode and installed Pandas, version 2.0.0 was installed. Then the error started.

I know enough to understand that a script can depend on specific package versions, and that this is the point of virtual environments.

Just wondering if I should post an issue on GitHub for this. I found similar-looking issues, but thought they might apply to pandas versions older than 1.4.4.


Solution

  • You probably shouldn't use apply+cumsum. You can directly use groupby.cumsum:

    CLEANCPM.groupby(['UNIT_NUM'])['COST'].cumsum()
    

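    Unlike apply, groupby.cumsum returns a result that keeps the frame's original index, so it can be assigned back as a column. In pandas 2.0, apply adds the group keys to the index of the result by default (group_keys=True), even for functions like cumsum that return a like-indexed object, and that extra index level is what makes the result incompatible with the frame index.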
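
As a minimal sketch of the suggestion above, assuming a small made-up frame in place of the real CLEANCPM data (the values and the CUM_COST column name are hypothetical), the direct groupby.cumsum call can be assigned straight back to the frame. If the apply form has to stay for some reason, passing group_keys=False to groupby should also keep the group keys out of the result's index:

```python
import pandas as pd

# Hypothetical stand-in for the CLEANCPM frame from the question.
CLEANCPM = pd.DataFrame(
    {
        "UNIT_NUM": ["A", "B", "A", "B", "A"],
        "COST": [10, 1, 20, 2, 30],
    }
)

# groupby.cumsum keeps the frame's original index (0..4),
# so the result aligns with the frame on assignment.
direct = CLEANCPM.groupby(["UNIT_NUM"])["COST"].cumsum()
CLEANCPM["CUM_COST"] = direct  # CUM_COST is a hypothetical column name

# Alternative if apply must be kept: group_keys=False asks pandas
# not to prepend the group keys to the index of the result.
via_apply = CLEANCPM.groupby(["UNIT_NUM"], group_keys=False)["COST"].apply(
    lambda x: x.cumsum()
)

print(CLEANCPM)
```

Both forms compute the same per-unit running totals. The direct cumsum call is shorter and avoids the Python-level lambda, which is why it is usually the better choice.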