python, pandas, dataframe, group-by

What happens to the column I group by in pandas? Does it still exist in the new DataFrame?


Say I have a DataFrame df1 with a column "A". I do a groupby operation

df2 = df1.groupby(["A"]).sum() 

to create a new dataframe df2.

When I display the new DataFrame df2, I can still see A, but when I run df2.columns to inspect its columns, the resulting Index no longer includes A. It seems that df2 does not actually hold A as a column. Why is this? What can I do to keep A in df2 as an actual column?


Solution

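  • By default, groupby uses the grouping keys as the index of the result, so A becomes the index of df2 rather than one of its columns. That is why A still shows up when df2 is displayed but is missing from df2.columns. You can turn the index back into a column with reset_index():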

    df2 = df2.reset_index()
    

    Alternatively, pass as_index=False so that groupby keeps A as a regular column in the first place:

    df2 = df1.groupby(["A"], as_index=False).sum()
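To see the difference side by side, here is a small sketch. The sample data and the extra column "B" are made up for illustration and are not from the question; only groupby, sum, reset_index and as_index come from the answer above.

```python
import pandas as pd

# Hypothetical sample data: "A" is the grouping column, "B" is a value column.
df1 = pd.DataFrame({"A": ["x", "y", "x", "y"], "B": [1, 2, 3, 4]})

# Default behaviour: "A" becomes the index of the result, not a column.
df2 = df1.groupby(["A"]).sum()
default_columns = list(df2.columns)
index_name = df2.index.name

# Option 1: move the index back into a regular column.
df3 = df2.reset_index()
reset_columns = list(df3.columns)

# Option 2: tell groupby not to use "A" as the index at all.
df4 = df1.groupby(["A"], as_index=False).sum()
as_index_columns = list(df4.columns)

print(default_columns, index_name)
print(reset_columns)
print(as_index_columns)
```

With the default call, "A" is reported as the index name and only "B" is listed in the columns. Both options list "A" alongside "B" in the columns.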