Say I have a dataframe df1 with a column "A". I do a groupby operation
df2 = df1.groupby(["A"]).sum()
to create a new dataframe df2.
When I display the new dataframe df2, I can still see column A, but when I run df2.columns to inspect the columns of df2, the resulting Index no longer lists A. It seems that df2 does not actually hold A as a column. Why is this? What can I do to keep A in df2 as an official "column"?
This is because groupby by default turns the group keys into the index of the result, so A becomes the index of df2 rather than one of its columns. You can undo this with reset_index():
df2 = df2.reset_index()
Or, you can have it not do this in the first place with the as_index argument:
df2 = df1.groupby(["A"], as_index=False).sum()
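To see the difference for yourself, here is a small sketch you can run. The sample data and the names via_reset and via_flag are made up for illustration; only column "A" comes from the question, and column "B" is an assumed numeric column to sum.

```python
import pandas as pd

# Hypothetical sample data: "A" is the grouping column, "B" is something to sum.
df1 = pd.DataFrame({"A": ["x", "y", "x"], "B": [1, 2, 3]})

# Default behaviour: the group keys become the index, not a column.
df2 = df1.groupby(["A"]).sum()
print(list(df2.columns))  # ['B']
print(df2.index.name)     # A

# Option 1: move the index back into an ordinary column.
via_reset = df2.reset_index()
print(list(via_reset.columns))  # ['A', 'B']

# Option 2: ask groupby not to use the keys as the index at all.
via_flag = df1.groupby(["A"], as_index=False).sum()
print(list(via_flag.columns))   # ['A', 'B']
```

Both options should give you A as a regular column with a default integer index. as_index=False saves a step when you know up front that you do not want the keys as the index.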