I have the following type of a dataframe, values which are grouped by 3 different categories A,B,C:
import pandas as pd
A = ['A1', 'A2', 'A3', 'A2', 'A1']
B = ['B3', 'B2', 'B2', 'B1', 'B3']
C = ['C2', 'C2', 'C3', 'C1', 'C3']
value = ['6','2','3','3','5']
df = pd.DataFrame({'categA': A,'categB': B, 'categC': C, 'value': value})
df
Which looks like:
categA categB categC value
0 A1 B3 C2 6
1 A2 B2 C2 2
2 A3 B2 C3 3
3 A2 B1 C1 3
4 A1 B3 C3 5
Now, when I want to unstack this df by the C category, .unstack() returns some multi-indexed dataframe with 'value' at the first level and my categories of interest C1, C2 & C3 at the second level:
df = df.set_index(['categA','categB','categC']).unstack('categC')
df
Output:
value
categC C1 C2 C3
categA categB
A1 B3 NaN 6 5
A2 B1 3 NaN NaN
B2 NaN 2 NaN
A3 B2 NaN NaN 3
Is there a quick and clean way to get rid of the multi-index by reducing it to the highest available level? This is what I'd like to have as output:
categA categB C1 C2 C3
A1 B3 NaN 6 5
A2 B1 3 NaN NaN
B2 NaN 2 NaN
A3 B2 NaN NaN 3
Many thanks in advance!
Edit:
print(df.reset_index())
gives:
categA categB value
categC C1 C2 C3
0 A1 B3 NaN 6 5
1 A2 B1 3 NaN NaN
2 A2 B2 NaN 2 NaN
3 A3 B2 NaN NaN 3
Adding reset_index
also , unstack
with Series
df.set_index(['categA','categB','categC']).value.unstack('categC').reset_index()
Out[875]:
categC categA categB C1 C2 C3
0 A1 B3 None 6 5
1 A2 B1 3 None None
2 A2 B2 None 2 None
3 A3 B2 None None 3