I have a Pandas Dataframe
with MultiIndex
in the row indexers like this:
This dataframe is a result of a groupby
operation and then slicing from a 3-level MultiIndex
.I would like the 'date' row indexer to remain, but shift the 'SlabType' level of row indexers into column indexer with non-available values as NaN
.
This is what I would like to get to:
What operations do I need to do to achieve this? Also if the title of the question can be improved, please suggest so.
Use unstack
with select column SlabLT
:
print (df['SlabLT'].unstack())
But if possible duplicates in MultiIndex
is necessary aggregate column, a.g. by mean
:
print (df.groupby(level=[0,1])['SlabLT'].mean().unstack())
Sample:
df = pd.DataFrame({'date':['2017-10-01','2017-10-08','2017-10-08','2017-10-15', '2017-10-15'],
'SlabType':['UOM2','AMOUNT','UOM2','AMOUNT','AMOUNT'],
'SlabLT':[1,6000,1,6000,5000]}).set_index(['date','SlabType'])
print (df)
SlabLT
date SlabType
2017-10-01 UOM2 1
2017-10-08 AMOUNT 6000
UOM2 1
2017-10-15 AMOUNT 6000 <-duplicated MultiIndex '2017-10-15', 'AMOUNT'
AMOUNT 5000 <-duplicated MultiIndex '2017-10-15', 'AMOUNT'
print (df['SlabLT'].unstack())
ValueError: Index contains duplicate entries, cannot reshape
print (df.groupby(level=[0,1])['SlabLT'].mean())
date SlabType
2017-10-01 UOM2 1
2017-10-08 AMOUNT 6000
UOM2 1
2017-10-15 AMOUNT 5500
Name: SlabLT, dtype: int64
print (df.groupby(level=[0,1])['SlabLT'].mean().unstack())
SlabType AMOUNT UOM2
date
2017-10-01 NaN 1.0
2017-10-08 6000.0 1.0
2017-10-15 5500.0 NaN