After a long While I visited to the SO's pandas section and got a question which is not indeed nicely framed thus thought to put here in an explicit way as similar kind of situation I'm too as well :-)
Below is the data frame construct:
>>> df
measure Pend Job Run Job Time
cls
ABC [inter, batch] [101, 93] [302, 1327] [56, 131]
DEF [inter, batch] [24279, 421] [4935, 5452] [75, 300]
I tried working hard but didn't get any solution, thus though to Sketch it here as that's somewhat I would like it be achieved.
----------------------------------------------------------------------------------
| |Pend Job | Run Job | Time |
cls | measure |-----------------------------------------------------------
| |inter | batch| |inter | batch| |inter | batch |
----|-----------------|------|------|-------|------|------|-----|------|----------
ABC |inter, batch |101 |93 | |302 |1327 | |56 |131 |
----|-----------------|-------------|-------|------|------|-----|------|---------|
DEF |inter, batch |24279 |421 | |4935 |5452 | |75 |300 |
----------------------------------------------------------------------------------
Saying that I want my dataFrame into MultiIndex Dataframe where Pend Job
, Run Job
, and Time
to be on the top as above.
cls
is not in the columns
This is my approach, you can modify it to your need:
s = (df.drop('measure', axis=1) # remove the measure column
.set_index(df['measure'].apply(', '.join),
append=True) # make `measure` second level index
.stack().explode().to_frame() # concatenate all the values
)
# assign `inter` and `batch` label to each new cell
new_lvl = np.array(['inter','batch'])[s.groupby(level=(0,1,2)).cumcount()]
# or
# new_lvl = np.tile(['inter', 'batch'], len(s)//2)
(s.set_index(new_level, append=True)[0]
.unstack(level=(-2,-1)
.reset_index()
)
Output:
cls measure Pend Job
inter batch
0 ABC inter, batch 101 93
1 DEF inter, batch 24279 421