Search code examples
pandasmulti-index

Reset index when both index and columns are multi indices


I have a df with multi index in both rows and cols, and I want to reset_index on both index and cols so that all the mulitindices are new columns. Below is an example of what I have and what I want.

What I have:

indexAarrays = [['bar', 'bar', 'baz', 'baz', ],
          ['one', 'two', 'one', 'two']]
indexTuples = list(zip(*indexAarrays))
index =  pd.MultiIndex.from_tuples(indexTuples, names=['firstIndex', 'secondIndex'])

colAarrays = [['c1', 'c1', 'c2', 'c2', ],
          ['d1', 'd2', 'd1', 'd2']]
colTuples = list(zip(*colAarrays ))
col =  pd.MultiIndex.from_tuples(colTuples, names=['firstCol', 'secondCol'])

df = pd.DataFrame(data=np.random.random_sample((len(index), len(col))), 
                  index=index, columns=col)
df

The above gives the DF i have:

firstCol                      c1                  c2          
secondCol                     d1        d2        d1        d2
firstIndex secondIndex                                        
bar        one          0.231221  0.846196  0.037493  0.516474
           two          0.810847  0.204095  0.423766  0.513262
baz        one          0.433040  0.118018  0.267039  0.356261
           two          0.529042  0.181886  0.093488  0.643357

What I want:

wantedCols = [idxName for idxName in index.names] \
                        + [colName for colName in col.names]\
                        + ['Value']
dfWanted = pd.DataFrame(index = range(int(df.shape[0]*df.shape[1]/(len(wantedCols)-1))),
                        columns=wantedCols)

idxCounter = 0
for idx1 in df.index.get_level_values(0).unique():
    for idx2 in df.index.get_level_values(1).unique():
        for c1 in df.columns.get_level_values(0).unique():
            for c2 in df.columns.get_level_values(1).unique():
                dfWanted.loc[idxCounter, 'firstIndex'] = idx1
                dfWanted.loc[idxCounter, 'secondIndex'] = idx2
                dfWanted.loc[idxCounter, 'firstCol'] = c1
                dfWanted.loc[idxCounter, 'secondCol'] = c2
                dfWanted.loc[idxCounter, 'Value'] = df.loc[(idx1, idx2), (c1, c2)]
                idxCounter += 1

dfWanted     

The above gives the DF I want:

firstIndex secondIndex firstCol secondCol      Value
0         bar         one       c1        d1   0.231221
1         bar         one       c1        d2   0.846196
2         bar         one       c2        d1   0.037493
3         bar         one       c2        d2   0.516474
4         bar         two       c1        d1   0.810847
5         bar         two       c1        d2   0.204095
6         bar         two       c2        d1   0.423766
7         bar         two       c2        d2   0.513262
8         baz         one       c1        d1    0.43304
9         baz         one       c1        d2   0.118018
10        baz         one       c2        d1   0.267039
11        baz         one       c2        d2   0.356261
12        baz         two       c1        d1   0.529042
13        baz         two       c1        d2   0.181886
14        baz         two       c2        d1  0.0934878
15        baz         two       c2        d2   0.643357    

Does anyone know of an easier way to reset the indices than the method I used above?


Solution

  • Use DataFrame.stack by both levels and then convert MultiIndex Series to columns by Series.reset_index:

    df = df.stack([0,1]).reset_index(name='Value')
    print (df)
       firstIndex secondIndex firstCol secondCol     Value
    0         bar         one       c1        d1  0.746027
    1         bar         one       c1        d2  0.622784
    2         bar         one       c2        d1  0.613197
    3         bar         one       c2        d2  0.449560
    4         bar         two       c1        d1  0.560810
    5         bar         two       c1        d2  0.125046
    6         bar         two       c2        d1  0.147148
    7         bar         two       c2        d2  0.622862
    8         baz         one       c1        d1  0.537280
    9         baz         one       c1        d2  0.801410
    10        baz         one       c2        d1  0.889445
    11        baz         one       c2        d2  0.226477
    12        baz         two       c1        d1  0.100759
    13        baz         two       c1        d2  0.279383
    14        baz         two       c2        d1  0.041767
    15        baz         two       c2        d2  0.739942