Search code examples
pythonpandasmulti-index

Split column into multindex columns based on unique values


If I have a data frame

test = pd.DataFrame({'IDs': ['one', 'one', 'one', 'two', 'two',
                            'two'],
                    'X': ['A', 'B', 'C', 'A', 'B', 'C'],
                    'Y': [1, 2, 3, 4, 5, 6]})

that looks like

    IDs     X   Y
0   one     A   1
1   one     B   2
2   one     C   3
3   two     A   4
4   two     B   5
5   two     C   6

what is the best way to output the format such as:

IDs  one     two    
Pos X   Y   X   Y
0   A   1   A   4
1   B   2   B   5
2   C   3   C   6

The ID's can change in name and length (i.e. number of actual ID's).


Solution

  • Use GroupBy.cumcount for counter, then create MultiIndex by DataFrame.set_index, reshape by DataFrame.unstack and last DataFrame.swaplevel with DataFrame.sort_index for change level order:

    df = (test.set_index([test.groupby('IDs').cumcount(), 'IDs'])
               .unstack()
               .rename_axis(('Pos','IDs'), axis=1)
               .swaplevel(1, 0, axis=1)
               .sort_index(axis=1))
    print (df)
    IDs one    two   
    Pos   X  Y   X  Y
    0     A  1   A  4
    1     B  2   B  5
    2     C  3   C  6