Search code examples
pythonpandas

Transforming Data with implicit categories in header with pandas


I have a table like:

           | 2022 | 2022 | 2021 | 2021
class      |   A  |   B  |   A  |   B
-----------|------|------|------|------
X          |   1  |   2  |   3  |   4
Y          |   5  |   6  |   7  |   8

How can I transform it to following form?

year   |  category | class | value
----------------------------------
2022   |     A     |   X   |   1
2022   |     A     |   Y   |   5
2022   |     B     |   X   |   2
2022   |     B     |   Y   |   6
2021   |     A     |   X   |   3
2021   |     A     |   Y   |   7
2021   |     B     |   X   |   4
2021   |     B     |   Y   |   8

I tried various combinations of pd.melt with no success..

Thx in advance!


Solution

  • You could melt with ignore_index=False and rename_axis/rename:

    out = (df.rename_axis(columns=['year', 'category'])
             .melt(ignore_index=False)
             .reset_index()
          )
    

    Or:

    out = (df.melt(ignore_index=False)
             .rename(columns={'variable_0': 'year',
                              'variable_1': 'category'})
             .reset_index()
          )
    

    Output:

      class  year category  value
    0     X  2022        A      1
    1     Y  2022        A      5
    2     X  2022        B      2
    3     Y  2022        B      6
    4     X  2021        A      3
    5     Y  2021        A      7
    6     X  2021        B      4
    7     Y  2021        B      8
    

    Reproducible input:

    df = pd.DataFrame.from_dict({'index': ['X', 'Y'],
                                 'columns': [('2022', 'A'), ('2022', 'B'), ('2021', 'A'), ('2021', 'B')],
                                 'data': [[1, 2, 3, 4], [5, 6, 7, 8]],
                                 'index_names': ['class'],
                                 'column_names': [None, None]},
                                orient='tight')