Search code examples
pythonpandasdataframerowcol

How to convert the first row of the dataframe to a column


i'm working with pandas, and im trying to convert the first row of the dataframe to a column, i have tried the stack() function, but it looks like f%#¤ed. I cannot really use stack() since i have columns of different headers. let's say, your data frame looks like following: (red box: how my dataframe looks like (yellow box: how i want my dataframe to look like) enter image description here

I'm gratefull if you can help me.

Cheers,


Solution

  • I think you need:

    df = df.stack().rename_axis(('a','b','c')).reorder_levels([2,0,1]).sort_index() 
    print (df)
    c     a  b 
    2014  A  A1    1
             A1    9
          B  B1    9
             B2    1
    2015  A  A1    5
             A1    4
          B  B1    1
             B2    7
    2016  A  A1    5
             A1    1
          B  B1    6
             B2    7
    dtype: int64
    

    And if need columns from Multiindex:

    df = df.stack()
           .rename_axis(('a','b','c'))
           .reorder_levels([2,0,1])
           .sort_index()
           .reset_index(name='count')
    print (df)
    
           c  a   b  count
    0   2014  A  A1      1
    1   2014  A  A1      9
    2   2014  B  B1      9
    3   2014  B  B2      1
    4   2015  A  A1      5
    5   2015  A  A1      4
    6   2015  B  B1      1
    7   2015  B  B2      7
    8   2016  A  A1      5
    9   2016  A  A1      1
    10  2016  B  B1      6
    11  2016  B  B2      7
    

    EDIT:

    There is problem you need convert first and second column to Multiindex by parameter index_col:

    tab3 = "C:\\Users\\abk29\\NordicMontlyDemand\\NordicDY\\Book1.xls"
    df = pd.read_excel(tab3, sheetname="Sheet1", skiprows=2, index_col=[0,1])
    print (df)
                  2016  2017  2018
      Unnamed: 1                  
    A A1             1     1     1
      A2             2     2     2
    B B1             3     3     3
      B2             4     4     4
    df = df.stack().rename_axis(('a','b','c')).reorder_levels([2,0,1]).sort_index()
    print (df)
    c     a  b 
    2016  A  A1    1
             A2    2
          B  B1    3
             B2    4
    2017  A  A1    1
             A2    2
          B  B1    3
             B2    4
    2018  A  A1    1
             A2    2
          B  B1    3
             B2    4
    dtype: int64