Search code examples
pythonpandasnumpypivottranspose

Pandas transpose multiple columns for every row in dataframe with 2 header rows


I have a csv (see image below) with the first 2 rows as the column headers and I need the first header row to be the header and second header row to be transposed for every row.

enter image description here

I tried transposing and using the pivot functions but those didn't work. I also tried the pivot_table but that wasn't working either.

The expected output is:

enter image description here


Solution

  • I just assumed your CSV file is like the one I have inline in my code.

    Solution

    First, read the file and ignore the headers.

    import io
    import pandas as pd
    import numpy as np
    
    csv_file = io.StringIO('''\
    ,,,,Quantity,Quantity,Quantity,Dollars,Dollars,Dollars
    Brand Family,Brand,Channel,Product,2022-01-12,2022-01-13,2022-01-14,2022-01-12,2022-01-13,2022-01-14
    Brand A,Brand A,Reg,A1,18,47,41,216,164,492
    Brand A,Brand A,Reg,A2,9,23,20,108,276,240
    Brand A,Brand A,Reg,A3,28,80,82,392,1120,1148
    Brand A,Brand A,Reg,A4,,,,,,
    Brand A,Brand A,Reg,A5,7,15,13,98,210,182\
    ''')
    
    df = pd.read_csv(csv_file, header=None).replace(np.nan, 0)
    print(df)
    
                  0        1        2        3           4           5           6           7           8           9
    0             0        0        0        0    Quantity    Quantity    Quantity     Dollars     Dollars     Dollars
    1  Brand Family    Brand  Channel  Product  2022-01-12  2022-01-13  2022-01-14  2022-01-12  2022-01-13  2022-01-14
    2       Brand A  Brand A      Reg       A1          18          47          41         216         164         492
    3       Brand A  Brand A      Reg       A2           9          23          20         108         276         240
    4       Brand A  Brand A      Reg       A3          28          80          82         392        1120        1148
    5       Brand A  Brand A      Reg       A4           0           0           0           0           0           0
    6       Brand A  Brand A      Reg       A5           7          15          13          98         210         182
    

    And let's divide it into two pieces, the left part

    left = df.iloc[2:, :4].set_axis(df.iloc[1, :4], axis='columns')
    print(left)
    
    1 Brand Family    Brand Channel Product
    2      Brand A  Brand A     Reg      A1
    3      Brand A  Brand A     Reg      A2
    4      Brand A  Brand A     Reg      A3
    5      Brand A  Brand A     Reg      A4
    6      Brand A  Brand A     Reg      A5
    

    and the right part.

    right = df.iloc[:, 4:].T
    print(right)
    
              0           1    2    3     4  5    6
    4  Quantity  2022-01-12   18    9    28  0    7
    5  Quantity  2022-01-13   47   23    80  0   15
    6  Quantity  2022-01-14   41   20    82  0   13
    7   Dollars  2022-01-12  216  108   392  0   98
    8   Dollars  2022-01-13  164  276  1120  0  210
    9   Dollars  2022-01-14  492  240  1148  0  182
    

    Now, pivot the right part

    pivoted = right.pivot(index=0, columns=[1]).T.reset_index(level=1, names=[0, 'Date'])
    print(pivoted)
    

    and note the index is consistent with the left part's.

    0        Date Dollars Quantity
    2  2022-01-12     216       18
    2  2022-01-13     164       47
    2  2022-01-14     492       41
    3  2022-01-12     108        9
    3  2022-01-13     276       23
    3  2022-01-14     240       20
    4  2022-01-12     392       28
    4  2022-01-13    1120       80
    4  2022-01-14    1148       82
    5  2022-01-12       0        0
    5  2022-01-13       0        0
    5  2022-01-14       0        0
    6  2022-01-12      98        7
    6  2022-01-13     210       15
    6  2022-01-14     182       13
    

    Finally, join them by the common index.

    df = left.join(pivoted).reset_index(drop=True)
    print(df)
    
       Brand Family    Brand Channel Product        Date Dollars Quantity
    0       Brand A  Brand A     Reg      A1  2022-01-12     216       18
    1       Brand A  Brand A     Reg      A1  2022-01-13     164       47
    2       Brand A  Brand A     Reg      A1  2022-01-14     492       41
    3       Brand A  Brand A     Reg      A2  2022-01-12     108        9
    4       Brand A  Brand A     Reg      A2  2022-01-13     276       23
    5       Brand A  Brand A     Reg      A2  2022-01-14     240       20
    6       Brand A  Brand A     Reg      A3  2022-01-12     392       28
    7       Brand A  Brand A     Reg      A3  2022-01-13    1120       80
    8       Brand A  Brand A     Reg      A3  2022-01-14    1148       82
    9       Brand A  Brand A     Reg      A4  2022-01-12       0        0
    10      Brand A  Brand A     Reg      A4  2022-01-13       0        0
    11      Brand A  Brand A     Reg      A4  2022-01-14       0        0
    12      Brand A  Brand A     Reg      A5  2022-01-12      98        7
    13      Brand A  Brand A     Reg      A5  2022-01-13     210       15
    14      Brand A  Brand A     Reg      A5  2022-01-14     182       13