I have a csv (see image below) with the first 2 rows as the column headers and I need the first header row to be the header and second header row to be transposed for every row.
I tried transposing and using the pivot functions but those didn't work. I also tried the pivot_table but that wasn't working either.
The expected output is:
I just assumed your CSV file is like the one I have inline in my code.
First, read the file and ignore the headers.
import io
import pandas as pd
import numpy as np
csv_file = io.StringIO('''\
,,,,Quantity,Quantity,Quantity,Dollars,Dollars,Dollars
Brand Family,Brand,Channel,Product,2022-01-12,2022-01-13,2022-01-14,2022-01-12,2022-01-13,2022-01-14
Brand A,Brand A,Reg,A1,18,47,41,216,164,492
Brand A,Brand A,Reg,A2,9,23,20,108,276,240
Brand A,Brand A,Reg,A3,28,80,82,392,1120,1148
Brand A,Brand A,Reg,A4,,,,,,
Brand A,Brand A,Reg,A5,7,15,13,98,210,182\
''')
df = pd.read_csv(csv_file, header=None).replace(np.nan, 0)
print(df)
0 1 2 3 4 5 6 7 8 9
0 0 0 0 0 Quantity Quantity Quantity Dollars Dollars Dollars
1 Brand Family Brand Channel Product 2022-01-12 2022-01-13 2022-01-14 2022-01-12 2022-01-13 2022-01-14
2 Brand A Brand A Reg A1 18 47 41 216 164 492
3 Brand A Brand A Reg A2 9 23 20 108 276 240
4 Brand A Brand A Reg A3 28 80 82 392 1120 1148
5 Brand A Brand A Reg A4 0 0 0 0 0 0
6 Brand A Brand A Reg A5 7 15 13 98 210 182
And let's divide it into two pieces, the left part
left = df.iloc[2:, :4].set_axis(df.iloc[1, :4], axis='columns')
print(left)
1 Brand Family Brand Channel Product
2 Brand A Brand A Reg A1
3 Brand A Brand A Reg A2
4 Brand A Brand A Reg A3
5 Brand A Brand A Reg A4
6 Brand A Brand A Reg A5
and the right part.
right = df.iloc[:, 4:].T
print(right)
0 1 2 3 4 5 6
4 Quantity 2022-01-12 18 9 28 0 7
5 Quantity 2022-01-13 47 23 80 0 15
6 Quantity 2022-01-14 41 20 82 0 13
7 Dollars 2022-01-12 216 108 392 0 98
8 Dollars 2022-01-13 164 276 1120 0 210
9 Dollars 2022-01-14 492 240 1148 0 182
Now, pivot the right part
pivoted = right.pivot(index=0, columns=[1]).T.reset_index(level=1, names=[0, 'Date'])
print(pivoted)
and note the index is consistent with the left part's.
0 Date Dollars Quantity
2 2022-01-12 216 18
2 2022-01-13 164 47
2 2022-01-14 492 41
3 2022-01-12 108 9
3 2022-01-13 276 23
3 2022-01-14 240 20
4 2022-01-12 392 28
4 2022-01-13 1120 80
4 2022-01-14 1148 82
5 2022-01-12 0 0
5 2022-01-13 0 0
5 2022-01-14 0 0
6 2022-01-12 98 7
6 2022-01-13 210 15
6 2022-01-14 182 13
Finally, join them by the common index.
df = left.join(pivoted).reset_index(drop=True)
print(df)
Brand Family Brand Channel Product Date Dollars Quantity
0 Brand A Brand A Reg A1 2022-01-12 216 18
1 Brand A Brand A Reg A1 2022-01-13 164 47
2 Brand A Brand A Reg A1 2022-01-14 492 41
3 Brand A Brand A Reg A2 2022-01-12 108 9
4 Brand A Brand A Reg A2 2022-01-13 276 23
5 Brand A Brand A Reg A2 2022-01-14 240 20
6 Brand A Brand A Reg A3 2022-01-12 392 28
7 Brand A Brand A Reg A3 2022-01-13 1120 80
8 Brand A Brand A Reg A3 2022-01-14 1148 82
9 Brand A Brand A Reg A4 2022-01-12 0 0
10 Brand A Brand A Reg A4 2022-01-13 0 0
11 Brand A Brand A Reg A4 2022-01-14 0 0
12 Brand A Brand A Reg A5 2022-01-12 98 7
13 Brand A Brand A Reg A5 2022-01-13 210 15
14 Brand A Brand A Reg A5 2022-01-14 182 13