I have an excel file which I've read using Pandas the output is as follows:
+------------+-----+-----+-----+
| Type | 1 | 2 | 3 |
| Category | A | A | C |
| Dates | NaN | NaN | NaN |
| 01/01/2021 | 12 | 12 | 9 |
| 02/01/2021 | 10 | 10 | 2 |
| 03/01/2021 | 30 | 16 | NaN |
| 04/01/2021 | 15 | 23 | 4 |
| 05/01/2021 | 14 | 20 | 5 |
+------------+-----+-----+-----+
The first two rows gives information for each time series column wise. So for column 1
the Type
is 1
and Category
is A
. I want to melt the time series and not quite sure how to approach the problem given the structure of the sheet.
Expected Output:
+------------+-------+----------+------+
| Dates | Price | Category | Type |
+------------+-------+----------+------+
| 01/01/2021 | 12 | A | 1 |
| 02/01/2021 | 10 | A | 1 |
| 03/01/2021 | 30 | A | 1 |
| 04/01/2021 | 15 | A | 1 |
| 05/01/2021 | 14 | A | 1 |
| 01/01/2021 | 12 | B | 2 |
| 02/01/2021 | 10 | B | 2 |
| 03/01/2021 | 16 | B | 2 |
| 04/01/2021 | 23 | B | 2 |
| 05/01/2021 | 20 | B | 2 |
| 01/01/2021 | 9 | C | 3 |
| 02/01/2021 | 2 | C | 3 |
| 04/01/2021 | 4 | C | 3 |
| 05/01/2021 | 5 | C | 3 |
+------------+-------+----------+------+
In case of Type 3
and Category C
since value is NaN
we drop that date. How can achieve the expected output?
Assuming the following input dataframe:
col0 col1 col2 col3
0 Type 1 2 3
1 Category A A C
2 Dates NaN NaN NaN
3 01/01/2021 12 12 9
4 02/01/2021 10 10 2
5 03/01/2021 30 16 NaN
6 04/01/2021 15 23 4
7 05/01/2021 14 20 5
Here is a working pipeline:
(df.iloc[3:]
.set_index('col0').rename_axis('Date') # set first column aside
# next 3 lines to rename columns index
.T
.set_index(pd.MultiIndex.from_arrays(df.iloc[:2, 1:].values, names=df.iloc[:2, 0]))
.T
.stack(level=[0,1]) # columns to rows
.rename('Price') # rename last unnamed column
.reset_index() # all indexes back to columns
)
output:
Date Type Category Price
0 01/01/2021 1 A 12
1 01/01/2021 2 A 12
2 01/01/2021 3 C 9
3 02/01/2021 1 A 10
4 02/01/2021 2 A 10
5 02/01/2021 3 C 2
6 03/01/2021 1 A 30
7 03/01/2021 2 A 16
8 04/01/2021 1 A 15
9 04/01/2021 2 A 23
10 04/01/2021 3 C 4
11 05/01/2021 1 A 14
12 05/01/2021 2 A 20
13 05/01/2021 3 C 5