Here is the input dataframe. Case 1:
df = pd.DataFrame({'order': {0: '1',
1: '1',
2: '2',
3: '2',
4: '3',
5: '3'},
'start': {0: pd.Timestamp('2023-04-01 04:00:00+0000', tz='UTC'),
1: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
2: pd.Timestamp('2023-04-01 04:00:00+0000', tz='UTC'),
3: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
4: pd.Timestamp('2023-04-01 04:00:00+0000', tz='UTC'),
5: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC')},
'end': {0: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
1: pd.Timestamp('2023-06-01 04:00:00+0000', tz='UTC'),
2: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
3: pd.Timestamp('2023-06-01 04:00:00+0000', tz='UTC'),
4: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
5: pd.Timestamp('2023-06-01 04:00:00+0000', tz='UTC')},
'quant': {0: 10, 1: 10, 2: 20, 3: 30, 4: 40, 5: 50},
'price': {0: 44, 1: 44, 2: 5, 3: 6, 4: 8, 5: 8}})
input df image - >input df image
My requirement is to wide this dataframe based on order , So my expected output is expected df
Case 2:
`
df = pd.DataFrame({'order': {0: '1',
1: '1',
2: '2',
3: '2',
4: '3',
5: '3',
6: '3',
7: '3'},
'start': {0: pd.Timestamp('2023-04-01 04:00:00+0000', tz='UTC'),
1: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
2: pd.Timestamp('2023-04-01 04:00:00+0000', tz='UTC'),
3: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
4: pd.Timestamp('2023-04-01 04:00:00+0000', tz='UTC'),
5: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
6: pd.Timestamp('2023-03-01 04:00:00+0000', tz='UTC'),
7: pd.Timestamp('2023-02-01 04:00:00+0000', tz='UTC')},
'end': {0: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
1: pd.Timestamp('2023-06-01 04:00:00+0000', tz='UTC'),
2: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
3: pd.Timestamp('2023-06-01 04:00:00+0000', tz='UTC'),
4: pd.Timestamp('2023-05-01 04:00:00+0000', tz='UTC'),
5: pd.Timestamp('2023-06-01 04:00:00+0000', tz='UTC'),
6: pd.Timestamp('2023-04-01 04:00:00+0000', tz='UTC'),
7: pd.Timestamp('2023-03-01 04:00:00+0000', tz='UTC')},
'quant': {0: 10, 1: 10, 2: 20, 3: 30, 4: 40, 5: 50, 6:10, 7:10},
'price': {0: 44, 1: 44, 2: 5, 3: 6, 4: 8, 5: 8, 6:9, 7:8}})`
In this case we have two more rows for order 3. So resulting dataframe quant & price should be Nan for order 1 & 2 for extra rows of order 3.
input frame -> input df image
expected output is, expected output
Can anyone please help me with this ?
This is a pivot
, you can then reorder the levels:
out = (df.pivot(index=['start', 'end'], columns='order')
.sort_index(axis=1, level=1, sort_remaining=False)
.swaplevel(axis=1)
)
Output:
order 1 2 3
quant price quant price quant price
start end
2023-04-01 04:00:00+00:00 2023-05-01 04:00:00+00:00 10 44 20 5 40 8
2023-05-01 04:00:00+00:00 2023-06-01 04:00:00+00:00 10 44 30 6 50 8