I have a dataframe like this:
Sequence | Message |
---|---|
1 | A |
2 | B |
3 | C |
4 | D |
1 | A |
2 | C |
3 | D |
4 | E |
5 | F |
1 | A |
2 | C |
3 | D |
And I'm trying to pivot it to have table with maximum Sequence
length as columns and corresponding possible Message
as values.
1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|
A | B | C | D | |||
A | C | D | E | F | ||
A | C | D |
I'm trying to use pd.get_dummies() for Sequence
, but can't figure out how to keep Message
values.
Appreciate any help.
Use a pivot
with df['Sequence'].diff().le(0).cumsum()
as index (new row when the Sequence restarts to an earlier number):
(df.assign(index=df['Sequence'].diff().le(0).cumsum())
.pivot(index='index', columns='Sequence', values='Message')
.rename_axis(index=None, columns=None) # optional
)
Output:
1 2 3 4 5
0 A B C D NaN
1 A C D E F
2 A C D NaN NaN