I have a dataframe like this:
Time      User  Route
11:03:01  1234  home
11:03:04  1234  category
11:03:10  1234  product
11:03:21  1234  cart
11:04:01  4321  home
11:04:04  4321  category
11:04:10  4321  product
11:04:21  4321  cart
I want to create this:
Time      User  Route     Journey
11:03:01  1234  home      home
11:03:04  1234  category  home, category
11:03:10  1234  product   home, category, product
11:03:21  1234  cart      home, category, product, cart
11:04:01  4321  home      home
11:04:04  4321  category  home, category
11:04:10  4321  product   home, category, product
11:04:21  4321  cart      home, category, product, cart
How can I build this cumulative Journey column, per user, in a pandas dataframe?
Append the separator to each route, take a cumulative sum within each user, then strip the trailing separator:
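# Add ', ' to every route, concatenate cumulatively within each user,
# then drop the trailing ', ' (2 characters) from each result.
df['Journey'] = (df['Route'].add(', ')
                   .groupby(df['User'])
                   .transform(lambda x: x.cumsum().str[:-2])
                )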
Output:
   Time      User  Route     Journey
0  11:03:01  1234  home      home
1  11:03:04  1234  category  home, category
2  11:03:10  1234  product   home, category, product
3  11:03:21  1234  cart      home, category, product, cart
4  11:04:01  4321  home      home
5  11:04:04  4321  category  home, category
6  11:04:10  4321  product   home, category, product
7  11:04:21  4321  cart      home, category, product, cart
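If you would rather not rely on cumsum concatenating strings, here is a minimal sketch of the same idea using itertools.accumulate. It rebuilds the sample frame from the question so it runs on its own. The helper name running_journey is just illustrative, and it assumes the rows are already in chronological order within each user.

```python
from itertools import accumulate

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Time": ["11:03:01", "11:03:04", "11:03:10", "11:03:21",
             "11:04:01", "11:04:04", "11:04:10", "11:04:21"],
    "User": [1234] * 4 + [4321] * 4,
    "Route": ["home", "category", "product", "cart"] * 2,
})


def running_journey(routes: pd.Series) -> pd.Series:
    """Hypothetical helper: running comma-separated path for one user's routes."""
    # accumulate yields the first route unchanged, then path + ', ' + next route.
    paths = accumulate(routes, lambda path, step: f"{path}, {step}")
    # Keep the group's original index so transform can align the result.
    return pd.Series(list(paths), index=routes.index)


# Assumes rows are already sorted by time within each user.
df["Journey"] = df.groupby("User")["Route"].transform(running_journey)
print(df)
```

Because the separator is only inserted between routes, there is no trailing ', ' to slice off afterwards. If your real data is not sorted, sort by Time (or by User and Time) before grouping.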