Pandas data frame splitting by cycles

I have a pandas data frame with the stops and scheduled times of a single transit route over a given day. I would like to split it into multiple frames, each corresponding to an individual trip made by a bus, based only on the repeating sequence of stops and not on the scheduled times.

For example, the following frame contains two A->B->C trips, so I am looking for a way to split it (i.e., at index 3 in this case) such that each sub-frame has the same sequence of stops.

import pandas as pd
df = pd.DataFrame({
    "scheduled": ["2023-05-25 13:00", "2023-05-25 13:15", "2023-05-25 13:45", "2023-05-25 14:35", "2023-05-25 14:50", "2023-05-25 15:20"],
    "stop": ["A", "B", "C", "A", "B", "C"]
})
df["scheduled"] = pd.to_datetime(df["scheduled"])

Solution

Assuming you don't know how many stops there are, but the pattern always repeats and the first stop appears only once per trip, you can compare every stop to the first stop name, increment a counter with cumsum each time that stop is found, then use groupby on the counter to split:

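# True wherever a trip restarts at the first stop; cumsum numbers the trips 1, 2, ...
group = df['stop'].eq(df['stop'].iloc[0]).cumsum()

# one sub-frame per trip number
out = [g for _, g in df.groupby(group)]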

Output (times shown in 12-hour form):

[  scheduled stop
 0   1:00 pm    A
 1   1:15 pm    B
 2   1:45 pm    C,
   scheduled stop
 3   2:35 pm    A
 4   2:50 pm    B
 5   3:20 pm    C]

Intermediate frame with the group number added as a column:

  scheduled stop  group
0   1:00 pm    A      1
1   1:15 pm    B      1
2   1:45 pm    C      1
3   2:35 pm    A      2
4   2:50 pm    B      2
5   3:20 pm    C      2
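
For reference, here is a minimal self-contained sketch of the same cumsum/groupby idea, wrapped in a helper so it can be reused. The function name split_by_first_stop and its stop_col parameter are made up for illustration; the sanity check at the end is an addition that confirms every sub-frame visits the same stops in the same order.

```python
import pandas as pd

df = pd.DataFrame({
    "scheduled": pd.to_datetime([
        "2023-05-25 13:00", "2023-05-25 13:15", "2023-05-25 13:45",
        "2023-05-25 14:35", "2023-05-25 14:50", "2023-05-25 15:20",
    ]),
    "stop": ["A", "B", "C", "A", "B", "C"],
})


def split_by_first_stop(frame, stop_col="stop"):
    """Hypothetical helper: start a new trip each time the first stop reappears.

    Assumes the frame begins at the start of a trip and that the first
    stop occurs only once per trip.
    """
    first = frame[stop_col].iloc[0]
    # Boolean mask of trip starts, turned into a running trip number.
    group = frame[stop_col].eq(first).cumsum()
    return [g for _, g in frame.groupby(group)]


trips = split_by_first_stop(df)

# Sanity check: collect each trip's stop sequence and compare them.
sequences = [tuple(t["stop"]) for t in trips]
same_sequence = len(set(sequences)) == 1
print(len(trips), same_sequence)
```

If same_sequence comes out False, the route probably has short-turn or partial trips, and splitting on the first stop alone is likely not enough.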

Another option: compute the number of unique stops (nunique) and use it to split with numpy.array_split. This assumes every trip contains each stop exactly once, so all trips have the same length:

import numpy as np

n = df['stop'].nunique()
out = np.array_split(df, range(n, len(df), n))
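
As a variation on the fixed-length split, the same chunks can be taken with plain positional slicing, which avoids passing a DataFrame through NumPy (on some recent pandas versions, np.array_split on a DataFrame may emit a deprecation warning). This is only a sketch under the same assumption as above: every trip has exactly n rows and no stop repeats within a trip. The name chunks is arbitrary.

```python
import pandas as pd

df = pd.DataFrame({
    "scheduled": ["2023-05-25 13:00", "2023-05-25 13:15", "2023-05-25 13:45",
                  "2023-05-25 14:35", "2023-05-25 14:50", "2023-05-25 15:20"],
    "stop": ["A", "B", "C", "A", "B", "C"],
})

# Stops per trip, assuming each stop occurs exactly once in every trip.
n = df["stop"].nunique()

# Take n rows at a time by position; the original index labels are kept.
chunks = [df.iloc[i:i + n] for i in range(0, len(df), n)]

for chunk in chunks:
    print(chunk["stop"].tolist())
```

If the number of rows is not a multiple of n, the last chunk is simply shorter, which can serve as a hint that the equal-length assumption does not hold for the data.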