i have timeseries data spanning multiple days that i need to sequence (i.e., create a column each time a value changes). i have the sequencing working without groupby, but i'm a little lost on how to apply the same or similar code to the grouped data.
my data looks like:
index timestamp value
0 1684713605000 1
1 1684713610000 1
2 1684713611000 1
3 1684713614000 0
4 1684713615000 0
5 1684713616000 0
6 1684713619000 1
7 1684713620000 1
8 1684713621000 1
9 1684832896000 1
10 1684832897000 1
11 1684832898000 1
12 1684832901000 0
13 1684832902000 0
14 1684832903000 0
15 1684832906000 1
16 1684832907000 1
17 1684832908000 1
my timestamp
column is not guaranteed to be sequential, but is generally one record per second of the day. i need my desired sequence
column to increment up until the end of the day, then begin counting again at 0 the next day.
the code i'm using to sequence is:
subset = df[["value"]].copy()
subset["change"] = (subset["value"].shift() != subset["value"]) * 1
subset["seq"] = subset["change"].cumsum(axis = 0) - 1
df["seq"] = subset["seq"]
i've been able to create groups with:
subset = df[["timestamp", "value"]].copy()
subset["date"] = pd.to_datetime(subset["timestamp"], unit="ms", origin="unix").dt.date
g = subset.groupby("date")
but i'm not sure how to proceed. my desired result is a sequence column that increments every time value
changes but resets
index timestamp value seq
0 1684713605000 1 0
1 1684713610000 1 0
2 1684713611000 1 0
3 1684713614000 0 1
4 1684713615000 0 1
5 1684713616000 0 1
6 1684713619000 1 2
7 1684713620000 1 2
8 1684713621000 1 2
9 1684832896000 1 0 <-- first record of a new day
10 1684832897000 1 0
11 1684832898000 1 0
12 1684832901000 0 1
13 1684832902000 0 1
14 1684832903000 0 1
15 1684832906000 1 2
16 1684832907000 1 2
17 1684832908000 1 2
Try:
t = pd.to_datetime(df["timestamp"], unit="ms", origin="unix").dt.date
df["seq"] = df.groupby(t, group_keys=False)["value"].apply(
lambda x: x.ne(x.shift()).cumsum() - 1
)
print(df)
Prints:
index timestamp value seq
0 0 1684713605000 1 0
1 1 1684713610000 1 0
2 2 1684713611000 1 0
3 3 1684713614000 0 1
4 4 1684713615000 0 1
5 5 1684713616000 0 1
6 6 1684713619000 1 2
7 7 1684713620000 1 2
8 8 1684713621000 1 2
9 9 1684832896000 1 0
10 10 1684832897000 1 0
11 11 1684832898000 1 0
12 12 1684832901000 0 1
13 13 1684832902000 0 1
14 14 1684832903000 0 1
15 15 1684832906000 1 2
16 16 1684832907000 1 2
17 17 1684832908000 1 2