I have a dataset like this
index Date_Time Pass_ID El
0 3/30/23 05:12:36.36 A 1
1 3/30/23 05:12:38.38 A 2
1 3/30/23 05:12:40.40 A 3
1 3/30/23 05:12:42.42 A 4
1 3/30/23 05:12:44.44 A 4
1 3/30/23 12:12:50.50 B 3
1 3/30/23 12:12:52.52 B 4
1 3/30/23 12:12:54.54 B 5
1 3/30/23 12:12:56.56 B 6
1 3/30/23 12:12:58.58 B 7
1 3/30/23 12:13:00.00 B 8
1 3/30/23 12:13:02.02 B 9
1 3/31/23 20:02:02.02 C 3
1 3/31/23 20:02:05.05 C 4
The Date_Time column has a pandas datetime dtype.
I'd like to group the records by Pass_ID, and then select only those unique Pass_IDs that occur between specific hours of the day: for instance, between 10:00 and 13:00 would return B.
I don't know how to get groupby and between_time to work together in this case, although that seems like the best way forward. I've also tried a lambda function after setting Date_Time as the index, but that didn't work, and aggregate doesn't seem to let me pull out the dt.hour of the Date_Time field. Does anyone know how to do this concisely?
Try:
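import pandas as pd

# convert to datetime if necessary
# df["Date_Time"] = pd.to_datetime(df["Date_Time"])
# between_time needs a DatetimeIndex, so set Date_Time as the index first
out = df.set_index("Date_Time").between_time("10:00", "13:00")["Pass_ID"].unique()
print(out)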
Prints:
['B']
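For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds only a few of the rows from the question (two per pass), and the explicit `format` string is my assumption about how the Date_Time strings are laid out.

```python
import pandas as pd

# A subset of the rows from the question (two per pass), for illustration only
df = pd.DataFrame(
    {
        "Date_Time": [
            "3/30/23 05:12:36.36",
            "3/30/23 05:12:38.38",
            "3/30/23 12:12:50.50",
            "3/30/23 12:12:52.52",
            "3/31/23 20:02:02.02",
            "3/31/23 20:02:05.05",
        ],
        "Pass_ID": ["A", "A", "B", "B", "C", "C"],
        "El": [1, 2, 3, 4, 3, 4],
    }
)

# Assumed layout: month/day/2-digit year, then time with fractional seconds
df["Date_Time"] = pd.to_datetime(df["Date_Time"], format="%m/%d/%y %H:%M:%S.%f")

# Keep rows whose time of day is within 10:00-13:00 (both ends inclusive by default),
# then collect the distinct Pass_ID values that remain
out = df.set_index("Date_Time").between_time("10:00", "13:00")["Pass_ID"].unique()
print(list(out))
```

Note that `between_time` looks only at the time of day and ignores the date, so a pass at 12:12 on any day would match.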
Or, if you want to keep only the groups whose rows all fall between 10:00 and 13:00:
out = (
    df.groupby("Pass_ID")
    # keep a group only if between_time drops none of its rows
    .filter(
        lambda x: len(x.set_index("Date_Time").between_time("10:00", "13:00")) == len(x)
    )["Pass_ID"]
    .unique()
)
print(out)
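The two versions only differ when a pass straddles the edge of the time window. Here is a small sketch of that difference; pass D is hypothetical (it is not in the question's data) and is there only to show a pass that starts inside the window and ends outside it.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Date_Time": pd.to_datetime(
            [
                "2023-03-30 05:12:36",
                "2023-03-30 05:12:38",
                "2023-03-30 12:12:50",
                "2023-03-30 12:12:52",
                "2023-03-31 12:59:58",  # hypothetical pass D: starts inside the window
                "2023-03-31 13:00:02",  # ... and ends just outside it
            ]
        ),
        "Pass_ID": ["A", "A", "B", "B", "D", "D"],
    }
)

# Version 1: a pass is returned if ANY of its rows falls in the window
any_in = df.set_index("Date_Time").between_time("10:00", "13:00")["Pass_ID"].unique()

# Version 2: a pass is returned only if ALL of its rows fall in the window
all_in = (
    df.groupby("Pass_ID")
    .filter(
        lambda x: len(x.set_index("Date_Time").between_time("10:00", "13:00")) == len(x)
    )["Pass_ID"]
    .unique()
)

print(list(any_in))
print(list(all_in))
```

With this data the first version should report both B and D, while the second reports only B, so pick whichever matches what "occurs between" means for your passes.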