I have an ordered list of dates:
[
datetime.date(2006, 8, 15),
datetime.date(2006, 9, 12),
datetime.date(2007, 8, 10),
datetime.date(2021, 4, 6),
datetime.date(2021, 4, 16),
datetime.date(2021, 4, 19)
...
]
I would like to split it into groups in which all dates lie within 30 days of each other (the distance between the first element of a group and the last element of that group must be <= 30 days).
For example, using the previous list, I will get:
I tried to use itertools.groupby, but the key function receives a single element, so it does not allow a comparison between two dates like "lambda x,y: (x-y).days <= 30....". I don't know whether I can use groupby to solve this problem or whether I need another itertools function. I know I could write the algorithm by hand in Python, but I suspect a simpler option exists that I haven't found :(
Thanks!
An iterative solution with a plain old for-loop is quite straightforward in this case.
I don't think it will be easy or efficient to use itertools to solve this problem: whether a date joins the current group depends on the first date of that group, not on the date alone, so the grouping depends on the context of the data. Working around that will probably yield an O(N^2) solution, whereas the iterative approach is O(N).
import datetime

dts = [
datetime.date(2006, 8, 15),
datetime.date(2006, 9, 12),
datetime.date(2007, 8, 10),
datetime.date(2021, 4, 6),
datetime.date(2021, 4, 16),
datetime.date(2021, 4, 19)
]
def groupDateTimes(dts):
    ans = []     # finished groups
    group = []   # group currently being built
    delta30days = datetime.timedelta(days=30)
    for cur in dts:
        if not group:
            group.append(cur)
        elif cur - group[0] <= delta30days:
            # Still within 30 days of the group's first date.
            group.append(cur)
        else:
            # Too far from the group's first date: close it and start a new one.
            ans.append(group)
            group = [cur]
    if group:
        ans.append(group)
    return ans

print(groupDateTimes(dts))
# [[datetime.date(2006, 8, 15), datetime.date(2006, 9, 12)],
#  [datetime.date(2007, 8, 10)],
#  [datetime.date(2021, 4, 6), datetime.date(2021, 4, 16), datetime.date(2021, 4, 19)]]
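If the list is large, or the window size needs to vary, the same single-pass idea can be written as a generator. This is only a sketch of one possible variation: the names `iter_date_groups`, `max_days` and `sample` are made up for illustration, and it assumes the input is already sorted in ascending order, as in the question.

```python
import datetime
from typing import Iterable, Iterator, List


def iter_date_groups(
    dates: Iterable[datetime.date], max_days: int = 30
) -> Iterator[List[datetime.date]]:
    """Yield groups whose dates all lie within max_days of the group's first date.

    Assumes `dates` is sorted in ascending order.
    """
    window = datetime.timedelta(days=max_days)
    group: List[datetime.date] = []
    for d in dates:
        # Close the current group once d falls outside the window
        # anchored at the group's first date.
        if group and d - group[0] > window:
            yield group
            group = []
        group.append(d)
    if group:
        yield group


sample = [
    datetime.date(2006, 8, 15),
    datetime.date(2006, 9, 12),
    datetime.date(2007, 8, 10),
    datetime.date(2021, 4, 6),
    datetime.date(2021, 4, 16),
    datetime.date(2021, 4, 19),
]

groups = list(iter_date_groups(sample))
for g in groups:
    print([d.isoformat() for d in g])
```

Each group is anchored at its first date, so it is still one pass over the input, and groups are produced lazily rather than collected into a list first.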