This is a self-answered post. A common problem is to randomly generate dates between a given start and end date.
There are two cases to consider:
For example, given a start date of 2015-01-01 and an end date of 2018-01-01, how can I sample N random dates within this range using pandas?
We can speed up @akilat90's approach about twofold (in @coldspeed's benchmark) by using the fact that datetime64 is just a rebranded int64, hence we can view-cast:
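    import numpy as np
    import pandas as pd

    def pp(start, end, n):
        # start and end are pd.Timestamp objects; .value is nanoseconds since the epoch
        start_u = start.value // 10**9
        end_u = end.value // 10**9
        return pd.DatetimeIndex((10**9 * np.random.randint(start_u, end_u, n, dtype=np.int64)).view('M8[ns]'))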
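
For reproducible draws, the same view-cast idea can be sketched with a seeded NumPy `Generator`. This is only an illustration, not part of the benchmarked answer: the name `random_dates` and the `seed` parameter are hypothetical, and `rng.integers` is swapped in for `np.random.randint`. As above, it samples whole seconds in the half-open range from start (inclusive) to end (exclusive).

```python
import numpy as np
import pandas as pd

def random_dates(start, end, n, seed=None):
    # Hypothetical helper: seeded variant of the view-cast approach.
    rng = np.random.default_rng(seed)
    # Timestamp.value is nanoseconds since the epoch; reduce to whole seconds.
    start_s = pd.Timestamp(start).value // 10**9
    end_s = pd.Timestamp(end).value // 10**9
    # Draw integer seconds in [start_s, end_s).
    secs = rng.integers(start_s, end_s, size=n, dtype=np.int64)
    # Scale back to nanoseconds and reinterpret the int64 buffer as datetime64[ns].
    return pd.DatetimeIndex((secs * 10**9).view("M8[ns]"))

dates = random_dates("2015-01-01", "2018-01-01", 5, seed=0)
print(dates)
```

Passing the same seed returns the same dates, which can help when comparing timings or writing tests.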