NumPy arange with specific datetimes given by a month interval


I want to make an array of datetimes spaced by an interval of months. It is easy if I use days as the interval, like this:

xyz = np.arange(np.datetime64('2020-03-24'), 3)
xyz

OUTPUT

array(['2020-03-24', '2020-03-25', '2020-03-26'], dtype='datetime64[D]')

This only steps by days, giving 3 consecutive dates. How can I step by months instead? I tried this, and it raises an error:

np.arange(datetime('2020-03-28'), np.timedelta64(3,'M'))

I also tried this, but it gives the wrong result (the day of the month is reset to 01):

np.arange(np.datetime64("2020-03-24"), np.datetime64("2020-06-24"), 
          np.timedelta64(1, 'M'), 
          dtype='datetime64[M]').astype('datetime64[D]')

OUTPUT

array(['2020-03-01', '2020-04-01', '2020-05-01'], dtype='datetime64[D]')

Solution

  • Your arange without the dtype raises an error:

    In [91]: x = np.arange(np.datetime64("2020-03-24"), np.datetime64("2020-06-24"),  
        ...:           np.timedelta64(1, 'M'))                                                     
    ...
    TypeError: Cannot get a common metadata divisor for NumPy datetime metadata [M] and [D] because they have incompatible nonlinear base time units
    

    Stepping by one month is not the same as stepping by a fixed number of days, because months differ in length.

    With the dtype:

    In [85]: x = np.arange(np.datetime64("2020-03-24"), np.datetime64("2020-06-24"),  
        ...:           np.timedelta64(1, 'M'),  
        ...:           dtype='datetime64[M]')                                                      
    In [86]: x                                                                                     
    Out[86]: array(['2020-03', '2020-04', '2020-05'], dtype='datetime64[M]')
    

    The end points have been converted to months (with no day component).

    Note that the differences are the expected 1 month:

    In [87]: np.diff(x)                                                                            
    Out[87]: array([1, 1], dtype='timedelta64[M]')
    

    If I convert the dates to the D dtype, NumPy chooses the start of each month:

    In [89]: x.astype('datetime64[D]')                                                             
    Out[89]: array(['2020-03-01', '2020-04-01', '2020-05-01'], dtype='datetime64[D]')
    

    The time delta between consecutive dates is no longer uniform:

    In [90]: np.diff(x.astype('datetime64[D]'))                                                    
    Out[90]: array([31, 30], dtype='timedelta64[D]')
    

    ===

    Instead of astype, you could add the appropriate timedelta (here 3 days, which lands on the 4th of each month; adding 23 days would land on the 24th):

    In [96]: x + np.array(3, 'timedelta64[D]')                                                     
    Out[96]: array(['2020-03-04', '2020-04-04', '2020-05-04'], dtype='datetime64[D]')
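
Putting the pieces above together, here is a minimal sketch of a helper that steps by whole months while keeping the starting day of the month. The name `month_range` and its parameters are hypothetical (they are not part of NumPy); it only combines the month-unit `arange` and the day-offset addition shown above.

```python
import numpy as np

def month_range(start, stop, step_months=1):
    """Dates from start up to (not including) stop, spaced step_months apart.

    Hypothetical helper, not a NumPy function. Assumes the starting day
    exists in every month visited; days 29-31 spill over into the
    following month when a month is too short.
    """
    start_d = np.datetime64(start, 'D')
    stop_d = np.datetime64(stop, 'D')

    # Truncate the start to month resolution and remember the day offset,
    # e.g. 2020-03-24 -> month 2020-03 plus 23 days.
    start_m = start_d.astype('datetime64[M]')
    day_offset = start_d - start_m.astype('datetime64[D]')

    # Step in month units; go one month past stop so a date that falls
    # inside the stop month is not dropped too early.
    stop_m = stop_d.astype('datetime64[M]') + np.timedelta64(1, 'M')
    months = np.arange(start_m, stop_m, np.timedelta64(step_months, 'M'))

    # Adding a day-unit timedelta converts the result to datetime64[D].
    dates = months + day_offset
    return dates[dates < stop_d]

print(month_range('2020-03-24', '2020-06-24'))
print(month_range('2020-03-24', '2021-03-24', step_months=3))
```

For the dates in the question, the first call should give the 24th of March, April and May 2020. Because the offset is a fixed number of days, this only behaves as expected when the starting day is 28 or earlier.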