How can I make a NumPy arange of Python datetimes?


I have some input data whose timestamps are stored in the input file as hour offsets from the date and time given in the filename.

That is not very useful as it stands, so I need to convert them to Python datetime.datetime objects and put them in a NumPy array. I could write a for loop, but I'd like to do something like:

numpy.arange(datetime.datetime(2000, 1,1), datetime.datetime(2000, 1,2), datetime.timedelta(hours=1))

which raises a TypeError.

Can this be done? I'm stuck with Python 2.6 and NumPy 1.6.1.


Solution

  • See NumPy Datetimes and Timedeltas. Since NumPy 1.7, you can represent datetimes using the numpy.datetime64 type, which numpy.arange accepts as start and stop values, with a numpy.timedelta64 step:

    >>> import numpy as np
    >>> np.arange(np.datetime64("2000-01-01"), np.datetime64("2000-01-02"), np.timedelta64(1, "h"))
    array(['2000-01-01T00', '2000-01-01T01', '2000-01-01T02', '2000-01-01T03',
           '2000-01-01T04', '2000-01-01T05', '2000-01-01T06', '2000-01-01T07',
           '2000-01-01T08', '2000-01-01T09', '2000-01-01T10', '2000-01-01T11',
           '2000-01-01T12', '2000-01-01T13', '2000-01-01T14', '2000-01-01T15',
           '2000-01-01T16', '2000-01-01T17', '2000-01-01T18', '2000-01-01T19',
           '2000-01-01T20', '2000-01-01T21', '2000-01-01T22', '2000-01-01T23'],
          dtype='datetime64[h]')
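
The same datetime64 arithmetic also covers the case in the question, where the file holds hour offsets rather than an evenly spaced range. A minimal sketch, assuming NumPy 1.7 or later and whole-hour offsets; the names `base` and `hours` and the sample values are made up for illustration:

```python
import numpy as np

# Hypothetical inputs: the reference time taken from the filename,
# and the integer hour offsets read from the file.
base = np.datetime64("2000-01-01T00", "h")
hours = np.array([0, 1, 5, 23])

# Reinterpret the integers as hour-resolution timedeltas and add them
# to the base; the result is an array of dtype datetime64[h].
timestamps = base + hours.astype("timedelta64[h]")
print(timestamps)
```

Fractional offsets would first need converting to a finer integer unit (minutes or seconds, say), since timedelta64 values are stored as integers.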
    

    For NumPy 1.6, which has a much less useful datetime64 type, you can use a suitable list comprehension to build the datetimes (see also Creating a range of dates in Python):

    import datetime
    import numpy
    base = datetime.datetime(2000, 1, 1)
    arr = numpy.array([base + datetime.timedelta(hours=i) for i in xrange(24)])  # xrange is Python 2 only
    

    This produces

    array([2000-01-01 00:00:00, 2000-01-01 01:00:00, 2000-01-01 02:00:00,
       2000-01-01 03:00:00, 2000-01-01 04:00:00, 2000-01-01 05:00:00,
       2000-01-01 06:00:00, 2000-01-01 07:00:00, 2000-01-01 08:00:00,
       2000-01-01 09:00:00, 2000-01-01 10:00:00, 2000-01-01 11:00:00,
       2000-01-01 12:00:00, 2000-01-01 13:00:00, 2000-01-01 14:00:00,
       2000-01-01 15:00:00, 2000-01-01 16:00:00, 2000-01-01 17:00:00,
       2000-01-01 18:00:00, 2000-01-01 19:00:00, 2000-01-01 20:00:00,
       2000-01-01 21:00:00, 2000-01-01 22:00:00, 2000-01-01 23:00:00], dtype=object)
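
The list comprehension works just as well when the offsets come from the input file instead of a fixed range. A minimal sketch, where `hours` stands in for the values read from the file and `base` for the time parsed from the filename (both hypothetical here); datetime.timedelta accepts fractional hours, so non-integer offsets are fine:

```python
import datetime
import numpy

# Hypothetical inputs: reference time from the filename and hour
# offsets from the file (floats are allowed).
base = datetime.datetime(2000, 1, 1)
hours = [0, 1.5, 5, 23]

# One datetime.datetime per offset, collected into an object array.
arr = numpy.array([base + datetime.timedelta(hours=h) for h in hours])
print(arr)
```

Because the array has dtype=object, each element is a plain datetime.datetime and arithmetic on it runs at Python speed rather than being vectorized.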