How to correctly generate a list of UTC timestamps, by hour, between two datetimes in Python?


I'm new to Python. After a couple of days of researching and trying things out, I've landed on a decent solution for creating a list of timestamps, one for each hour, between two dates.

Example:

from datetime import datetime, timedelta

timestamp_format = '%Y-%m-%dT%H:%M:%S%z'

earliest_ts_str = '2020-10-01T00:00:00Z'
earliest_ts_obj = datetime.strptime(earliest_ts_str, timestamp_format)

latest_ts_str = '2020-10-02T00:00:00Z'
latest_ts_obj = datetime.strptime(latest_ts_str, timestamp_format)

num_days = latest_ts_obj - earliest_ts_obj
num_hours = int(round(num_days.total_seconds() / 3600,0))

ts_raw = []
for ts in range(num_hours):
    ts_raw.append(latest_ts_obj - timedelta(hours = ts + 1))

dates_formatted = [d.strftime('%Y-%m-%dT%H:%M:%SZ') for d in ts_raw]

# Need timestamps in ascending order
dates_formatted.reverse()

dates_formatted

Which results in:

['2020-10-01T00:00:00Z',
 '2020-10-01T01:00:00Z',
 '2020-10-01T02:00:00Z',
 '2020-10-01T03:00:00Z',
 '2020-10-01T04:00:00Z',
 '2020-10-01T05:00:00Z',
 '2020-10-01T06:00:00Z',
 '2020-10-01T07:00:00Z',
 '2020-10-01T08:00:00Z',
 '2020-10-01T09:00:00Z',
 '2020-10-01T10:00:00Z',
 '2020-10-01T11:00:00Z',
 '2020-10-01T12:00:00Z',
 '2020-10-01T13:00:00Z',
 '2020-10-01T14:00:00Z',
 '2020-10-01T15:00:00Z',
 '2020-10-01T16:00:00Z',
 '2020-10-01T17:00:00Z',
 '2020-10-01T18:00:00Z',
 '2020-10-01T19:00:00Z',
 '2020-10-01T20:00:00Z',
 '2020-10-01T21:00:00Z',
 '2020-10-01T22:00:00Z',
 '2020-10-01T23:00:00Z']

Problem:

  • If I change earliest_ts_str to include minutes, say earliest_ts_str = '2020-10-01T19:45:00Z', the timestamps in the resulting list don't carry the 45-minute offset; they still fall on the hour.

Results:

['2020-10-01T20:00:00Z',
 '2020-10-01T21:00:00Z',
 '2020-10-01T22:00:00Z',
 '2020-10-01T23:00:00Z']

I need it to be:

['2020-10-01T20:45:00Z',
 '2020-10-01T21:45:00Z',
 '2020-10-01T22:45:00Z',
 '2020-10-01T23:45:00Z']

Feels like the problem is in the num_days and num_hours calculation, but I can't see how to fix it.

Ideas?


Solution

  • from datetime import datetime, timedelta
    
    # %z parses the trailing 'Z' as UTC (Python 3.7+)
    timestamp_format = '%Y-%m-%dT%H:%M:%S%z'
    
    earliest_ts_str = '2020-10-01T00:00:00Z'
    ts_obj = datetime.strptime(earliest_ts_str, timestamp_format)
    
    latest_ts_str = '2020-10-02T00:00:00Z'
    latest_ts_obj = datetime.strptime(latest_ts_str, timestamp_format)
    
    # Step forward from the start one hour at a time; <= keeps the end timestamp
    ts_raw = []
    while ts_obj <= latest_ts_obj:
        ts_raw.append(ts_obj)
        ts_obj += timedelta(hours=1)
    
    dates_formatted = [d.strftime('%Y-%m-%dT%H:%M:%SZ') for d in ts_raw]
    print(dates_formatted)
    

    EDIT:

    Here is an example with Maya:

    import maya
    
    earliest_ts_str = '2020-10-01T00:00:00Z'
    latest_ts_str = '2020-10-02T00:00:00Z'
    start = maya.MayaDT.from_iso8601(earliest_ts_str)
    end = maya.MayaDT.from_iso8601(latest_ts_str)
    
    # the end of the range is excluded, so add 1 second to include it
    my_range = maya.intervals(start=start, end=end.add(seconds=1), interval=60*60)
    dates_formatted = [d.iso8601() for d in my_range]
    print(dates_formatted)
    

    Both produce this output:

    ['2020-10-01T00:00:00Z',
     '2020-10-01T01:00:00Z',
     ... some left out ...
     '2020-10-01T23:00:00Z',
     '2020-10-02T00:00:00Z']
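
On the 19:45 case from the question: the original loop counts backwards from latest_ts_obj, whose minutes are :00, so the :45 of earliest_ts_obj never reaches the result. Stepping forward from the start, as above, keeps the offset. Here is a minimal sketch of that idea wrapped in a generator. It assumes Python 3.7+ (where %z accepts a literal Z); hourly_range is a hypothetical helper name, not a standard library function, and this version excludes the end timestamp.

```python
from datetime import datetime, timedelta

TS_FORMAT = '%Y-%m-%dT%H:%M:%S%z'


def hourly_range(start, end, step=timedelta(hours=1)):
    """Yield datetimes from start (inclusive) up to end (exclusive)."""
    current = start
    while current < end:
        yield current
        current += step


# The start value from the question that carries a minute offset
start = datetime.strptime('2020-10-01T19:45:00Z', TS_FORMAT)
end = datetime.strptime('2020-10-02T00:00:00Z', TS_FORMAT)

dates_formatted = [d.strftime('%Y-%m-%dT%H:%M:%SZ')
                   for d in hourly_range(start, end)]
print(dates_formatted)
```

This yields 19:45 through 23:45. If the start itself should not appear (the question's desired list begins at 20:45), pass start + timedelta(hours=1) instead; to include the end timestamp, change < to <= in the loop.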