It is common for a GTFS time to exceed 23:59:59, because a timetable's service day can run past midnight. For example, the last time may be 25:20:00 (01:20:00 the next day), so converting these times to datetime raises an error whenever such a value is encountered.
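To make the failure concrete, here is a minimal sketch using only the standard library's `datetime.strptime`. The helper name `parses_as_clock_time` is made up for illustration; the point is just that an hour of 24 or more is rejected by a normal clock-time format.

```python
from datetime import datetime

def parses_as_clock_time(value):
    """Return True if value parses as an ordinary HH:MM:SS clock time."""
    try:
        datetime.strptime(value, '%H:%M:%S')
        return True
    except ValueError:
        # %H only accepts hours 00 to 23, so GTFS hours >= 24 end up here
        return False

print(parses_as_clock_time('23:59:59'))  # True
print(parses_as_clock_time('25:20:00'))  # False
```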
Is there a way to convert GTFS time values into standard datetime format without splitting out the hour, rebuilding a string in the correct format, and then converting that string to a datetime?
t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']
'0'+str(pd.to_numeric(t[0].split(':')[0])%24)+':'+':'.join(t[0].split(':')[1:])
For the above examples, I would expect to just see:
['00:22:00', '00:30:00', '01:40:00', '02:27:00']
I didn't find an easy way, so I just wrote a function to do it.
If anyone else wants the solution, here is mine:
from datetime import timedelta
import pandas as pd

def list_to_real_datetime(time_list, date_exists=False):
    '''
    Convert a list of GTFS times to real datetime list
    :param time_list: GTFS times
    :param date_exists: Flag indicating if the date exists in the list elements
                        (assumed here to look like 'YYYY-MM-DD HH:MM:SS')
    :return: An adjusted list of time to conform with real date times
    '''
    # new list of times to be returned
    new_time = []
    for time in time_list:
        # use a placeholder date when the elements carry only a time
        if date_exists:
            date, time = time.split(' ')
        else:
            date = '1970-01-01'
        plus_day = False
        hour = int(time[0:2])
        if hour >= 24:
            hour -= 24
            plus_day = True
        # reset the time to a real format
        time = '{:02d}'.format(hour) + time[2:]
        # Convert the time to a datetime (the format must include the time part)
        time = pd.to_datetime(date + ' ' + time, format='%Y-%m-%d %H:%M:%S')
        if plus_day:
            time = time + timedelta(days=1)
        new_time.append(time)
    return new_time
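
As an alternative sketch, the GTFS time can be treated as a duration since the start of the service day rather than as a clock time, which avoids rewriting the hour in the string. This version uses only the standard library. The function name `gtfs_time_to_datetime` and the `service_date` parameter are my own, and the 1970-01-01 default is just the same placeholder date used above.

```python
from datetime import datetime, timedelta

def gtfs_time_to_datetime(gtfs_time, service_date=datetime(1970, 1, 1)):
    """Add a GTFS 'HH:MM:SS' offset (hours may be >= 24) to a service date."""
    hours, minutes, seconds = (int(part) for part in gtfs_time.split(':'))
    # timedelta normalises hours >= 24 into whole days automatically
    return service_date + timedelta(hours=hours, minutes=minutes, seconds=seconds)

t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']
converted = [gtfs_time_to_datetime(x) for x in t]

# Clock times only, matching the output asked for in the question
clock_times = [d.strftime('%H:%M:%S') for d in converted]
print(clock_times)  # ['00:22:00', '00:30:00', '01:40:00', '02:27:00']
```

Because the hour is parsed by splitting on ':' instead of slicing two characters, this should also cope with single-digit hours such as '7:05:00'. If you are already working in pandas, `pd.to_timedelta` appears to accept strings like '25:40:00' directly, so adding its result to a date may be a shorter route; check that against your pandas version.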