python datetime gtfs

Python convert GTFS time to datetime


It is common for a GTFS time to exceed 23:59:59, because a service day's timetable can run past midnight. For example, the last time may be 25:20:00 (01:20:00 the next day), so converting these times to datetime raises an error whenever such a value is encountered.
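
To make the failure concrete, here is a minimal sketch using only the standard library. The helper name `parses_as_clock_time` is made up for this illustration; the point is that `%H` only accepts hours 00 to 23, so a GTFS value such as 25:20:00 is rejected with a `ValueError`.

```python
from datetime import datetime

def parses_as_clock_time(value):
    """Return True if value is a valid HH:MM:SS wall-clock time (hypothetical helper)."""
    try:
        # %H only accepts hours in the range 00-23
        datetime.strptime(value, '%H:%M:%S')
        return True
    except ValueError:
        return False

ok = parses_as_clock_time('23:59:59')     # ordinary time, parses fine
fails = parses_as_clock_time('25:20:00')  # GTFS time past midnight, rejected
```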

Is there a way to convert GTFS time values into standard datetime format without splitting out the hour, rebuilding the string in the correct format, and then converting that string to a datetime?

t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']
# current workaround: wrap the hour modulo 24, zero-pad it, and rebuild the string
'{:02d}'.format(int(t[0].split(':')[0]) % 24) + ':' + ':'.join(t[0].split(':')[1:])

For the above examples, I would expect to just see:

['00:22:00', '00:30:00', '01:40:00', '02:27:00']

Solution

  • I didn't find an easy way, so I just wrote a function to do it.

    If anyone else wants the solution, here is mine:

    from datetime import timedelta
    import pandas as pd
    
    def list_to_real_datetime(time_list, date_exists=False):
        '''
        Convert a list of GTFS times to real datetime list
    
        :param time_list: GTFS times
        :param date_exists: Flag indicating if the date exists in the list elements
        :return: An adjusted list of time to conform with real date times
        '''
    
        # new list of times to be returned
        new_time = []
    
        for time in time_list:
    
            plus_day = False
            hour = int(time[0:2])
    
            if hour >= 24:
                hour -= 24
                plus_day = True
    
            # reset the time to a real format
            time = '{:02d}'.format(hour)+time[2:]
    
            # Convert the time to a datetime
            if not date_exists:
                # the format must cover the time part too, otherwise parsing fails
                time = pd.to_datetime('1970-01-01 '+time, format='%Y-%m-%d %H:%M:%S')
    
            if plus_day:
                time = time + timedelta(days=1)
    
            new_time.append(time)
    
        return new_time
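
As a possible alternative to the function above, here is a minimal standard-library sketch that treats a GTFS time as an offset from the start of the service day instead of editing the string. The name `gtfs_time_to_datetime` and the `service_date` parameter are assumptions made for this illustration, and 1970-01-01 is only a placeholder date, as in the function above.

```python
from datetime import datetime, timedelta

def gtfs_time_to_datetime(gtfs_time, service_date=datetime(1970, 1, 1)):
    """Convert a GTFS 'H:MM:SS' or 'HH:MM:SS' string to a datetime.

    The time is added to service_date as a duration, so hours of 24 or
    more roll over into the following day(s) automatically.
    """
    hours, minutes, seconds = (int(part) for part in gtfs_time.split(':'))
    return service_date + timedelta(hours=hours, minutes=minutes, seconds=seconds)

t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']

# full datetimes, rolled over to the next day where needed
converted = [gtfs_time_to_datetime(x) for x in t]

# wall-clock strings only, as asked for in the question
wrapped = [d.strftime('%H:%M:%S') for d in converted]
```

Because the rollover is handled by `timedelta` arithmetic, no separate "plus one day" flag is needed, and single-digit hours such as `7:05:00` parse the same way. If the times live in a pandas column, `pd.to_timedelta` may offer a similar duration-based route, though that is not shown here.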