
How To Build Polyline For A Route with Multiple Shapes Per Trip in GTFS data


I am trying to parse GTFS data and build a polyline (an array of latitude and longitude pairs) for a single route. But in my sample GTFS data, the trips of a single route reference multiple shape IDs. Here is an excerpt from the GTFS data:

routes.txt


    route_id,agency_id,route_short_name,route_long_name,route_desc,route_type,route_url,route_color,route_text_color
    90,YRT,90,LESLIE,,3,,FDAE35,FFFFFF

trips.txt


    route_id,service_id,trip_id,trip_headsign,trip_short_name,direction_id,block_id,shape_id,wheelchair_accessible,bikes_allowed
    90,1,1286467,Richmond Green Secondary School - NB,,0,131905,59628,1,1
    90,1,1286468,Richmond Green Secondary School - NB,,0,131907,59628,1,1
    90,1,1286380,Richmond Green Secondary School - NB,,0,131906,59629,1,1
    90,1,1286469,Richmond Green Secondary School - NB,,0,131908,59628,1,1
    90,1,1286381,Richmond Green Secondary School - NB,,0,131904,59629,1,1
    90,1,1286382,Richmond Green Secondary School - NB,,0,131905,59629,1,1
    ...
    90,1,1286399,Richmond Green Secondary School - NB,,0,131960,59629,1,1
    90,1,1286400,Richmond Green Secondary School - NB,,0,131961,59629,1,1
    90,1,1286470,Richmond Green Secondary School - NB,,0,131921,59630,1,1
    90,1,1286471,Richmond Green Secondary School - NB,,0,131922,59630,1,1
    90,1,1286401,Richmond Green Secondary School - NB,,0,131962,59629,1,1
    90,1,1286402,Richmond Green Secondary School - NB,,0,131960,59629,1,2

shapes.txt

    

    shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence,shape_dist_traveled
    59628,43.902752,-79.398992,72,7.2214
    59628,43.902585,-79.399005,73,7.2405
    59629,43.775996,-79.346326,1,0.0000
    59629,43.775987,-79.346238,2,0.0071
    ...
    59629,43.902752,-79.398992,317,15.7832
    59629,43.902585,-79.399005,318,15.8022
    59630,43.811197,-79.360774,1,0.0000
    59630,43.812373,-79.361259,2,0.1364

I was expecting one shape per trip, or at least for the shapes to appear in sequential order. But this trip data is throwing me off:


    route_id,service_id,trip_id,trip_headsign,trip_short_name,direction_id,block_id,shape_id,wheelchair_accessible,bikes_allowed
    90,1,1286400,Richmond Green Secondary School - NB,,0,131961,59629,1,1
    90,1,1286470,Richmond Green Secondary School - NB,,0,131921,59630,1,1
    90,1,1286471,Richmond Green Secondary School - NB,,0,131922,59630,1,1
    90,1,1286401,Richmond Green Secondary School - NB,,0,131962,59629,1,1

Notice that shape #59630 appears after shape #59629, but then #59629 appears again. How can I make sense of this? Is it a data issue?


Solution

  • Shapes are not associated with routes; they are associated only with individual trips. It is quite common for a single route to encompass two or more shapes.

    In fact, since shapes explicitly encode a direction of motion, there will always be at least 2 shapes for routes that are split into "there-and-back" trip pairs (which is the most common approach for simple bus routes in practice). More complex possibilities include routes with multiple branches, or routes with some short-turning trips.

    Furthermore, there is no ordering implied by the shape IDs; i.e. there is no sense in which 59630 is "before" or "after" 59629. In principle, these are arbitrary strings.

    In short, the data you are working with looks fine; there is simply no unambiguous way to do what you want in the general case. However, depending on the particulars of your case, it may be possible to take a more manual approach and combine multiple shapes into a single coherent polyline.
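
As a rough illustration of the points above, here is a minimal Python sketch that counts how many trips of a route use each shape and builds one polyline per shape, ordered by shape_pt_sequence rather than by file order. The function names `shape_usage_for_route` and `polylines_by_shape` are hypothetical, the inline rows are an abbreviated copy of the excerpt in the question (with two shape rows deliberately swapped), and treating the most-used shape as "representative" is only one possible heuristic, not something GTFS defines.

```python
import csv
import io
from collections import Counter, defaultdict

# Abbreviated rows from the question; a real feed would be read from files.
TRIPS_TXT = """route_id,service_id,trip_id,trip_headsign,trip_short_name,direction_id,block_id,shape_id,wheelchair_accessible,bikes_allowed
90,1,1286467,Richmond Green Secondary School - NB,,0,131905,59628,1,1
90,1,1286380,Richmond Green Secondary School - NB,,0,131906,59629,1,1
90,1,1286381,Richmond Green Secondary School - NB,,0,131904,59629,1,1
90,1,1286470,Richmond Green Secondary School - NB,,0,131921,59630,1,1
"""

# The two 59629 rows are swapped on purpose to show that file order is not relied on.
SHAPES_TXT = """shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence,shape_dist_traveled
59629,43.775987,-79.346238,2,0.0071
59629,43.775996,-79.346326,1,0.0000
59630,43.811197,-79.360774,1,0.0000
59630,43.812373,-79.361259,2,0.1364
"""


def shape_usage_for_route(trips_file, route_id):
    """Count how many trips of the given route reference each shape_id."""
    usage = Counter()
    for row in csv.DictReader(trips_file):
        # shape_id is optional in trips.txt, so skip trips without one.
        if row["route_id"] == route_id and row["shape_id"]:
            usage[row["shape_id"]] += 1
    return usage


def polylines_by_shape(shapes_file, shape_ids):
    """Return {shape_id: [(lat, lon), ...]} ordered by shape_pt_sequence."""
    points = defaultdict(list)
    for row in csv.DictReader(shapes_file):
        if row["shape_id"] in shape_ids:
            points[row["shape_id"]].append((
                int(row["shape_pt_sequence"]),
                float(row["shape_pt_lat"]),
                float(row["shape_pt_lon"]),
            ))
    # Sort numerically on the sequence value, then drop it from the result.
    return {
        shape_id: [(lat, lon) for _, lat, lon in sorted(pts)]
        for shape_id, pts in points.items()
    }


usage = shape_usage_for_route(io.StringIO(TRIPS_TXT), "90")
polylines = polylines_by_shape(io.StringIO(SHAPES_TXT), set(usage))

# Heuristic (an assumption, not part of GTFS): the shape used by the most trips.
most_used_shape = usage.most_common(1)[0][0]
```

Drawing every polyline in `polylines` shows all variants of the route. If a single line is needed, `polylines[most_used_shape]` is one candidate, though it may leave out branches or short-turn segments that only the other shapes cover.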