I have a table, stop_times.txt, in GTFS format that looks something like this:
+------------------+---------------+
| trip_id          | stop_sequence |
+------------------+---------------+
| 4503599630773892 | 0             |
| 4503599630773892 | 1             |
| ...              | ...           |
| 4503599630773892 | 27            |
| 4503599630810392 | 0             |
| 4503599630810392 | 1             |
| ...              | ...           |
| 4503599630810392 | 17            |
| 4503599631507892 | 0             |
| 4503599631507892 | 1             |
| ...              | ...           |
| 4503599631507892 | 29            |
| ...              | ...           |
+------------------+---------------+
My expected result is:
+------------------+------------+-----------+
| trip_id          | first_stop | last_stop |
+------------------+------------+-----------+
| 4503599630773892 | 0          | 27        |
| 4503599630810392 | 0          | 17        |
| 4503599631507892 | 0          | 29        |
| ...              | ...        | ...       |
+------------------+------------+-----------+
PS: The title might not be precise. Please refine it.
One further question: how can I add the stop_name that corresponds to each stop_sequence to this table?
The code below is incorrect: the stop names for first_stop and last_stop should differ, because they correspond to different stop_id values, but here both columns show the same name:
(SELECT routes.route_short_name, MIN(stop_times.stop_sequence) AS first_stop, stops.stop_name, MAX(stop_times.stop_sequence) AS last_stop, stops.stop_name
FROM stop_times
JOIN stops ON stops.stop_id=stop_times.stop_id
JOIN trips ON stop_times.trip_id=trips.trip_id
JOIN routes ON routes.route_id=trips.route_id
GROUP BY stop_times.trip_id);
EDIT: I got it working after several hours of work. Here is the key part of the query:
SELECT T1.trip_id, T1.stop_sequence, T1.stop_id, T2.stop_sequence, T2.stop_id
FROM
    -- create a new table T1: trip_id, stop_sequence=0, stop_id (first stop)
    (SELECT st_first1.trip_id, st_first1.stop_sequence, st_first1.stop_id
     FROM stop_times st_first1
     INNER JOIN
         -- filter out the first stop: trip_id, stop_sequence=0
         (SELECT stop_times.trip_id, MIN(CAST(stop_times.stop_sequence AS UNSIGNED)) AS first_stop
          FROM stop_times
          GROUP BY stop_times.trip_id
         ) st_first2
     ON st_first1.trip_id=st_first2.trip_id AND st_first1.stop_sequence=st_first2.first_stop
    ) T1
LEFT JOIN -- combine T1 and T2
    -- create a new table T2: trip_id, stop_sequence=MAX, stop_id (last stop)
    (SELECT st_last1.trip_id, st_last1.stop_sequence, st_last1.stop_id
     FROM stop_times st_last1
     INNER JOIN
         -- filter out the last stop: trip_id, stop_sequence=MAX
         (SELECT stop_times.trip_id, MAX(CAST(stop_times.stop_sequence AS UNSIGNED)) AS last_stop
          FROM stop_times
          GROUP BY stop_times.trip_id
         ) st_last2
     ON st_last1.trip_id=st_last2.trip_id AND st_last1.stop_sequence=st_last2.last_stop
    ) T2
ON T1.trip_id=T2.trip_id
You can GROUP BY the trip_id and then take the MIN and MAX stop_sequence values to obtain the first and last stops, respectively.
SELECT DISTINCT st.trip_id, s.stop_name, t.first_stop, t.last_stop
FROM stop_times st INNER JOIN stops s
ON st.stop_id = s.stop_id
RIGHT JOIN
(
SELECT trip_id, MIN(stop_sequence) AS first_stop, MAX(stop_sequence) AS last_stop
FROM stop_times
GROUP BY trip_id
) t
ON t.trip_id = st.trip_id
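Note that the query above returns one row per distinct stop name on each trip. If the goal is one row per trip with the names of the first and last stops side by side, one possible approach is to join the aggregated result back to stop_times and stops twice, once for each end of the trip. The following is only a minimal sketch of that idea: it runs against an in-memory SQLite database through Python's sqlite3 module so it can be tried without a MySQL server, and the trip ids, stop ids, and stop names in it are made up for illustration. The aliases st_first, st_last, s_first, and s_last are likewise just names chosen here.

```python
import sqlite3

# Toy data: hypothetical ids and names, not taken from a real GTFS feed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stops (stop_id TEXT PRIMARY KEY, stop_name TEXT);
CREATE TABLE stop_times (trip_id TEXT, stop_sequence INTEGER, stop_id TEXT);
INSERT INTO stops VALUES ('A', 'Alpha St'), ('B', 'Beta Ave'), ('C', 'Gamma Rd');
INSERT INTO stop_times VALUES
  ('t1', 0, 'A'), ('t1', 1, 'B'), ('t1', 2, 'C'),
  ('t2', 0, 'C'), ('t2', 1, 'B');
""")

QUERY = """
SELECT t.trip_id,
       t.first_stop, s_first.stop_name AS first_stop_name,
       t.last_stop,  s_last.stop_name  AS last_stop_name
FROM (
    -- one row per trip with its smallest and largest stop_sequence
    SELECT trip_id,
           MIN(stop_sequence) AS first_stop,
           MAX(stop_sequence) AS last_stop
    FROM stop_times
    GROUP BY trip_id
) t
-- look up the stop_times row (and its stop) for the first stop
JOIN stop_times st_first
  ON st_first.trip_id = t.trip_id AND st_first.stop_sequence = t.first_stop
JOIN stops s_first ON s_first.stop_id = st_first.stop_id
-- look up the stop_times row (and its stop) for the last stop
JOIN stop_times st_last
  ON st_last.trip_id = t.trip_id AND st_last.stop_sequence = t.last_stop
JOIN stops s_last ON s_last.stop_id = st_last.stop_id
ORDER BY t.trip_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The SQL itself uses only standard joins and aggregates, so it should carry over to MySQL largely unchanged. If stop_sequence is stored as text, as the CAST in the question's EDIT suggests, the MIN and MAX calls would need the same CAST(... AS UNSIGNED) so that the values compare numerically rather than as strings.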