Search code examples
mysqlscheduletransport

MySQL to select a transport schedule


I'm developing a MySQL database to handle transport schedules. Each train line or bus journey has a unique identifier and the journey from start to finish might not call at all potential stopping off points.

I have three tables: schedule, station and journey. The latter two only contain an auto-increment id and the varchar identifier at present. The schedule table contains an auto-increment id, s_id and j_id (referring to the station and journey id respectively), the stop_order and the arrival/departure times.

Running this SQL:

SELECT schedules.id, journeys.journey, stations.station, schedules.s_order, 
schedules.a_time, schedules.d_time

FROM `schedules` 
JOIN `journeys` ON schedules.j_id = journeys.j_id
JOIN `stations` ON schedules.s_id = stations.s_id

returns this result:

+----+---------+---------+---------+----------+----------+
| id | journey | station | s_order | a_time   | d_time   |
+----+---------+---------+---------+----------+----------+
|  1 | J1      | STA     |       1 | NULL     | 07:00:00 |
|  2 | J1      | STB     |       2 | 07:09:00 | 07:10:00 |
|  3 | J1      | STC     |       3 | 07:29:00 | 07:30:00 |
|  4 | J1      | STD     |       4 | 07:44:00 | 07:45:00 |
|  5 | J1      | STE     |       5 | 07:59:00 | 08:00:00 |
|  6 | J1      | STF     |       6 | 08:15:00 | NULL     |
|  7 | J2      | STA     |       1 | NULL     | 07:30:00 |
|  8 | J2      | STC     |       2 | 07:59:00 | 08:00:00 |
|  9 | J2      | STF     |       3 | 08:45:00 | NULL     |
| 10 | J3      | STA     |       1 | NULL     | 08:00:00 |
| 11 | J3      | STB     |       2 | 08:09:00 | 08:10:00 |
| 12 | J3      | STD     |       3 | 08:44:00 | 08:45:00 |
| 13 | J3      | STE     |       4 | 09:00:00 | NULL     |
| 14 | J4      | STA     |       1 | NULL     | 08:30:00 |
| 15 | J4      | STD     |       2 | 09:14:00 | 09:15:00 |
| 16 | J4      | STF     |       3 | 09:45:00 | NULL     |
| 17 | J5      | STF     |       1 | NULL     | 07:10:00 |
| 18 | J5      | STE     |       2 | 07:24:00 | 07:25:00 |
| 19 | J5      | STD     |       3 | 07:39:00 | 07:40:00 |
| 20 | J5      | STC     |       4 | 07:54:00 | 07:55:00 |
| 21 | J5      | STB     |       5 | 08:14:00 | 08:15:00 |
| 22 | J5      | STA     |       6 | 08:25:00 | NULL     |
| 23 | J6      | STF     |       1 | NULL     | 07:30:00 |
| 24 | J6      | STC     |       2 | 08:14:00 | 08:15:00 |
| 25 | J6      | STA     |       3 | 08:45:00 | NULL     |
| 26 | J7      | STE     |       1 | NULL     | 08:00:00 |
| 27 | J7      | STD     |       2 | 08:14:00 | 08:15:00 |
| 28 | J7      | STB     |       3 | 08:49:00 | 08:50:00 |
| 29 | J7      | STA     |       4 | 09:00:00 | NULL     |
| 30 | J8      | STF     |       1 | NULL     | 08:20:00 |
| 31 | J8      | STD     |       2 | 08:49:00 | 08:50:00 |
| 32 | J8      | STA     |       3 | 09:35:00 | NULL     |
+----+---------+---------+---------+----------+----------+

There are four journeys (J1 to J8) in each direction.

I need to be able to answer questions like:

Which journeys go from STB to STD (answer J1 and J3) but exclude journeys that go in the opposite direction (J5 and J7)?

Which journeys stop at STC (answer J1, J2, J5 and J6)?

I've tried subqueries, a self join, creating a temporary table then querying that but I'm hitting my head against a brick wall. I don't even know if the design of my tables is optimal.

Finally, here's the SQL to create and populate the tables:

CREATE TABLE `journeys` (
  `id` int(11) NOT NULL,
  `j_id` int(11) NOT NULL,
  `journey` varchar(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `journeys` (`id`, `j_id`, `journey`) VALUES
(1, 1, 'J1'),
(2, 2, 'J2'),
(3, 3, 'J3'),
(4, 4, 'J4'),
(5, 5, 'J5'),
(6, 6, 'J6'),
(7, 7, 'J7'),
(8, 8, 'J8');

.

CREATE TABLE `stations` (
  `id` int(11) NOT NULL,
  `s_id` int(11) NOT NULL,
  `station` varchar(3) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `stations` (`id`, `s_id`, `station`) VALUES
(1, 1, 'STA'),
(2, 2, 'STB'),
(3, 3, 'STC'),
(4, 4, 'STD'),
(5, 5, 'STE'),
(6, 6, 'STF');

.

CREATE TABLE `schedules` (
  `id` int(11) NOT NULL,
  `j_id` int(11) NOT NULL,
  `s_id` int(11) NOT NULL,
  `s_order` int(11) NOT NULL,
  `a_time` time DEFAULT NULL,
  `d_time` time DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `schedules` (`id`, `j_id`, `s_id`, `s_order`, `a_time`, `d_time`) 
VALUES
(1, 1, 1, 1, NULL, '07:00:00'),
(2, 1, 2, 2, '07:09:00', '07:10:00'),
(3, 1, 3, 3, '07:29:00', '07:30:00'),
(4, 1, 4, 4, '07:44:00', '07:45:00'),
(5, 1, 5, 5, '07:59:00', '08:00:00'),
(6, 1, 6, 6, '08:15:00', NULL),
(7, 2, 1, 1, NULL, '07:30:00'),
(8, 2, 3, 2, '07:59:00', '08:00:00'),
(9, 2, 6, 3, '08:45:00', NULL),
(10, 3, 1, 1, NULL, '08:00:00'),
(11, 3, 2, 2, '08:09:00', '08:10:00'),
(12, 3, 4, 3, '08:44:00', '08:45:00'),
(13, 3, 5, 4, '09:00:00', NULL),
(14, 4, 1, 1, NULL, '08:30:00'),
(15, 4, 4, 2, '09:14:00', '09:15:00'),
(16, 4, 6, 3, '09:45:00', NULL),
(17, 5, 6, 1, NULL, '07:10:00'),
(18, 5, 5, 2, '07:24:00', '07:25:00'),
(19, 5, 4, 3, '07:39:00', '07:40:00'),
(20, 5, 3, 4, '07:54:00', '07:55:00'),
(21, 5, 2, 5, '08:14:00', '08:15:00'),
(22, 5, 1, 6, '08:25:00', NULL),
(23, 6, 6, 1, NULL, '07:30:00'),
(24, 6, 3, 2, '08:14:00', '08:15:00'),
(25, 6, 1, 3, '08:45:00', NULL),
(26, 7, 5, 1, NULL, '08:00:00'),
(27, 7, 4, 2, '08:14:00', '08:15:00'),
(28, 7, 2, 3, '08:49:00', '08:50:00'),
(29, 7, 1, 4, '09:00:00', NULL),
(30, 8, 6, 1, NULL, '08:20:00'),
(31, 8, 4, 2, '08:49:00', '08:50:00'),
(32, 8, 1, 3, '09:35:00', NULL);

I hope I've given enough information to allow someone to be able to help.

Thanks for your time and patience.

*****UPDATE*****

OK, so I've made some progress. Run the following SQL in separate segments (it errors if I run it all together):

DROP TEMPORARY TABLE IF EXISTS tableSTART;

.

DROP TEMPORARY TABLE IF EXISTS tableEND;

.

CREATE TEMPORARY TABLE IF NOT EXISTS tableSTART AS
(
SELECT schedules.id, journeys.journey, stations.station, schedules.s_order, 
schedules.a_time, schedules.d_time
FROM `schedules` 
    JOIN `journeys` ON schedules.j_id = journeys.j_id
    JOIN `stations` ON schedules.s_id = stations.s_id
WHERE stations.station = "STB"
);

.

CREATE TEMPORARY TABLE IF NOT EXISTS tableEND AS
(
SELECT schedules.id, journeys.journey, stations.station, schedules.s_order, 
schedules.a_time, schedules.d_time
FROM `schedules` 
    JOIN `journeys` ON schedules.j_id = journeys.j_id
    JOIN `stations` ON schedules.s_id = stations.s_id
WHERE stations.station = "STE"
);

.

select ts.journey, ts.station startStn, ts.s_order startOrder, te.journey, 
te.station endStn, te.s_order endOrder
from tableSTART ts
cross join tableEND te

This shows the journeys, start (STB) and end (STE) stations along with the station order of the start and end stations. How do I filter this to leave only those journeys where the start order is less than the end order (i.e travelling from STB to STE and excluding journeys from STE to STB?

Thanks in advance.


Solution

  • Which journeys go from STB to STD?

    We want journeys that:

    1. have some scheduled stop at STB
    2. have some (other) stop scheduled at STD
    3. stop at STB before STD
    select j.journey
      from journeys j
           -- have some scheduled stop at STB
           join schedules sc1 on sc1.j_id = j.id
           join stations  st1 on st1.id = sc1.s_id
                             and st1.station = 'STB'
           -- have some (other) stop scheduled at STD
           join schedules sc2 on sc2.j_id = j.id
           join stations  st2 on st2.id = sc2.s_id
                             and st2.station = 'STD'
     -- stop at STB before STD
     where sc1.s_order < sc2.s_order;

    Note you may also write:

    select j.journey
      from journeys j
           join schedules sc1 on sc1.j_id = j.id
           join stations st1  on st1.id = sc1.s_id
           join schedules sc2 on sc2.j_id = j.id
           join stations st2  on st2.id = sc2.s_id
     where st1.station = 'STB'
       and st2.station = 'STD'
       and sc1.s_order < sc2.s_order;
    

    (putting conditions in the on of inner joins is the same putting them in where)

    Which journeys stop at STC?

    (probably trivial, after the above)

    select j.journey
      from journeys j
           join schedules sc on sc.j_id = j.id
           join stations  st on st.id = sc.s_id
     where st.station = 'STC';
    

    Note that this is exactly your query with less columns and a where :)